SERVING THE QUANTITATIVE FINANCE COMMUNITY

Polter
Posts: 2526
Joined: April 29th, 2008, 4:55 pm

### Choice of matrix library and interface

QuoteOriginally posted by: outrunQuoteOriginally posted by: CuchulainnQuoteOriginally posted by: outrunThat's thr exact point, if they use the top line (map, I suspect they do: they call their default "map_std") instead of the bottom line (unordered map) then that is 100 times slower! My hunch is that sparse should only be 5x slower than dense with hash-keys, mauve even on par due to the much higher cache hit on the smaller memory footprint.100 times slower more or less is what I see as well.Nice! Too bad I have a deadline tomorrow that I need to work on all night . Would be nice to see if changing map to unordered_map solves that!Um... if you happen to be testing on MSVC, I wouldn't expect any improvement right now.Take a look here: http://www.codeproject.com/KB/cross-pla ... Net.aspxIn particular, the "C++ vs C#: Hash tables" and the "C# vs C++: Sorted maps" parts."This test pits two types of .NET Dictionary against the equivalent Microsoft C++ hash_maps (or unordered_map in VS2010). First I tried an <int, int> hashtable, then a <string, string> hashtable.""Why is the difference so large? Well, I suspect Microsoft's (Dinkumware) hash_map is simply terrible.""Strangely, however, the MS STL's map<K,V> is actually faster than its hashtable (hash_map or unordered_map), although not as fast as my hash_map. "That being said, I expect the performance to improve in futures versions of MSVC (and there shouldn't be problems with other compilers).At the same time, boost::unordered_map also already behaves better on MSVC (not mentioned in the benchmark above, to my recollection):http://groups.google.com/group/comp.lan ... a34d3c618a

Cuchulainn
Posts: 59944
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Choice of matrix library and interface

Well, I wrote a CGM using Boost mapped_matrix vs dense matrix. Maybe some calls are not optimal, but it is not clear where bottleneck is. The code is here If we can run it on non MSVC we can compare apples with apples.BTW what will we do with sparse matrices in this project?
Last edited by Cuchulainn on November 20th, 2011, 11:00 pm, edited 1 time in total.

Polter
Posts: 2526
Joined: April 29th, 2008, 4:55 pm

### Choice of matrix library and interface

Quotein the first link, VS2010 C++ has the same performance as C#, right? That shows that both might be optimized now.Negative, "With an occasional exception, C# wins by a landslide! One particular C++ scenario, x86 VC10 with SSE2, runs better than the others and when handling strings, outperforms C#. However, in most situations VC++ loses big time. The first test, for example, runs over 9 times faster in .NET4 x64 compared to C++ VC10 SSE2."He only got an improvement when he used his own implementation instead of the standard (std::unordered_map) one:"Anyway, I can say with certainty that it's not the C++ compiler's fault. I know this because I wrote my own version of hash_map for my company, and my implementation runs roughly ten times faster (with int keys). I knew I got better performance from my hashtable, but I didn't realize how dramatic the difference could be until I saw the numbers. Here are the results based on my implementation: [...]As you can see, C++ is much more competitive now. It wins some races, and loses others. My hash_map is heavily inspired by the design of .NET's Dictionary, and I am unable to guess why my version outperformed C# while doing int queries and removals, yet not inserts."QuoteHowever, in the google groups like shows the performance gain of unordered_map 0.6sec vs map 17sec.Yes -- which emphasizes that it wasn't the C++ compiler's fault per se, but rather the particular implementation (std::unordered_map). Consequently, when switching to better implementations, the problem is addressed: both the custom implementation ("my hash_map") and another reference implementation (boost::unordered_map) have no performance problems. However, that still means that std::unordered_map has performance problems, and they appear to be rather serious on as recent version as MSVC 2010."Again, certain "bad" parts of the Microsoft (Dinkumware) STL, like hash_map/unordered_map, are dramatically slower than the equivalent C# and should be avoided."
Last edited by Polter on November 20th, 2011, 11:00 pm, edited 1 time in total.

Cuchulainn
Posts: 59944
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Choice of matrix library and interface

OOPS sorry, wrong one it should be:

Polter
Posts: 2526
Joined: April 29th, 2008, 4:55 pm

### Choice of matrix library and interface

Cuch, thanks for the code. Could you highlight the parts of code you're interested in profiling first? Ideally: put the calls to boost::timer member-functions such as these: boost::timer Timer; // start timing int two = 1 + 1; // perform time consuming operation std::cout << Timer.elapsed() << " s\n"; // report elapsed time Timer.restart(); // restart timing int four = two + two; // perform another time consuming operation std::cout << Timer.elapsed() << " s\n"; // report elapsed timeIncidentally, on MSVC:QuoteSetting environment for using Microsoft Visual Studio 2010 x86 tools.TestConjugateGradient.cppTestConjugateGradient.cpp(160) : warning C4018: '<' : signed/unsigned mismatchTestConjugateGradient.cpp(165) : warning C4018: '<' : signed/unsigned mismatchTestConjugateGradient.cpp(196) : warning C4018: '<' : signed/unsigned mismatchTestConjugateGradient.cpp(201) : warning C4018: '<' : signed/unsigned mismatchTestConjugateGradient.cpp(89) : warning C4018: '<=' : signed/unsigned mismatch; TestConjugateGradient.cpp(171) : see reference to function template instantiation 'void ConjugateGradient<boost::numeric::ublas::matrix<T>>(const M &,const boost::numeric::ublas::vector<T> &,boost::numeric::ublas::vector<T> &) with [T=double,M=boost::numeric::ublas::matrix<double>]' being compiledTestConjugateGradient.cpp(86) : warning C4101: 'tmp' : unreferenced local variable You should be able to get rid of some of those mismatch warnings by just getting rid of "long" loop counter (just change it to std::size_t).
Last edited by Polter on November 20th, 2011, 11:00 pm, edited 1 time in total.

Cuchulainn
Posts: 59944
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Choice of matrix library and interface

QuoteCuch, thanks for the code. Could you highlight the parts of code you're interested in profiling first? Ideally: put the calls to boost::timer member-functions such as these:Hi Polter,CGM for mapped_matrix.It was an exercise in mapping algebra to C++ but is not clear to me what the Complexity is, e.g. is x += a faster than x = x + a etc. I have no idea.
Last edited by Cuchulainn on November 20th, 2011, 11:00 pm, edited 1 time in total.

Polter
Posts: 2526
Joined: April 29th, 2008, 4:55 pm

### Choice of matrix library and interface

Last edited by Polter on November 20th, 2011, 11:00 pm, edited 1 time in total.

Polter
Posts: 2526
Joined: April 29th, 2008, 4:55 pm

### Choice of matrix library and interface

Cuch -- do you think you could cut all the code that is not directly relevant for (nor a required dependency of) the task being profiled?Since you're the author it might be somewhat more optimal than if either me or outrun were to do that :-)Just a quick and rough estimate of the total execution time:// Now try it with sparse matricesu::matrix<double> mm(n,n);$time TestConjugateGradient_matrix.exereal 0m0.059suser 0m0.000ssys 0m0.015s// Now try it with sparse matricesu::mapped_matrix<double> mm(n,n);$ time TestConjugateGradient_mapped_matrix.exereal 0m0.075suser 0m0.000ssys 0m0.015sIs this roughly what you get?
Last edited by Polter on November 20th, 2011, 11:00 pm, edited 1 time in total.