August 7th, 2012, 1:21 am
Just in case anyone is wondering: as previously mentioned, in version 0.3 I generalized the Mersenne Twister code to allow for arbitrary linear generators modulo finite. A month or so ago, I noticed that with the new code MT19937 was about 25% slower than before (and thus 25% slower than Boost); the test program is qfcl/test/speed_test. While this probably won't make much of a difference when generating actual variates (as opposed to raw PRNG output), I was not happy with it. I was in fact expecting a slight speedup from the new design. I looked at the assembly output and could not see anything obvious (in fact, the new design produces much smaller assembly output).

Profiling, including measuring the latency of short code segments, is rather nontrivial (the profiler included with VC10 seems to be useless for this). Since this seems to be a skill that a library designer should have, I decided it was worth the effort to profile the PRNGs. I haven't drilled down quite far enough yet, but I have a few tools in qfcl/test: timer_latency, engine_speed and engine_profiler. We now also have a DescriptiveStatistics class for computing and displaying statistical properties of a given sequence of data (it was originally designed a long time ago, and is not quite how I would do things now) ... I couldn't find anything like this in Boost.

Here is example output from engine_profiler. More details are still needed...

engine_profiler -n -r 40