Serving the Quantitative Finance Community

 
User avatar
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

sitmo C++ parallel random engine

February 4th, 2014, 1:53 pm

QuoteOriginally posted by: outrunQuoteOriginally posted by: MiloRambaldiQuoteOriginally posted by: outrunYes, here is a very simple C++11 version that has 128 threads. 1. Each thread initializes a local rng engine with a unique seed (based on incrementing a global counter)2. then each thread does it main job: it accumulated the results of 1.000.000 draws from the engine3. then -as each thread finishes- it updates a global result aggregationStep 1 and 3 use a mutex to prevent lockingoutput:128 threads generated 128000000 random numbers, the sum of them is 274892645070220693 code:Very nice! But one minor point: I don't think this is a "legitimate" way to do parallel random number generation, unless your PRNG has some special property that makes it safe. I have read papers about simulations giving completely false results by parallelizing MT with consecutive seeds. (This has been discussed in this forum. Apparently it was some sort of famous "catastrophe".) That's indeed the reason I implemented this. Different seeds *guarantee* non overlapping sequences. Other engines don't have this guarantee but this one does! It's basically encrypting the concatenation of a seed + a counter with an invertible cryptographic function. If you change the seed or counter you'll get a different random output that won't correlate in any statistical way with the original random number.How do the different seeds *guarantee* non overlapping sequences? If one creates M different seeds and then draws N numbers starting from each seed, there's always some chance that one of those M*N draws will stumble into one of the other M-1 seeds so that the remainder of the draws on one thread will match the draws found at the beginning of another thread. The chance of an overlap, like the chance of two people in a room sharing a birthday, can be surprisingly high, especially for high thread counts (e.g. GPUs or HPC). The cardinality of possible matches would be O(M^2*N) which would be compared to the size of the total period of the RNG. In the example you gave of M=128, N=1000000, the chance of a overlap would be very high for any RNG with a period of 2^32 even though the example only makes less than 2^27 draws. For a very high period-length Mersenne Twister, the chance of a overlap is extremely low, but never zero.Or do you have a clever way to ensure that your choice of the M sequential seeds are each spaced MORE than N draws apart in the period space of the RNG?
 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ parallel random engine

February 15th, 2014, 11:57 am

Is there any new updates on the prng software? Cool (?) feature request would be to add these functions so that users don't have to do it themselves Thanks
Last edited by Cuchulainn on February 14th, 2014, 11:00 pm, edited 1 time in total.
 
User avatar
Polter
Posts: 1
Joined: April 29th, 2008, 4:55 pm

sitmo C++ parallel random engine

February 15th, 2014, 9:40 pm

BTW, there's an interesting post by Ulrich Drepper which is somewhat on topic /* ;-) */, http://www.akkadia.org/drepper/blog/html/randomcc.html
 
User avatar
Polter
Posts: 1
Joined: April 29th, 2008, 4:55 pm

sitmo C++ parallel random engine

February 15th, 2014, 9:45 pm

As for the interface discussion, these two proposals (from the 2014-01-pre-Issaquah mailing) raise some issues (with proposed solutions), might be worth a look:- A `sample` (random sampling) Proposalhttp://www.open-std.org/jtc1/sc22/wg21/docs/pa ... n3842.pdf- Random Number Generation is Not Simple!http://www.open-std.org/jtc1/sc22/wg21/ ... dfThoughts?
 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ parallel random engine

February 19th, 2014, 7:31 am

QuoteOriginally posted by: outrunNice ideas, I have no strong opinions about them..I once tried to add a multivariate normal distribution to boost math (the density pdf, cdf etc, *not* random sampling) and that was already too complicated: what to do with the cov, the x vector...? I couldn't solve the discussions at the time...A thing I would design different is the returning of a array container in the link in the first post (randomcc). There is a need for some pattern like than when generating random sample paths from an SDE. I would provide some "fill_sample_path(output_it, SDE, S0?, t0,dt? -or- t_iterator , ...)" function (this is just a sketch, I haven't thought this through except for the container access part) that gets passed an iterator of SDE::state elements that gets filled (random access for techniques like Brownian bridge, ...for forward Euler path generation a forward iterator would suffice) instead of returning a container.Thoughts?Write some working code up so that people can get an idea of how it works. Then they can give feedback.BTW what's against containers? In almost all cases developers use std::vector<double> anyways?
Last edited by Cuchulainn on February 18th, 2014, 11:00 pm, edited 1 time in total.
 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ parallel random engine

February 19th, 2014, 7:46 am

QuoteOriginally posted by: outrunI'd like to know this: why do you want NextDouble() instead of std::uniform_real_distributionProbably because you're in different language? .NET? That would lead to a generic .NET interface wrapper in top of C++11 engines?It does not have be called NextDouble()!! It's the signature of the function that is more important. And it is _not_ .NET-specific.The issue is prng must be extended by each client code in each language. No problem, just saying.
Last edited by Cuchulainn on February 18th, 2014, 11:00 pm, edited 1 time in total.
 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ parallel random engine

February 19th, 2014, 7:49 am

Instead of the iterator idea as 'driver' I find interfaces in .NET (as in the days of COM) leads to narrow interfaces and loose coupling.
Last edited by Cuchulainn on February 18th, 2014, 11:00 pm, edited 1 time in total.
 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ parallel random engine

February 19th, 2014, 8:21 am

QuoteOriginally posted by: outrunQuoteOriginally posted by: CuchulainnQuoteOriginally posted by: outrunNice ideas, I have no strong opinions about them..I once tried to add a multivariate normal distribution to boost math (the density pdf, cdf etc, *not* random sampling) and that was already too complicated: what to do with the cov, the x vector...? I couldn't solve the discussions at the time...A thing I would design different is the returning of a array container in the link in the first post (randomcc). There is a need for some pattern like than when generating random sample paths from an SDE. I would provide some "fill_sample_path(output_it, SDE, S0?, t0,dt? -or- t_iterator , ...)" function (this is just a sketch, I haven't thought this through except for the container access part) that gets passed an iterator of SDE::state elements that gets filled (random access for techniques like Brownian bridge, ...for forward Euler path generation a forward iterator would suffice) instead of returning a container.Thoughts?Write some working code up so that people can get an idea of how it works. Then they can give feedback.BTW what's against containers? In almost all cases developers use std::vector<double> anyways?A container is well defines and everybody knows them, just google the definition. It's useless to force people to use vector<double>, what the benefit of that limitation? Do you think writing a template and using design concepts is too difficult? For SDE one could use a vector of states to store a path. The state for stoch vol models could eg be 2 doubles, my hunch is that the SDE type should provide its "state type".As I mentioned, provide a simple C++ code case to test your hypothesis. And let developers give their opinions!And many developers are not all that comfortable with the STL concepts. And at the end of the day most developers use std::vector<double> anyways.However, I do see the point of using generic concepts, but it you know it's a helluva lot of work. An engineering compromise is usually what happens with developers.
Last edited by Cuchulainn on February 18th, 2014, 11:00 pm, edited 1 time in total.