
 
MiloRambaldi
Posts: 1
Joined: July 26th, 2010, 12:27 am

sitmo C++ normal distribution

April 28th, 2014, 5:30 am

Quote (originally posted by outrun):
"MT is not a *very* good random engine (but better than most), e.g. the engine I've posted has much better random quality, the same speed, and O(1) discard. I think I can manage to get it accepted into boost, which would make it easier to use. In particular MT fails these tests: [...]"

Yes, I agree the new generators would be an excellent addition to boost::random. They have clear advantages.

However, I am very skeptical of your claim that the new engine produces better-quality pseudo-random numbers. All linear generators over a finite field (such as MT) fail the linear complexity test. They are not cryptographically secure (I don't know whether your new generators are?). But this is generally only a requirement for cryptographic applications (see e.g. Section 5.1 of "TestU01: A C Library for Empirical Testing of Random Number Generators").
Last edited by MiloRambaldi on April 27th, 2014, 10:00 pm, edited 1 time in total.
 
Cuchulainn
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ normal distribution

October 19th, 2014, 2:25 pm

Quote (originally posted by MiloRambaldi):
"I was looking at QuantLib's implementation and it is closely tied to the MT19937 (Mersenne Twister) generator. [...] It only uses 24 bits of the MT output to avoid some correlation issue. Do you know if these issues have been resolved in boost? I cannot find the post you once made here about ziggurat quality issues."

Quote (originally posted by outrun):
"That's a design flaw: they should have used orthogonal concepts, reused standard concepts, and abstracted away specifics as much as possible. Now they can only eat their own spaghetti food. It's an old project, and it's always easy to judge in retrospect, but e.g. Effective C++ by Scott Meyers has been around for ages. [...] The issue seems to be with MT, not with ziggurat. MT is a standardised algorithm (with fixed constants) which should be *identical* across implementations, so boost can't 'fix' MT, nor ziggurat; the combination is apparently invalid? It would be interesting if you have some more info about that issue."

Quote (originally posted by MiloRambaldi):
"To be fair, QuantLib's polar Box-Muller appears to be well-designed, with appropriate orthogonality and abstraction. Perhaps there is some reason that their Ziggurat hard-codes MT. I plan to look into this further very soon."

Several possible reasons for consigning BM to history:
1. It is slow: it uses log, sqrt, sin and cos.
2. Peter Jaeckel in his book discusses nasty side effects with this method; it goes haywire with some pseudo-random generators.
3. It is claimed that BM is much slower than Ziggurat, especially for large numbers of random numbers.
Are these good reasons for not using BM? (The classic transform is sketched below.)

Quote (originally posted by outrun):
"That's a design flaw: they should have used orthogonal concepts, reused standard concepts, and abstracted away specifics as much as possible."

In fairness, the QL code is shipping now.
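For concreteness, a minimal sketch of the classic (trigonometric) Box-Muller transform, written against plain C++11 <random>. It is purely illustrative (not QuantLib's or sitmo's code) and shows exactly the log/sqrt/sin/cos cost referred to in point 1:

Code:
#include <cmath>
#include <limits>
#include <random>
#include <utility>

// Classic Box-Muller: two uniforms in (0,1) -> two independent N(0,1) variates.
// The expensive calls are log, sqrt, sin and cos.
template <class Engine>
std::pair<double, double> box_muller(Engine& eng)
{
    std::uniform_real_distribution<double> u(std::numeric_limits<double>::min(), 1.0); // avoid log(0)
    const double u1 = u(eng);
    const double u2 = u(eng);
    const double two_pi = 6.283185307179586;
    const double r      = std::sqrt(-2.0 * std::log(u1));
    const double theta  = two_pi * u2;
    return { r * std::cos(theta), r * std::sin(theta) };
}

// usage: std::mt19937 eng(42); std::pair<double, double> z = box_muller(eng);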
Last edited by Cuchulainn on October 18th, 2014, 10:00 pm, edited 1 time in total.
 
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

sitmo C++ normal distribution

October 19th, 2014, 2:46 pm

Quote (originally posted by Cuchulainn):
"Several possible reasons for consigning BM to history: 1. it is slow: it uses log, sqrt, sin and cos; 2. Peter Jaeckel in his book discusses nasty side effects with this method; 3. it is claimed that BM is much slower than Ziggurat, especially for large numbers of random numbers. Are these good reasons for not using BM? [...] In fairness, the QL code is shipping now."

Maybe not. Ziggurat has stochastic latency and consumes a stochastic number of RNs, which may be unacceptable in some cases. The claims about relative speed make assumptions that are not true in practice. For example, BM on a GPU is 20X faster than Ziggurat on a CPU. I also seem to recall some differences between BM and Z in terms of how well they fill the space of floating point numbers, but don't remember which one had which flaws.

There's no one perfect RNG, and anyone doing important work using them must be aware of the flaws of each and every method.
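To make "consumes a stochastic number of RNs" concrete, here is a small sketch (plain C++11, purely illustrative). It wraps an engine so that it counts how many raw draws the distribution shaper pulls per normal variate; std::normal_distribution's shaping method is implementation-defined, but for any rejection-based shaper (polar BM, Ziggurat) the per-variate count is not fixed:

Code:
#include <cstdint>
#include <iostream>
#include <random>

// Engine wrapper that counts raw draws; it satisfies the UniformRandomBitGenerator requirements.
struct counting_mt19937
{
    using result_type = std::mt19937::result_type;
    static constexpr result_type min() { return std::mt19937::min(); }
    static constexpr result_type max() { return std::mt19937::max(); }

    std::mt19937 eng{42};
    std::uint64_t calls = 0;

    result_type operator()() { ++calls; return eng(); }
};

int main()
{
    counting_mt19937 gen;
    std::normal_distribution<double> norm(0.0, 1.0);

    const int n = 1000000;
    for (int i = 0; i < n; ++i) norm(gen);

    std::cout << "average raw draws per normal variate: "
              << static_cast<double>(gen.calls) / n << '\n';
}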
 
Cuchulainn
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ normal distribution

October 19th, 2014, 2:58 pm

Quote (originally posted by Traden4Alpha):
"Maybe not. Ziggurat has stochastic latency and consumes a stochastic number of RNs, which may be unacceptable in some cases. The claims about relative speed make assumptions that are not true in practice. For example, BM on a GPU is 20X faster than Ziggurat on a CPU. [...] There's no one perfect RNG, and anyone doing important work using them must be aware of the flaws of each and every method."

3. I agree, no silver bullet. But points 1 and 2 are kind of fundamental and need to be addressed as well.

I don't see why we should compare GPUs with CPUs. The main issue was accuracy rather than efficiency; well, not completely.
Last edited by Cuchulainn on October 18th, 2014, 10:00 pm, edited 1 time in total.
 
Cuchulainn
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ normal distribution

October 19th, 2014, 3:09 pm

There are TWO variants of Box-Muller; which one are ye referring to? (The polar variant is sketched below.)

BTW QuantLib seems to use the polar BM, and qua design (structure) it is similar to Boost.Random and C++11 <random>. The interface function names are different. So what? IMO it is a big-endian/little-endian, tomato/tomayto difference.
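For reference, a minimal sketch of the polar (Marsaglia) variant, written against plain C++11 <random>. This is an illustration of the algorithm only, not QuantLib's actual code; note that it trades the sin/cos calls of the classic form for a rejection loop, so it consumes a variable number of uniforms per pair of normals:

Code:
#include <cmath>
#include <random>
#include <utility>

// Polar (Marsaglia) Box-Muller: rejection-sample a point in the unit disc,
// then transform it into two independent N(0,1) variates without sin/cos.
template <class Engine>
std::pair<double, double> polar_box_muller(Engine& eng)
{
    std::uniform_real_distribution<double> u(-1.0, 1.0);
    double x, y, s;
    do {
        x = u(eng);
        y = u(eng);
        s = x * x + y * y;
    } while (s >= 1.0 || s == 0.0);   // keep only points strictly inside the unit disc
    const double f = std::sqrt(-2.0 * std::log(s) / s);
    return { x * f, y * f };
}

// usage: std::mt19937 eng(42); std::pair<double, double> z = polar_box_muller(eng);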
Last edited by Cuchulainn on October 18th, 2014, 10:00 pm, edited 1 time in total.
 
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

sitmo C++ normal distribution

October 19th, 2014, 3:22 pm

Quote (originally posted by Cuchulainn):
"3. I agree, no silver bullet. But points 1 and 2 are kind of fundamental and need to be addressed as well. I don't see why we should compare GPUs with CPUs. The main issue was accuracy rather than efficiency; well, not completely."

Points 1 and 3 are about efficiency, right? And we seem to be most concerned with RNG performance on extremely large sample sizes, which is also when we'd be most likely to think about a GPU solution. Thus, the poor performance of Z on GPUs becomes a major downside of Z for high N.

Point 2 is certainly a major concern. Yet Ziggurat also uses sequences of RNs in a structured way that could make it prone to analogous flaws. The fact that BM has flaws is not evidence that Z lacks flaws.
 
Cuchulainn
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ normal distribution

October 20th, 2014, 5:17 am

Quote (originally posted by Traden4Alpha):
"Yet Ziggurat also uses sequences of RNs in a structured way that could make it prone to analogous flaws. The fact that BM has flaws is not evidence that Z lacks flaws."

Is it possible to 'reproduce' these issues? Or is it difficult?

This looks like a good discussion.
Last edited by Cuchulainn on October 19th, 2014, 10:00 pm, edited 1 time in total.
 
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

sitmo C++ normal distribution

October 20th, 2014, 11:32 am

Quote (originally posted by Cuchulainn):
"Is it possible to 'reproduce' these issues? Or is it difficult? This looks like a good discussion."

As with most semi-competent RNGs, hitting the issue in the first place is extremely hard; reproducing it once it is known is easier. I do not "know" that Ziggurat has problems with any particular uniform RNG algorithm, but I would not be surprised if such problems exist, because of Ziggurat's non-uniform handling of RNs.

I'd be very surprised if Ziggurat is worse than BM on structured artifacts. But who knows?!?!

But the point still remains that BM beats Ziggurat in SIMD computing architectures and hard real-time applications. It frightens me that the C# code you cited is not guaranteed to halt.

I'd also note that the C# code you cited consumes at least 3 RNs to create each Gaussian RN. If the underlying RNG is expensive (e.g., a complex algorithm, a hardware RNG, or a limited pool of RNs), then BM will be more efficient.

Which goes back to the original point that different RNGs are "best" in different situations.
 
Cuchulainn
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ normal distribution

October 21st, 2014, 10:15 am

The QL documentation says they don't use the traditional BM because it and its variants do not preserve a sequence's low discrepancy; they use Moro's inversion instead. When is it OK to use BM?

"It (BM) was an invention of great ingenuity and insight at the time, but now it has had its day." - P. Jaeckel
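To illustrate why inversion plays well with low-discrepancy sequences while BM does not: the quantile function is a monotone one-to-one map from (0,1) to the real line, so the structure of the driving sequence carries over directly. Below is a minimal sketch using Boost.Math's quantile as a stand-in for Moro's approximation (an assumption for illustration; QuantLib ships its own Moro inverse cumulative normal, and the grid here is a toy stand-in for a Sobol/Halton point set):

Code:
#include <boost/math/distributions/normal.hpp>
#include <iostream>

int main()
{
    // Inversion: map each uniform point u in (0,1) to the normal quantile Phi^{-1}(u).
    const boost::math::normal_distribution<double> gauss(0.0, 1.0);
    for (int i = 1; i <= 9; ++i)
    {
        const double u = i / 10.0;   // toy driving sequence; a Sobol/Halton point set in practice
        std::cout << u << " -> " << boost::math::quantile(gauss, u) << '\n';
    }
}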
Last edited by Cuchulainn on October 20th, 2014, 10:00 pm, edited 1 time in total.
 
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

sitmo C++ normal distribution

October 21st, 2014, 10:45 am

Quote (originally posted by Cuchulainn):
"The QL documentation says they don't use the traditional BM because it and its variants do not preserve a sequence's low discrepancy; they use Moro's inversion instead. When is it OK to use BM? 'It (BM) was an invention of great ingenuity and insight at the time, but now it has had its day.' - P. Jaeckel"

Methinks that if GPU and other SIMD architectures become more prevalent in HPC, the same will be said of Ziggurat.

Final pronouncements such as Jaeckel's seem more a mark of arrogance than intelligence.
 
Cuchulainn
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

sitmo C++ normal distribution

October 21st, 2014, 11:01 am

Quote (originally posted by Traden4Alpha):
"Methinks that if GPU and other SIMD architectures become more prevalent in HPC, the same will be said of Ziggurat. Final pronouncements such as Jaeckel's seem more a mark of arrogance than intelligence."

I think he was referring to the Neave effect, which has some basis in reality. Should BM be used? When yes and when no?
 
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

sitmo C++ normal distribution

October 21st, 2014, 11:46 am

Quote (originally posted by Cuchulainn):
"I think he was referring to the Neave effect, which has some basis in reality. Should BM be used? When yes and when no?"

The Neave effect suggests that BM may be fine as long as one does not use an LCG for the underlying RNG. Similarly, Ziggurat may have Neave-like problems if there's any structure at all in the tails of the underlying RNG.

The larger issue is that there are at least three distinct layers in most RNGs, and the layers can interact in nasty ways:
Level 1: the underlying random bit maker (e.g., LCG, MT, hardware, etc.)
Level 2: the random bit data-type transformer (e.g., transforming uniform integers to uniform floats)
Level 3: the distribution shaper (e.g., BM, Ziggurat, inverse transform, etc.)

Each of the levels has different effects on artifacts, performance, and suitability. The Neave effect is caused by interactions between choices at levels 1 and 3. The missing-tail and missing-near-zero effects that we've discussed in the past are artifacts of level 2 that may be amplified or attenuated by level 3. Whether and how one creates LDS depends on level 3. Some level 3 algorithms suck on SIMD architectures such as GPUs.

None of the choices is perfect, all have side effects or interdependencies, and it's up to the designer to make the trade-offs in choosing the elements of these three levels. (The three levels are sketched in C++ terms below.)
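A minimal mapping of the three levels onto C++11 <random>, purely as an illustration of the layering (not a claim about how any particular library composes them internally):

Code:
#include <iostream>
#include <random>

int main()
{
    // Level 1: the raw bit source (could equally be an LCG, a hardware engine, ...).
    std::mt19937_64 engine(2014);

    // Level 2: bits -> uniform reals in [0,1).
    double u = std::generate_canonical<double, 53>(engine);

    // Level 3: the distribution shaper (its internal method is implementation-defined).
    std::normal_distribution<double> gauss(0.0, 1.0);
    double z = gauss(engine);

    std::cout << "uniform: " << u << "  normal: " << z << '\n';
}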
 
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

sitmo C++ normal distribution

October 21st, 2014, 12:44 pm

Quote (originally posted by outrun):
"Ziggurat: non-constant but bounded time, fast, needs multiple independent random uniform samples."

Are you sure it's bounded? The code and pseudocode that I've seen seem to be unbounded.