Serving the Quantitative Finance Community

 
User avatar
Cuchulainn
Posts: 22924
Joined: July 16th, 2004, 7:38 am

Efficient choosing algorithm

October 5th, 2013, 4:37 pm

QuoteOriginally posted by: Traden4AlphaQuoteOriginally posted by: CuchulainnQuoteOriginally posted by: outrunYes on a functional level, but I can't look inside those calls. If these are precisely documented, what else do you need to know? O(n) Analogy: in the olde mainframe days you had to pre-estimate:a. the number of operationsb. the number of words in memory.If you know a. and b. for shuffle() then finito, yes?Assuming one trusts the documentation! That's not a problem; most developers don't document their work. Plan B: adopt NIH and write your own algorithm.
Last edited by Cuchulainn on October 4th, 2013, 10:00 pm, edited 1 time in total.
 
User avatar
Cuchulainn
Posts: 22924
Joined: July 16th, 2004, 7:38 am

Efficient choosing algorithm

October 6th, 2013, 9:07 am

QuoteOriginally posted by: outrunI expect the C++ random shuffle to have zero probability of generating an unshuffled sequence? If so then it's not usable because that will cause statistical biases in the generated sample sequences!¿ Knuth 3.4.2 Algorithm P ¿
 
User avatar
Cuchulainn
Posts: 22924
Joined: July 16th, 2004, 7:38 am

Efficient choosing algorithm

October 7th, 2013, 9:21 am

QuoteOriginally posted by: outrunQuoteOriginally posted by: CuchulainnQuoteOriginally posted by: outrunI expect the C++ random shuffle to have zero probability of generating an unshuffled sequence? If so then it's not usable because that will cause statistical biases in the generated sample sequences!¿ Knuth 3.4.2 Algorithm P ¿I still need to check this, what would you expect the outcome to be? How many times will it print "Yes", 0 or +/-1000?Not sure, new way of thinking for me Has it got to do with riffling? And this looks relevant.
Last edited by Cuchulainn on October 6th, 2013, 10:00 pm, edited 1 time in total.
 
User avatar
acastaldo
Posts: 14
Joined: October 11th, 2002, 11:24 pm

Efficient choosing algorithm

October 7th, 2013, 5:05 pm

The following is the Fisher-Yates shuffle, also known as Knuth 3.4.2 Algorithm P: /* To shuffle an array a() of n elements (indices 0..n-1): */for i from n - 1 downto 1 do j <- random integer with 0 <= j <= i exchange a(j) and a(i)end forthis is known to give each permutation with equal probabilitynotice that i may be equal to j, which causes an "empty shuffle" sometimes; this is the secret of why (1,2) is not always shuffled to (2,1), sometimes it is left the same (hopefully half the time!)
 
User avatar
Cuchulainn
Posts: 22924
Joined: July 16th, 2004, 7:38 am

Efficient choosing algorithm

October 8th, 2013, 8:12 am

QuoteOriginally posted by: outrunExcellent acastaldo, .. that's smart allowing for i to be j, and that will indeed solve it.I have opened a post on Programming to inquire if this algo is in C++ 11. BTW is it possible to devise a test to see if the algo is 'good'?
Last edited by Cuchulainn on October 7th, 2013, 10:00 pm, edited 1 time in total.
 
User avatar
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

Efficient choosing algorithm

October 8th, 2013, 1:16 pm

QuoteOriginally posted by: outrunQuoteOriginally posted by: CuchulainnQuoteOriginally posted by: outrunExcellent acastaldo, .. that's smart allowing for i to be j, and that will indeed solve it.I have opened a post on Programming to inquire if this algo is in C++ 11. BTW is it possible to devise a test to see if the algo is 'good'?Yes, a good algorithm will generate all possible permutations of the elements with equal probability. A bad algorithm will have a bias in the probabilities. Eg when supplied with a set of elements in some initial order (like a vector) a bad algorithm might *never* return the elements in their original order. That's what the snippet with the loop to 6000 tests. In one of 6 shuffles is should leave the set of 3 elements in their original order (3 elements -> 3!=6 possible orderings of the elements)There would be other tests, too. Given an input of 1 to N, the position-weighted moments of a infinite run of permutations should have easily-calculated values. Any non-uniformity of the shuffler will resolve as statistical discrepancy between the theoretical value and the computational/empirical value of those moments.
 
User avatar
Cuchulainn
Posts: 22924
Joined: July 16th, 2004, 7:38 am

Efficient choosing algorithm

October 10th, 2013, 11:02 am

QuoteThis simplifies the statistical test we have to do, right? We don't need to test the random engine, only the shuffle. We might even feed it with a fake generator and prove it's validity by examining all possible internal code paths?That's an interesting remark indeed. Then you test the shuffle w/o getting ruffled by rand(). Many applications won't need 2 * 10^7 Rns anyways.
Last edited by Cuchulainn on October 9th, 2013, 10:00 pm, edited 1 time in total.
 
User avatar
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

Efficient choosing algorithm

October 10th, 2013, 12:02 pm

QuoteOriginally posted by: CuchulainnQuoteThis simplifies the statistical test we have to do, right? We don't need to test the random engine, only the shuffle. We might even feed it with a fake generator and prove it's validity by examining all possible internal code paths?That's an interesting remark indeed. Then you test the shuffle w/o getting ruffled by rand(). Many applications won't need 2 * 10^7 Rns anyways.Only the smallest toy problems are safe with such an RNG. If N≥11, there's more than 2 * 10^7 possible shuffle orders. If rand() has only 2 * 10^7 unique values, and you have a mere N=15 items, then shuffle() will miss 99.9985% of the possible shuffle orders.