
 
User avatar
zerdna
Posts: 3856
Joined: July 14th, 2002, 3:00 am

The Matchstick Problem

July 29th, 2004, 4:09 pm

Quote: not bad for someone that doesn't know that 1/x^2 is well behaved at infinity.

If the distribution falls off like 1/x^2 or slower in the tail, its first moment is infinite, as are the higher moments. That's what I wrote to you. Most importantly, you cannot tell the asymptotic behavior of a function by looking at its graph between 1 and 100, as you claimed you can, and this asymptotic behavior is the only thing that matters for deciding whether the average is infinite or not. Whether the center of the distribution moves with N or not is irrelevant.
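For this particular problem, with a uniform break point, the ratio R = larger/smaller has density f(r) = 2/(1+r)^2 for r >= 1, which tails off exactly like 1/r^2; a quick symbolic check (a Python/sympy sketch, assuming that density) that the mean diverges:

```python
import sympy as sp

r, M = sp.symbols('r M', positive=True)

# Density of R = larger/smaller for a uniform break point: f(r) = 2/(1+r)^2, r >= 1.
f = 2 / (1 + r)**2

# The mean truncated at M grows like 2*log(M), so the full expectation diverges.
truncated_mean = sp.integrate(r * f, (r, 1, M))
print(sp.simplify(truncated_mean))          # contains a 2*log(M + 1) term
print(sp.limit(truncated_mean, M, sp.oo))   # oo
```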
Last edited by zerdna on July 31st, 2004, 10:00 pm, edited 1 time in total.
 
User avatar
alexandreC
Posts: 678
Joined: June 9th, 2004, 11:35 pm

The Matchstick Problem

July 29th, 2004, 4:57 pm

Last edited by alexandreC on October 19th, 2004, 10:00 pm, edited 1 time in total.
 
User avatar
daveangel
Posts: 17031
Joined: October 20th, 2003, 4:05 pm

The Matchstick Problem

July 29th, 2004, 6:24 pm

Agreed it's extremely dodgy, but it seems to fit my intuition that the ratio is closer to 3 than to 1. Let's call the ratio of large to small R1 and of small to large R2; then we know that E(R1.R2) = 1. We also know that E(R2) is 2ln2 - 1. Then (this is the heroic bit) E(R1) = E(R1.R2)/E(R2) = 1/(2ln2 - 1) ...
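A quick numerical check of those two ingredients (not of the heroic step itself), assuming a uniform break point; a minimal Python sketch:

```python
import math, random

# E(R2), with R2 = small/large, has the closed form 2*ln(2) - 1, about 0.386.
exact_ER2 = 2 * math.log(2) - 1

# Monte Carlo estimate of E(R2) from one million uniform break points.
n = 1_000_000
acc = 0.0
for _ in range(n):
    x = random.random()
    acc += min(x, 1 - x) / max(x, 1 - x)
mc_ER2 = acc / n

print(exact_ER2, mc_ER2)   # both about 0.386
print(1 / exact_ER2)       # the "heroic" E(R1), about 2.59, i.e. closer to 3 than to 1
```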
Last edited by daveangel on July 28th, 2004, 10:00 pm, edited 1 time in total.
knowledge comes, wisdom lingers
 
User avatar
mikebell
Topic Author
Posts: 1698
Joined: July 1st, 2003, 5:23 am

The Matchstick Problem

July 30th, 2004, 1:10 am

FWIW, zerdna's solution is closest to my way of thinking. The curves are fairly close. 3? Simulations are always in the 18-25 range.
 
User avatar
spacemonkey
Posts: 443
Joined: August 14th, 2002, 3:17 am

The Matchstick Problem

July 30th, 2004, 1:24 pm

Assume that the stick can only be broken into pieces whose sizes are integer multiples of 1/N (not including 0, since that would correspond to the stick not being broken at all), and that each size is equally likely. The ratio then takes the values (1 - i/N)/(i/N) for all integers i < N, each with probability 1/N. The expected value of the ratio of sizes is then

sum_{i=1}^{N-1} (1 - i/N)/i.

This is of course finite, and it increases without limit as N goes to infinity, which is just what you would expect since the mean under the continuous uniform distribution is infinite. If you perform a numerical simulation of this problem, this is the calculation that you are actually doing, since the computer can only produce a finite set of values from its random number generator.

Quote: 3? Simulations are always in the 18-25 range.

Strangely enough, if you assume a value for N of, say, 2^32, then this calculation gives an expected value of about 21.8. Anybody like to comment?
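A short check of that figure (a sketch of my own; the harmonic-number identity in the comments follows directly from the sum above), in Python:

```python
import math

# sum_{i=1}^{N-1} (1 - i/N)/i  =  H_{N-1} - (N-1)/N,  where H_k is the k-th harmonic number.
# For large N, H_k is well approximated by ln(k) + gamma (Euler-Mascheroni constant).
def discrete_expected_ratio(N):
    gamma = 0.5772156649015329
    H = math.log(N - 1) + gamma + 1.0 / (2 * (N - 1))
    return H - (N - 1) / N

print(discrete_expected_ratio(2**32))   # about 21.8, the value quoted above
```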
Last edited by spacemonkey on July 29th, 2004, 10:00 pm, edited 1 time in total.
 
User avatar
zerdna
Posts: 3856
Joined: July 14th, 2002, 3:00 am

The Matchstick Problem

July 30th, 2004, 2:12 pm

Quote, originally posted by spacemonkey: Assume that the stick can only be broken into pieces whose sizes are integer multiples of 1/N (not including 0, since that would correspond to the stick not being broken at all), and that each size is equally likely. The ratio then takes the values (1 - i/N)/(i/N) for all integers i < N, each with probability 1/N. The expected value of the ratio of sizes is then sum_{i=1}^{N-1} (1 - i/N)/i. This is of course finite, and it increases without limit as N goes to infinity, which is just what you would expect since the mean under the continuous uniform distribution is infinite. If you perform a numerical simulation of this problem, this is the calculation that you are actually doing, since the computer can only produce a finite set of values from its random number generator. Strangely enough, if you assume a value for N of, say, 2^32, then this calculation gives an expected value of about 21.8. Anybody like to comment?

It seems that you are implying that the random number generator can produce only 2^32 numbers, all spaced by the same step 1/N = 2^(-32). That could only be an artefact of some "faulty" algorithm, because the machine allows you to represent numbers as small as 2^(-1054). The random generator that I used, for example, is the Matlab implementation of the Marsaglia algorithm, which represents smaller numbers with finer granularity (and lower frequency). It in fact guarantees that all floating-point numbers the machine can represent between 2^(-53) and 1 - 2^(-53) are produced by the generator, and a total of over 2^1492 different numbers is produced. Note that the idea of the floating-point format is to represent numbers with varying absolute accuracy and roughly constant relative accuracy, unlike your assumption of a fixed absolute accuracy.
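An illustration of that last point with plain IEEE doubles in Python (nothing specific to Matlab's generator is assumed): the spacing between adjacent representable numbers scales with the number itself, i.e. roughly constant relative accuracy rather than a fixed absolute step.

```python
import math

# ulp(x) is the gap between x and the next representable double.
# It shrinks as x approaches zero, roughly in proportion to x.
for x in (1.0, 1e-4, 1e-8, 1e-16):
    print(x, math.ulp(x))
```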
Last edited by zerdna on July 29th, 2004, 10:00 pm, edited 1 time in total.
 
User avatar
spacemonkey
Posts: 443
Joined: August 14th, 2002, 3:17 am

The Matchstick Problem

July 30th, 2004, 8:56 pm

I am not an expert on pseudo-random number generators, but I do know that there are commonly used generators that produce only 2^32 different numbers, regardless of the machine precision. For example, the random number generators from Numerical Recipes in C are limited in this way (I think), and the built-in rand() function is limited to 32768 values. I don't know about the generator that you used, and I am not necessarily claiming that your results are wrong, but I do think this is an issue people need to take into account when trying to simulate this problem.
Last edited by spacemonkey on July 29th, 2004, 10:00 pm, edited 1 time in total.
 
User avatar
zerdna
Posts: 3856
Joined: July 14th, 2002, 3:00 am

The Matchstick Problem

July 30th, 2004, 11:57 pm

Quote: I am not an expert on pseudo-random number generators, but I do know that there are commonly used generators that produce only 2^32 different numbers, regardless of the machine precision. For example, the random number generators from Numerical Recipes in C are limited in this way (I think), and the built-in rand() function is limited to 32768 values.

NR is a textbook, not a description of actual software -- I think they use that generator as an illustration of how bad generators used to be, and write in capital letters "DO NOT USE". Now why do you say the number 2^32 has nothing to do with machine precision? NR states clearly that it is an artefact of pseudorandom algorithms based on modulo arithmetic in the case of single precision, correct? If my integers are 64 bits wide, don't you think 64 substitutes for 32 and there are 2^64 integers? What is your understanding of where the number 32 came from in the first place?

Quote: I don't know about the generator that you used, and I am not necessarily claiming that your results are wrong, but I do think this is an issue people need to take into account when trying to simulate this problem.

I know about it, and I told you: rand() from Matlab. You can look up the specifics on mathworks.com. I am not necessarily claiming my results are wrong either; I gave only a back-of-the-envelope idea and a very simple simulation output -- that hardly qualifies as results. What I am claiming is that the mode value has nothing to do with your hypothesis. You asked for comments on your result, which I gave. I am saying that your model doesn't describe random number generators in general and doesn't have the correct parameters for double-precision numbers, therefore a similar numerical value hardly proves anything. Writing this simulation takes 5 minutes, less than writing a post -- just do it. Would you mind also posting how you came up with 21 at N = 2^32 for the value of your sum?
Last edited by zerdna on July 30th, 2004, 10:00 pm, edited 1 time in total.
 
User avatar
spacemonkey
Posts: 443
Joined: August 14th, 2002, 3:17 am

The Matchstick Problem

July 31st, 2004, 4:27 pm

For a linear congruential generator the next value is given by the formula

x_{n+1} = (a*x_n + b) mod m.

It is the mod m that determines the largest integer this generator of random integers can produce. The recommended generator from NR has m = 2^31 - 1. The largest integer that can be stored in an int on a 32-bit machine is 2^32 - 1, but the reason this machine limit is unrelated is that it can be circumvented by a better random number generator, or equally totally wasted by a bad one (for example, rand() is limited to 32768 on my 32-bit machine).

Calculating the expected value using any computer simulation is equivalent to using a discrete distribution for the lengths involved. Assume that the length of the small stick can be any integer multiple of 1/2N (from 1/2N to 1/2), each with equal probability; the expected ratio is then

(1/N) * sum_{k=1}^{N} (2N - k)/k.

This is slightly different from the formula I posted before, which was slightly wrong; you can calculate the answers using Maple or Mathematica. The idea is the same. For N = 2^32 I get a value of 44.5.

ALL simulations of the expected value performed by a computer will be limited in this way. The model might not be an exact model of what is being done, since that depends on the exact nature of the simulation and the way it is done; my parameters are almost certainly wrong. The fact is that when simulating a continuous distribution on a digital computer you are in fact substituting a discrete distribution, and there are situations where that discrete distribution will not give something close to the correct answer, especially if the exact answer does not exist.
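A quick check of the 44.5 figure under the discrete model just described (a sketch; the closed form 2*H_N - 1 is my own rewriting of the sum above):

```python
import math

def brute(N):
    # Direct evaluation of (1/N) * sum_{k=1}^{N} (2N - k)/k for a modest N.
    return sum((2 * N - k) / k for k in range(1, N + 1)) / N

def closed_form(N):
    # (1/N) * sum_{k=1}^{N} (2N - k)/k = 2*H_N - 1, with H_N approximated by ln(N) + gamma.
    gamma = 0.5772156649015329
    return 2 * (math.log(N) + gamma + 1.0 / (2 * N)) - 1

print(brute(10**4), closed_form(10**4))   # the two agree to several digits
print(closed_form(2**32))                 # about 44.5, as quoted above
```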
Last edited by spacemonkey on July 30th, 2004, 10:00 pm, edited 1 time in total.
 
User avatar
daveangel
Posts: 17031
Joined: October 20th, 2003, 4:05 pm

The Matchstick Problem

July 31st, 2004, 4:29 pm

I have to say, this is what I love about this website. We start off with a matchstick problem and end up with a discussion on random number generators. Well done!!
knowledge comes, wisdom lingers
 
User avatar
spacemonkey
Posts: 443
Joined: August 14th, 2002, 3:17 am

The Matchstick Problem

July 31st, 2004, 4:29 pm

By rand() I mean the C library function.
 
User avatar
zerdna
Posts: 3856
Joined: July 14th, 2002, 3:00 am

The Matchstick Problem

July 31st, 2004, 11:08 pm

Quote: This is slightly different from the formula I posted before, which was slightly wrong; you can calculate the answers using Maple or Mathematica. The idea is the same. For N = 2^32 I get a value of 44.5.

That is what you should have gotten, because this is the same thing I calculated before, and for which the log approximation that I posted works. Now, you will get approximately 21 if you put in N = 10^5, not N = 2^32 (about 4*10^9). So do you agree that your result is completely irrelevant, or not?

Quote: ALL simulations of the expected value performed by a computer will be limited in this way. The model might not be an exact model of what is being done, since that depends on the exact nature of the simulation and the way it is done; my parameters are almost certainly wrong. The fact is that when simulating a continuous distribution on a digital computer you are in fact substituting a discrete distribution, and there are situations where that discrete distribution will not give something close to the correct answer, especially if the exact answer does not exist.

There are limitations, they are just not the ones you described. The Marsaglia algorithm, for example, generates numbers with varying absolute discreteness, as I said before.
Last edited by zerdna on July 31st, 2004, 10:00 pm, edited 1 time in total.
 
User avatar
spacemonkey
Posts: 443
Joined: August 14th, 2002, 3:17 am

The Matchstick Problem

August 1st, 2004, 3:20 pm

Irrelevant? No, not at all! Maybe it doesn't change the answer, but it is still interesting (well, to me anyway). In fact it seems to me that the non-existence of the expected value, and the fact that you can't use the central limit theorem and everything that goes with it, is the only thing that makes this an interesting question at all.

If I were going to answer the original question, I would first try to calculate the expected value, which is infinite. If I did a simulation to find the expected value I would get a meaningless result (unless I actually did something wrong in my previous posts), even with a really good random number generator, because the sample average will always exist but the expected value doesn't. So then I would try to find the most likely value of the sample average of 100,000 matches. This, I think, is what you (zerdna) did, although maybe I misunderstood that as well. I don't know whether using simulation to find this maximum is affected by the same problem. Would it matter if I was looking at a sample of 100,000,000 matches? Or 10? What if I had a really bad random number generator? I was hoping for some comment on this, and maybe some more general comments about this problem with random number generators.

For what it is worth, I did a simulation in Matlab, and I get more or less the same graph that you did, with the most likely value at about 25. My results give me an expected value for my simulated distribution of about 40-ish. Since this is wrong, how can I trust the rest of my distribution?

Am I the only person here who thinks this is interesting? Maybe I should have started a new thread.
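For anyone who wants to reproduce the experiment, a minimal sketch in Python rather than Matlab (the batch size of 100,000 follows the post above; everything else is an assumption):

```python
import random

def batch_mean(n=100_000):
    # Sample mean of large/small over one batch of uniform break points.
    total = 0.0
    count = 0
    while count < n:
        x = random.random()
        if x == 0.0:                  # skip the degenerate (unbroken) case
            continue
        small, large = min(x, 1 - x), max(x, 1 - x)
        total += large / small
        count += 1
    return total / n

# Because the true expectation is infinite, these batch means never settle down:
# they typically land somewhere in the high teens to twenties, but occasional
# batches are dragged far higher by a single extreme ratio.
print([round(batch_mean(), 1) for _ in range(10)])
```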
Last edited by spacemonkey on July 31st, 2004, 10:00 pm, edited 1 time in total.
 
User avatar
mensa0
Posts: 191
Joined: January 20th, 2004, 8:56 am

The Matchstick Problem

September 1st, 2004, 1:49 pm

I just saw this thread and find it very interesting. My first thought is, "What is the probability of getting a ratio of 1?" There's only one way, obviously: when L1 = L2 = 0.5. What is the probability of getting a ratio of, say, 1.5? If L1 = 0.6 and thus L2 = 0.4, the ratio is 1.5, but it is also 1.5 if L1 = 0.4 and L2 = 0.6; the L's just switch places. So it is twice as likely to get a ratio of 1.5 as it is to get a ratio of 1.0. All ratios other than 1.0 are equally likely in this sense, and the ratio has no upper bound (and L1 or L2 cannot be zero).

I believe the average of the ratios is unbounded above, since L1 can be arbitrarily close to one and thus L2 arbitrarily close to zero; why wouldn't the average of the ratios be unbounded as well? In the discrete formulation, the average appears to approach 20 or so, but this seems to me to be a failure of the discrete version to converge to the continuous version.

Neat problem!

Mike
 
User avatar
ScilabGuru
Posts: 297
Joined: October 16th, 2001, 2:14 pm

The Matchstick Problem

September 1st, 2004, 9:39 pm

The distribution of the shorter part can be obtained from elasticity theory. It is almost impossible to break a match right at the edge. Elasticity (I was a specialist in mechanics 20 years ago; truthfully, I don't remember the details) gives the necessary force ~ leverage, namely (1-x)/x. Having only finite strength in our fingers, we can then estimate an interval for x. The distribution will be closer to uniform if we have glass matches and just drop them on the floor. But this is trivial.
