Serving the Quantitative Finance Community

 
beata
Topic Author
Posts: 0
Joined: April 15th, 2004, 12:27 pm

minimum sample size Binomial

January 4th, 2007, 7:49 am

Hi, can someone remind me of some basic statistics? I was wondering how to determine n (the minimum sample size) for a population of 10,000 observations. I know that the observations are independent. The confidence level is 95%. The observations are binomially distributed. I want to be 95% confident that my sample has a representative number of successes for the whole population. So how small a sample size n can I choose? Thanks! B.
 
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm


January 4th, 2007, 10:02 am

What do you mean by "representative number of successes"? For any n < N (where n is the size of the sample and N is the size of the population), there will always be some statistical discrepancy between the observed number of successes in an n-sample subpopulation and the number that would be expected from the population proportion p (where N*p is the total number of successes in the entire population). For large n << N, the sigma on the sample estimate of p is SQRT(p*(1-p)/n), which implies that the sigma on the observed number of successes is roughly SQRT(n*p*(1-p)). As you can see, for n << N, the higher the n, the better the estimate of p but the larger the absolute error in the number of successes (n*p). As n approaches N, things change: both the sample estimate of p and the number of observed successes converge on the population values.
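
To make the two sigma formulas above concrete, here is a minimal Python sketch. The function name `sample_sigmas` and the value p = 0.5 are illustrative assumptions, not something stated in the thread. It only shows how the two quantities move in opposite directions as n grows (for n << N).

```python
import math

def sample_sigmas(n, p):
    """Return (sigma of the estimate of p, sigma of the success count).

    Uses the large-sample formulas quoted in the post, valid for n << N.
    """
    sigma_p = math.sqrt(p * (1 - p) / n)       # shrinks as n grows
    sigma_count = math.sqrt(n * p * (1 - p))   # grows as n grows
    return sigma_p, sigma_count

# p = 0.5 is an assumed value, chosen only for illustration.
for n in (100, 400, 1600):
    sigma_p, sigma_count = sample_sigmas(n, 0.5)
    print(f"n={n}: sigma(p_hat)={sigma_p:.4f}, sigma(successes)={sigma_count:.1f}")
```

Quadrupling n halves the sigma on the estimate of p and doubles the sigma on the raw count.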
 
beata
Topic Author
Posts: 0
Joined: April 15th, 2004, 12:27 pm


January 4th, 2007, 11:55 am

And if I say that I accept some level of discrepancy (namely, I say 95% is good enough), can I then find the minimum n?
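
One common way to turn this question into a number is to fix a tolerated margin of error E on the estimate of p and solve the normal-approximation formula for n, then apply a finite population correction because N is only 10,000. The sketch below assumes that framing. The margin E = 0.05, the worst-case p = 0.5, and the function name `min_sample_size` are all assumptions made for illustration; none of them come from the thread.

```python
import math
from statistics import NormalDist

def min_sample_size(margin, confidence=0.95, p=0.5, population=None):
    """Smallest n so the estimate of p is within +/- margin at the given
    confidence (normal approximation). p=0.5 is the worst case.
    If population is given, apply the finite population correction."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    n0 = z * z * p * (1 - p) / margin ** 2           # infinite-population size
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)        # finite population correction
    return math.ceil(n0)

# Assumed margin of 5 percentage points; N = 10,000 as in the question.
print(min_sample_size(0.05))                     # no correction
print(min_sample_size(0.05, population=10_000))  # with correction
```

A tighter margin raises n quickly, since n scales with 1/E^2.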
 
beata
Topic Author
Posts: 0
Joined: April 15th, 2004, 12:27 pm


January 4th, 2007, 2:31 pm

Sorry Traden4Alpha, I did not notice your question: What do you mean by "representative number of successes"? I should probably ask my question a bit differently: if we take a sample of size n and there are no errors in this sample, which n should I choose in order to say with 95% certainty that there are no errors in the whole population? Thanks! B.
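
One possible way to frame the zero-error question is as a detection problem: pick n so that, if the population did contain errors, a clean sample would occur with probability at most 5%. The sketch below shows two versions under that assumption. The first treats draws as binomial with an assumed tolerable error rate (1% here, purely illustrative). The second samples without replacement from N = 10,000 and asks how large n must be to catch a given number of bad items. The function names are hypothetical.

```python
import math
from fractions import Fraction

def min_n_binomial(error_rate, alpha=0.05):
    """Smallest n with (1 - error_rate)**n <= alpha, i.e. a clean sample
    would be this unlikely if the true error rate were error_rate."""
    return math.ceil(math.log(alpha) / math.log(1 - error_rate))

def min_n_hypergeometric(population, bad_items, alpha=Fraction(1, 20)):
    """Smallest n such that a sample drawn without replacement misses all
    bad_items with probability <= alpha. Exact arithmetic via Fraction."""
    prob_clean = Fraction(1)
    for n in range(1, population + 1):
        # Probability the n-th draw is also clean, given earlier draws were.
        prob_clean *= Fraction(population - bad_items - (n - 1),
                               population - (n - 1))
        if prob_clean <= alpha:
            return n
    return population

print(min_n_binomial(0.01))              # assumed tolerable error rate of 1%
print(min_n_hypergeometric(10_000, 1))   # rule out even a single error
```

Under this framing, ruling out even one error among 10,000 requires sampling 95% of the population, because a single bad item lands in the sample with probability n/N. A much smaller n is enough only if the claim is relaxed to "the error rate is below some threshold".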