October 9th, 2002, 7:03 pm
Quote
Originally posted by: palsky

Suppose you try to represent the dynamics of monthly stock returns with a GBM. An interesting question is: how much data do I need to get a correct estimate of the variance (say, to within plus or minus 0.5%), and how much monthly data do I need to estimate the mean? I've done a theoretical computation, postulating that the data "really" comes from a geometric Brownian motion, and I get rather depressing answers, such as 4,976 years for the mean. And that's not counting the obvious fact that geometric Brownian motion is a rather poor fit. Am I going wrong?

One of the properties of geometric Brownian motion is that you can (in theory) estimate the variance exactly from any sample, however short, but the error in estimating the mean is proportional to the standard deviation and inversely proportional to the square root of the length of the interval. For stock returns the monthly standard deviation is an order of magnitude higher than the mean, so you need 100 months just to get down to 100% error. A 0.5% error means 40,000 times that (200 squared), or 4 million months.

So my estimate, about 333,000 years, is roughly 70 times yours. But both are absurd: obviously we do not expect parameters to remain constant over more than a few years, even if the data were available. The only practical way to estimate mean stock returns is to use the average of many stocks over long periods of time, and to accept errors of significant fractions of the mean. No extra sampling or clever math is going to help you.

For most purposes, knowing the mean is not very important. It is what it is. That's what the market gives us. Like it or hate it, you either take that or the risk-free rate. Knowing the distribution is important for planning purposes, but the mean is of mostly academic interest.

Estimating the variance is much easier. As troywilson suggested, you can use more frequent data. Using the monthly high/low prices captures most of the efficiency of tick-by-tick data, and they are easier to get and work with.
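As a rough illustration of the high/low idea, here is a minimal Python sketch of one well-known range-based estimator, Parkinson's. Whether this is the exact estimator meant above is an assumption on my part, and the function name and the price figures are made up for the example. It also assumes driftless GBM within each month, which is only an approximation.

```python
import math

def parkinson_volatility(highs, lows):
    """Per-period volatility from high/low prices (Parkinson's range estimator).

    Hypothetical helper for illustration; assumes driftless GBM within each period.
    """
    if len(highs) != len(lows) or not highs:
        raise ValueError("need equal-length, non-empty high and low series")
    # Squared log range (ln(high/low))^2 for each period
    sq_ranges = [math.log(h / l) ** 2 for h, l in zip(highs, lows)]
    # Under driftless GBM, E[(ln(H/L))^2] = 4 * ln(2) * sigma^2,
    # so dividing the average squared range by 4 * ln(2) estimates sigma^2.
    variance = sum(sq_ranges) / (4.0 * math.log(2.0) * len(sq_ranges))
    return math.sqrt(variance)

# Made-up monthly highs and lows, for illustration only
highs = [105.0, 110.0, 108.0, 112.0]
lows = [98.0, 101.0, 100.0, 104.0]
monthly_vol = parkinson_volatility(highs, lows)
print(f"estimated monthly volatility: {monthly_vol:.4f}")
```

The result is a per-month volatility; multiplying by the square root of 12 gives an annualized figure under the same assumptions.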
You also have implied option volatility to help you.

The fact that GBM is not a good model doesn't matter much for mean estimation. It does have a significant effect on volatility estimation. On the other hand, you're probably not interested in the mathematical variance, but in some generalized measure of distribution spread. This can be pinned down pretty accurately.
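For anyone who wants to check the sample-size arithmetic for the mean, here is a small Python sketch. It assumes independent monthly returns with constant mean and standard deviation, so the standard error of the sample mean after n months is sigma / sqrt(n). The function name is made up, and the ratio of 10 is just the "order of magnitude" figure from the post, not a measured value.

```python
def months_needed(sigma_over_mu, relative_error):
    """Months of data needed for the standard error of the mean
    to equal relative_error times the mean itself.

    Hypothetical helper; assumes i.i.d. monthly returns.
    """
    # Standard error after n months is sigma / sqrt(n).
    # Setting sigma / sqrt(n) = relative_error * mu and solving for n
    # gives n = (sigma / mu / relative_error) ** 2.
    return (sigma_over_mu / relative_error) ** 2

ratio = 10.0  # assumption: monthly sigma is about 10x the monthly mean

n_full = months_needed(ratio, 1.0)      # 100% error: about 100 months
n_half = months_needed(ratio, 0.005)    # 0.5% error: about 4 million months
years = n_half / 12                     # about 333,000 years

print(round(n_full), round(n_half), round(years))
```

Because the required sample grows with the square of the precision you ask for, tightening the error by a factor of 200 costs a factor of 40,000 in data.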