April 19th, 2012, 7:51 am
Hi Brian,

Your question was actually clear, but the answer depends on your assumptions. Correlation and covariance are related statistical measures that summarize your data. You can think of correlation as the normalized covariance: the covariance divided by the product of the two standard deviations. I think the figure below shows what I was trying to explain quite well:

[Figure: Anscombe's quartet]

The image is from Wikipedia, and the data set was originally constructed by Francis Anscombe. As the associated page underlines, all four series have the same mean of y (7.5), variance of y (about 4.12), correlation (0.816) and regression line (y = 3 + 0.5x). Let's assume now that they are your successive X-Y return observations, and that the right-most X value belongs to the latest observation, whose Y you are trying to estimate. The correlation gives you an estimate of what your error will be if you do a linear regression (proof needed ;-), though for a simple linear regression the share of the variance of Y left unexplained is 1 - r^2). But as you can see, your estimation error will inevitably also depend on how the relationship evolves over time.

So if you want a short answer, use linear regression on the X-Y pairs. This means that you assume a fixed linear relationship with Gaussian errors. If you want a more robust approach that allows time-varying relationships, use a Kalman filter, which also assumes Gaussianity but updates the underlying linear model recursively. Depending on how much complexity you want to introduce, you can use an extended Kalman filter or other methods that handle non-linearity.

Whatever method or model you use, it is important to understand its assumptions and to verify that they hold for your data.

I hope this helps.
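
P.S. To make the Anscombe point concrete, here is a minimal sketch in plain Python that recomputes the summary statistics for the four series. The data values are the published Anscombe quartet; the names `QUARTET`, `summarize` and `SUMMARIES` are just my own choices for illustration. It also checks the "proof needed" remark numerically: for a simple least-squares line, the unexplained share of the variance of Y comes out as 1 - r^2.

```python
from math import sqrt

# Anscombe's quartet: series I-III share the same X values, series IV has its own.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return mean/variance of Y, Pearson r, and the least-squares line for one series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    # Residual sum of squares of the fitted line, as a share of the total.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return {
        "mean_y": mean_y,
        "var_y": syy / (n - 1),  # sample variance
        "r": r,
        "slope": slope,
        "intercept": intercept,
        "unexplained": sse / syy,
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, s in SUMMARIES.items():
        print(f"{name:>3}: mean_y={s['mean_y']:.2f} var_y={s['var_y']:.2f} "
              f"r={s['r']:.3f} line: y = {s['intercept']:.2f} + {s['slope']:.2f}x "
              f"unexplained={s['unexplained']:.3f}")
```

The four rows print nearly identical numbers even though the scatter plots look nothing alike, which is exactly why the summary statistics alone cannot tell you how good your estimate of the latest point will be.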