Serving the Quantitative Finance Community

deano1949
Topic Author
Posts: 3
Joined: December 14th, 2013, 8:36 pm

### How to find the best Bayesian posterior distribution given observed data?

Hi guys,

To get a posterior distribution in Bayesian inference, I need to assume a prior distribution and a likelihood for the observed data. Using the Fisher information, we can derive a proxy for the prior from the likelihood, which makes the assumed form of the likelihood very important. So my question is: what is the best method for choosing the likelihood? Suppose you are given a set of observed financial data and have no prior knowledge about it; how do you find the most appropriate distribution?

One method is maximum likelihood: assume the data are normally distributed, $N(x, y)$, then seek the $x$ and $y$ that maximise the probability of the data. Personally, I doubt this is the correct method, as it is a frequentist approach and might not be suitable for Bayesian inference. What method do you use to choose the likelihood?

Any comment is appreciated.

Langyu
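For reference, the maximum-likelihood step described above has a closed form when the data are assumed IID normal. The following is a minimal sketch under that assumption; the function names and the sample returns are hypothetical and chosen only for illustration.

```python
import numpy as np

def normal_mle(returns):
    """Closed-form maximum-likelihood estimates for an IID normal model."""
    r = np.asarray(returns, dtype=float)
    mu_hat = r.mean()
    # The MLE of the variance divides by n (not n - 1).
    sigma2_hat = ((r - mu_hat) ** 2).mean()
    return mu_hat, sigma2_hat

def normal_loglik(returns, mu, sigma2):
    """Log-likelihood of IID normal data at the given mean and variance."""
    r = np.asarray(returns, dtype=float)
    n = r.size
    return -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * ((r - mu) ** 2).sum() / sigma2

# Hypothetical daily returns, for illustration only.
sample = np.array([0.012, -0.008, 0.003, 0.021, -0.015, 0.007])
mu_hat, sigma2_hat = normal_mle(sample)
print(mu_hat, sigma2_hat, normal_loglik(sample, mu_hat, sigma2_hat))
```

Any other pair of parameter values gives a lower value of `normal_loglik` on the same sample, which is what "maximum likelihood" means here.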

Alan
Posts: 10718
Joined: December 19th, 2001, 4:01 am
Location: California

Your implied premise is problematic. Say your data is a return series $\{r_i\}$. Initially, people explored the notion that the $r_i$ were best described as IID draws from some univariate density $p(r)$. In that context, your question makes sense: you want to know the best distribution. But I'd say this approach was generally abandoned by the late 60's, as it failed to account for things like volatility clustering. However, if you want to catch up to the 60's, try this classic by Fama.

Nowadays, people tend to look at multivariate process models, often in continuous time, often with hidden states that must be estimated. So they don't start with a distribution, but with a parameterized process. Parameters and hidden states are estimated (ideally and normatively) by maximum likelihood or good approximations. With the voluminous time series data in finance, such inference rarely depends on priors.

P.S. I will add that priors play a strong role in the following sense. People tend to fall in love with classes of favorite processes. So, results from a particular researcher or group tend to always be from their favorite class. Given the vagaries of the whole (academic) notion that a stochastic process actually describes financial returns, and the zillions of possible processes, this is inevitable.
Last edited by Alan on January 6th, 2014, 11:00 pm, edited 1 time in total.

deano1949
Topic Author
Posts: 3
Joined: December 14th, 2013, 8:36 pm


Thanks for your comment, Alan. Time series models are the classical approach; however, their forecasting power is relatively low, they often over-fit, and they are not flexible in adjusting coefficients (they require frequent review). Therefore, I am thinking of a Bayesian approach to forecasting financial returns.

I agree with you that with a large number of observations, inference is dominated by the likelihood. However, I currently have very limited data on the instrument that I am analysing. In this case, the prior does matter. Of course, as time goes on and we collect more data, the prior becomes less important.
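The point that the prior matters with little data and fades as data accumulates can be made concrete with a conjugate example. This is a minimal sketch, assuming returns are IID normal with a known noise variance and a normal prior on the mean; the variances used and the function names are hypothetical.

```python
import numpy as np

def posterior_mean_normal(data, prior_mean, prior_var, noise_var):
    """Posterior mean and variance of a normal mean with known noise variance."""
    x = np.asarray(data, dtype=float)
    n = x.size
    post_prec = 1.0 / prior_var + n / noise_var  # precisions add
    post_mean = (prior_mean / prior_var + x.sum() / noise_var) / post_prec
    return post_mean, 1.0 / post_prec

def weight_on_data(n, prior_var, noise_var):
    """Fraction of the posterior mean that comes from the sample mean."""
    return (n / noise_var) / (1.0 / prior_var + n / noise_var)

# Hypothetical prior and noise variances, for illustration only.
prior_var, noise_var = 0.01 ** 2, 0.02 ** 2
for n in (5, 50, 5000):
    print(n, round(weight_on_data(n, prior_var, noise_var), 4))
```

The posterior mean is a precision-weighted average of the prior mean and the sample mean, so the weight on the data rises towards one as the sample size grows.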

Alan
Posts: 10718
Joined: December 19th, 2001, 4:01 am
Location: California

I see. Well, as a general comment, these forums can be quite helpful, but you need to be specific. As your question is currently posed, I doubt you'll get much that is useful. But if you said something like, just to pick a security in the news, "I am trying to estimate the pdf for Twitter's stock price in one year, but don't have much history. Have tried X, Y, and Z.", then you would probably get some good suggestions.

deano1949
Topic Author
Posts: 3
Joined: December 14th, 2013, 8:36 pm


Thanks so much for your advice. It is really helpful. I am new here and still learning the rules and culture of this forum. TBH, it is a pretty cool and useful forum; I should have discovered it a lot earlier.

Somehow, I think I found the solution. For Bayesian inference, I shouldn't maximise the likelihood of the observed data; instead, I should maximise the posterior. The method is called maximum a posteriori (MAP) estimation. Using this method, I would be able to work backward to obtain the parameters of the likelihood function that maximise the posterior. It is said that this resolves the over-fitting issue. Another good thing about the Bayesian approach is that it gives you not just a point estimate, but also a range of estimates with a confidence interval.
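To illustrate maximising the posterior and reading an interval off it, here is a minimal grid-based sketch. It assumes a normal likelihood with known noise standard deviation and a normal prior on the mean return; the data, the prior settings and the function names are hypothetical. In Bayesian usage the resulting interval is usually called a credible interval.

```python
import numpy as np

def grid_posterior(data, grid, noise_sd, prior_mean, prior_sd):
    """Normalised posterior weights for the mean on a grid of candidate values."""
    x = np.asarray(data, dtype=float)
    # Log-likelihood of the data at each candidate mean (constants dropped).
    loglik = -0.5 * ((x[:, None] - grid[None, :]) ** 2).sum(axis=0) / noise_sd ** 2
    logprior = -0.5 * ((grid - prior_mean) / prior_sd) ** 2
    logpost = loglik + logprior
    w = np.exp(logpost - logpost.max())  # subtract the max for numerical stability
    return w / w.sum()

def summarise(grid, post, level=0.95):
    """Return the MAP estimate and an equal-tailed interval from the grid posterior."""
    cdf = np.cumsum(post)
    lo = grid[np.searchsorted(cdf, (1 - level) / 2)]
    hi = grid[np.searchsorted(cdf, 1 - (1 - level) / 2)]
    return grid[np.argmax(post)], (lo, hi)

# Hypothetical daily returns and prior settings, for illustration only.
data = np.array([0.012, -0.008, 0.003, 0.021, -0.015, 0.007])
grid = np.linspace(-0.05, 0.05, 2001)
post = grid_posterior(data, grid, noise_sd=0.02, prior_mean=0.0, prior_sd=0.01)
map_est, (lo, hi) = summarise(grid, post)
mle = data.mean()
print(mle, map_est, (lo, hi))
```

With a prior centred on zero, the MAP estimate sits between the prior mean and the maximum-likelihood estimate (the sample mean).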

Alan
Posts: 10718
Joined: December 19th, 2001, 4:01 am
Location: California

outrun, yes, exactly.

deano1949, well, maximum likelihood certainly has confidence intervals. If overfitting is an issue, it suggests you are trying to predict returns ...
Last edited by Alan on January 8th, 2014, 11:00 pm, edited 1 time in total.

Cuchulainn
Posts: 65000
Joined: July 16th, 2004, 7:38 am
Location: Drosophila melanogaster

Originally posted by outrun:
> I think Alan himself did something like this for stoch vol models in an upcoming book? Indeed, estimate the parameters *and latent variables (vol)* to maximise the likelihood of observed data.

@list posted there?

chocolatemoney
Posts: 322
Joined: October 8th, 2008, 6:50 am


Originally posted by deano1949:
> Thanks so much for your advice. It is really helpful. I am new here and still learning the rules and culture of this forum. TBH, it is a pretty cool and useful forum; I should have discovered it a lot earlier. Somehow, I think I found the solution. For Bayesian inference, I shouldn't maximise the likelihood of the observed data; instead, I should maximise the posterior. The method is called maximum a posteriori (MAP) estimation. Using this method, I would be able to work backward to obtain the parameters of the likelihood function that maximise the posterior. It is said that this resolves the over-fitting issue. Another good thing about the Bayesian approach is that it gives you not just a point estimate, but also a range of estimates with a confidence interval.

Hi, thanks for posting on the topic. I don't understand why MAP would prevent over-fitting (in general and as opposed to ML). I suppose it is just a matter of degrees of freedom. As for the choice between MAP and ML, I am not an expert, but I see ML as just a special case of MAP (uniform prior). It'd be great if you could point to some resources on the topic.
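The relationship between MAP and ML mentioned here can be written out in a line or two. The notation ($\theta$ for the parameters, $D$ for the observed data, $\tau$ for a prior scale) is an assumption made for this sketch and does not come from the thread.

$$\hat\theta_{\mathrm{MAP}} = \arg\max_\theta \, p(\theta \mid D) = \arg\max_\theta \left[ \log p(D \mid \theta) + \log p(\theta) \right]$$

If $p(\theta)$ is constant (a uniform prior), the second term does not depend on $\theta$ and the estimate coincides with maximum likelihood. If instead $\theta \sim N(0, \tau^2 I)$, the log prior becomes a quadratic penalty:

$$\hat\theta_{\mathrm{MAP}} = \arg\max_\theta \left[ \log p(D \mid \theta) - \frac{1}{2\tau^2} \lVert \theta \rVert^2 \right]$$

Read this way, an informative prior acts like a regularisation term, which is the usual sense in which MAP is said to reduce over-fitting relative to ML; it does not remove it in general.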

chocolatemoney
Posts: 322
Joined: October 8th, 2008, 6:50 am


Follow-up thoughts: if your sample is small, some priors may in fact reduce the degrees of freedom of your model by constraining the parameters to a range. I missed the part where the OP mentions that he's dealing with a limited sample size.

I'd be interested in reading something more on the topic. Any recommendations?
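One standard way to see a prior reducing degrees of freedom is linear regression with a zero-mean Gaussian prior on the coefficients, where the MAP estimate is ridge regression with penalty equal to the noise variance divided by the prior variance. The sketch below assumes that setting; the simulated data and function names are hypothetical. The effective degrees of freedom are taken as the trace of the ridge hat matrix.

```python
import numpy as np

def effective_dof(X, penalty):
    """Trace of the ridge hat matrix X (X'X + penalty * I)^-1 X'."""
    s = np.linalg.svd(X, compute_uv=False)  # singular values of X
    return float(np.sum(s ** 2 / (s ** 2 + penalty)))

def ridge_map(X, y, penalty):
    """MAP coefficients under a zero-mean Gaussian prior (ridge regression)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + penalty * np.eye(p), X.T @ y)

# Small hypothetical sample: 12 observations, 5 regressors.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 5))
y = rng.normal(size=12)

for penalty in (0.0, 1.0, 10.0):
    beta = ridge_map(X, y, penalty)
    print(penalty, effective_dof(X, penalty), np.linalg.norm(beta))
```

A penalty of zero corresponds to a flat prior and recovers ordinary least squares with all five degrees of freedom; a tighter prior (larger penalty) shrinks both the coefficients and the effective degrees of freedom.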
Last edited by chocolatemoney on January 20th, 2014, 11:00 pm, edited 1 time in total.

AnalyticalVega
Posts: 2260
Joined: January 16th, 2013, 5:03 am


Originally posted by chocolatemoney:
> Follow-up thoughts: if your sample is small, some priors may in fact reduce the degrees of freedom of your model by constraining the parameters to a range. I missed the part where the OP mentions that he's dealing with a limited sample size. I'd be interested in reading something more on the topic. Any recommendations?

If the sample test data is small, you are more likely to have bias in your estimator. The amount of overfitting is a probability (PBO) that needs to be calculated.