Serving the Quantitative Finance Community

 
momentumpartners
Topic Author
Posts: 1
Joined: March 6th, 2009, 4:18 pm

copula-based variable selection

November 23rd, 2009, 5:50 pm

Hi,

I would like to select a subset of variables from a very large set, so that I can run a regression on variables that are as independent of each other as possible. More precisely, I have 500 candidate variables that I could use in a multiple regression to explain the moves of a stock A, and I want to end up with only around 20 useful ones.

To do this selection, I thought I could use a copula-based model (I don't want to use a linear regression) to characterize the dependence between my variables and then remove the sources of high dependence. I think methods such as ICA or copula component analysis are likely to achieve this, but I haven't found any concrete example or clear paper on the subject.

Could anyone tell me whether I am completely wrong, whether copulas can actually be used for this kind of selection, and whether there is any other good method for it?

Thanks a lot!
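One hedged way to make "remove the sources of high dependence" concrete: Spearman's rank correlation depends only on the ranks of the data, so for continuous variables it is a property of the copula and not of the marginals. The sketch below is not a full copula model and is only one possible reading of the question. It ranks candidates by absolute rank correlation with the target and skips any candidate that is too rank-correlated with one already kept. The function names, the 0.7 threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def rank_transform(X):
    """Replace each column by its ranks 0..n-1 (assumes continuous data, no ties)."""
    return np.argsort(np.argsort(X, axis=0), axis=0).astype(float)

def spearman_matrix(X):
    """Spearman's rho is the Pearson correlation of the column-wise ranks."""
    return np.corrcoef(rank_transform(X), rowvar=False)

def greedy_filter(X, y, max_abs_rho=0.7, max_vars=20):
    """Greedy filter (hypothetical helper): visit candidates from most to least
    rank-correlated with y, keeping one only if its |rho| with every variable
    already kept is at most max_abs_rho."""
    p = X.shape[1]
    R = spearman_matrix(np.column_stack([X, y]))
    relevance = np.abs(R[:p, p])      # |rho| between each candidate and y
    order = np.argsort(-relevance)    # most relevant first
    kept = []
    for j in order:
        if all(abs(R[j, k]) <= max_abs_rho for k in kept):
            kept.append(int(j))
        if len(kept) == max_vars:
            break
    return kept

# Synthetic illustration: column 5 is a monotone (nonlinear) transform of
# column 0, so the two share the same ranks and are redundant for this filter.
rng = np.random.default_rng(0)
base = rng.standard_normal((500, 5))
X = np.column_stack([base, np.exp(base[:, 0])])
y = base[:, 0] + 0.5 * base[:, 1] + 0.1 * rng.standard_normal(500)

kept = greedy_filter(X, y)
print(kept)
```

Because the filter works on ranks, the exponential transform of column 0 is treated as a duplicate even though its linear correlation with column 0 is well below 1. A pairwise filter like this cannot detect redundancy that only shows up jointly across three or more variables.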
 
pnrodriguez
Posts: 1
Joined: December 19th, 2008, 1:12 pm


November 24th, 2009, 8:58 pm

Hi,

If you want to use only a few variables that capture the variance of your dataset, use PCA; if you care about higher-order structure such as kurtosis, use ICA. If you want to build a multivariate distribution from given marginals, use copulas. But as I understand it, you want to eliminate irrelevant explanatory variables. There are a myriad of feature selection techniques for eliminating or filtering irrelevant variables (filters, wrappers, combinations of the two, etc.). I like to use Random Forest for these kinds of experiments, although the choice is, of course, personal (my background is in ML). Another one that I like to use is called ADHOC, Automatic Discovery in High Order Correlations. Hopefully you will be able to find a set of inputs that helps the learning method generalize to independent samples.

Hope this helps!
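As a rough illustration of the Random Forest suggestion, here is a minimal sketch using scikit-learn's `RandomForestRegressor` and its `feature_importances_` attribute. The synthetic data, the choice of 200 trees, and the "top 5" cutoff are assumptions for the example only; nothing here comes from the original poster's dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (assumption): 20 candidate variables, of which only
# columns 3 and 7 drive the target, both nonlinearly.
rng = np.random.default_rng(0)
n_obs, n_vars = 400, 20
X = rng.standard_normal((n_obs, n_vars))
y = np.sin(X[:, 3]) + 0.5 * X[:, 7] ** 2 + 0.1 * rng.standard_normal(n_obs)

# Fit the forest and rank variables by impurity-based importance.
forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X, y)
ranking = np.argsort(forest.feature_importances_)[::-1]

# Keep the highest-ranked variables (5 here; it would be ~20 out of 500
# in the setting described in the thread).
top = [int(j) for j in ranking[:5]]
print(top)
```

Impurity-based importances tend to split credit among strongly correlated inputs, so with 500 overlapping market variables it may be worth checking the ranking on held-out data before trusting it.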
 
rcohen
Posts: 8
Joined: November 15th, 2001, 12:06 pm


November 25th, 2009, 10:32 am

Quote, originally posted by momentumpartners:
...More precisely, I have 500 variables that I can use in a multiple regression, to explain the move of a stock A. But I want to select the variables I have to keep, so I have finally around 20 useful variables out of 500.

Momentumpartners,

Before you try to explain the move of stock A, can you first explain the following:

1. Why do you think you potentially need 500 (!!!) variables, or even 20, to explain the move of a stock? and
2. Unless one is gifted with some fantastic imagination, how the hell could anyone come up with 500, or even 20, independent economic/market variables to stick into a regression?
Last edited by rcohen on November 24th, 2009, 11:00 pm, edited 1 time in total.