Log of variable

JosephFrank · November 20th, 2005, 9:10 pm

Hi, I am wondering when or why we take the log of a certain independent variable when doing regressions. What is the rule of thumb? JF

Alan · November 21st, 2005, 12:49 am

Two typical rationales:
(A) There is some good reason to believe Y = A X^B times (1 + error).
(B) Y is strictly positive (volatility, a probability, a range, etc.), but the OLS assumption requires normally distributed errors. A big error will violate the positivity of Y. So use log Y.
Regards,

NamelessWonder · November 21st, 2005, 1:35 pm

Does OLS assume normally distributed errors? I thought the general assumption was that the errors are iid and independent of X (the predictor variable). Normality is commonly assumed but not required. Or am I wrong?

Alan · November 21st, 2005, 2:05 pm

You're probably right. A better statement of my point would have been that commonly chosen error distribution models may violate positivity. Regards,
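The positivity point can be checked numerically. The sketch below assumes a hypothetical positive quantity with level 0.05 and compares normal errors added in levels with normal errors added on the log scale. The noise scales are arbitrary choices for illustration.

```python
import math
import random

random.seed(1)

# Hypothetical setup: a strictly positive quantity with a true level of 0.05
TRUE_LEVEL = 0.05
NOISE_SD = 0.04      # assumed additive noise scale, large relative to the level
LOG_NOISE_SD = 0.8   # assumed noise scale on the log scale
N = 10_000

# Additive normal errors: Y = level + error can dip below zero
additive = [TRUE_LEVEL + random.gauss(0.0, NOISE_SD) for _ in range(N)]

# Normal errors on the log scale: Y = exp(log(level) + error) is always positive
log_scale = [
    math.exp(math.log(TRUE_LEVEL) + random.gauss(0.0, LOG_NOISE_SD))
    for _ in range(N)
]

n_nonpositive_additive = sum(y <= 0 for y in additive)
n_nonpositive_log = sum(y <= 0 for y in log_scale)

print("non-positive draws, additive model:", n_nonpositive_additive)
print("non-positive draws, log-scale model:", n_nonpositive_log)
```

Under these assumptions a noticeable share of the additive draws is negative, while the log-scale model cannot produce a non-positive value by construction.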

Felpeyu · November 21st, 2005, 2:55 pm

OLS, per se, doesn't need normality of the residuals, but if you have to do inference (for example a t-test or F-test) you'll need that assumption. Another reason for using the log transformation is to avoid the heteroskedasticity problem that appears when the variance isn't constant. In that case, the log transformation is a possible solution.

Aaron · November 21st, 2005, 4:53 pm

In addition to the theoretical answers given below, the log transformation is often used for data analytic reasons. If you start your analysis by plotting your data, you look at the shape. If it curves, suggesting y = x^2 or y = exp(x), you try transforming the dependent variable with a square root or a log, or transforming the independent variable with a square or exponential. You can use other transformations, like cube roots or more complicated ones, but for most practical data analysis, simple is good unless you have a theoretical reason to make it complicated. Remember that the direction of the curve doesn't matter: a downward sloping curve might be y = -x^2 or y = -exp(x). If you can estimate or extrapolate an x value that gives a minimum y, you can make that the zero point before transforming.

KL · November 23rd, 2005, 10:09 am

If I am not mistaken, a log variable also has a different interpretation. The variable is no longer interpreted in levels but as a rate of change (%). So for a semi-log function Y = b ln(X1) + c, a 1% change in X1 leads to a change of roughly b/100 units in Y. We usually use logs when we wish to interpret the values as rates of change and not as levels.
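The semi-log interpretation can be verified with simple arithmetic. In this sketch the slope `B`, the intercept `C`, and the starting value `x0` are hypothetical numbers chosen only to make the calculation concrete.

```python
import math

# Hypothetical semi-log model: Y = B * ln(X) + C
B, C = 250.0, 10.0


def y_of(x):
    """Evaluate the assumed semi-log model at x."""
    return B * math.log(x) + C


x0 = 40.0  # arbitrary starting value; the result does not depend on it

# Exact change in Y when X rises by 1%: B * ln(1.01)
exact_change = y_of(1.01 * x0) - y_of(x0)

# Rule-of-thumb approximation: B / 100, since ln(1.01) is close to 0.01
approx_change = B / 100.0

print(f"exact change in Y:  {exact_change:.4f}")   # about 2.49
print(f"approximate change: {approx_change:.4f}")  # 2.5
```

The approximation works because ln(1 + p) is close to p for small p. It gets worse for larger percentage changes.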