wh408
Topic Author
Posts: 3
Joined: September 10th, 2009, 11:53 am

cost function in linear regression

November 14th, 2014, 8:47 am

Hi guys, I am not sure if this question has been asked before. Aside from computational reasons, why don't we minimize the perpendicular distance between the points (x, y) and the straight line that we are trying to estimate? The usual cost function minimizes the squares of the vertical distances. If we swap x and y, the cost function minimizes the squares of the horizontal distances. Thus the usual cost function is "directional", which is not always reasonable in my view, as sometimes I do not want to designate either x or y as the dependent or explanatory variable.
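
To make the question concrete, here is one way to write the three candidate cost functions for a line y = a + b x (the notation is illustrative, not from the post):

    vertical (the usual OLS):        \sum_i (y_i - a - b x_i)^2
    horizontal (fitting x = c + d y): \sum_i (x_i - c - d y_i)^2
    perpendicular:                   \sum_i \frac{(y_i - a - b x_i)^2}{1 + b^2}

The last form follows because the perpendicular distance from (x_i, y_i) to the line is |y_i - a - b x_i| / \sqrt{1 + b^2}.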
 
bearish
Posts: 5906
Joined: February 3rd, 2011, 2:19 pm

cost function in linear regression

November 14th, 2014, 10:46 am

What you describe is known as total least squares (or Deming) regression, and it is popular in some fields. It really depends on what you assume about the structure of the errors in your sample/model.
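
For reference, the standard Deming slope estimate, in the usual notation (not from the post): with sample moments s_{xx}, s_{yy}, s_{xy} and \delta = \sigma_\varepsilon^2 / \sigma_\eta^2 the assumed ratio of the error variance in y to the error variance in x,

    \hat{b} = \frac{s_{yy} - \delta s_{xx} + \sqrt{(s_{yy} - \delta s_{xx})^2 + 4 \delta s_{xy}^2}}{2 s_{xy}}

Setting \delta = 1 gives orthogonal (total least squares) regression, while \delta \to \infty recovers ordinary least squares of y on x.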
 
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

cost function in linear regression

November 14th, 2014, 12:26 pm

Bearish has the answer for a type of regression that does not ascribe the cost function to one of the two directions. It minimizes the squares of the orthogonal distances from the line, whatever its orientation. The reason to pick one direction over the other is tied to beliefs, knowledge, or data about the quality of the measurements of x and y in the context of the model. If you believe that x is imposed, known, or measured perfectly, but y may be subject to some kind of noise or influences, then choose the y = f(x) formulation with the vertical-distance cost function. If y is perfect but x is noisy, then use x = f(y). If both are noisy, then use Deming.
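
A minimal numpy sketch of the three cases above (the synthetic data, noise scales, and variable names are illustrative assumptions; the orthogonal fit uses the standard principal-component construction, which coincides with Deming regression when the two error variances are equal):

    import numpy as np

    rng = np.random.default_rng(0)

    # Latent line y = 2 + 0.5 x, observed with independent noise in both coordinates.
    x_true = np.linspace(0.0, 10.0, 500)
    y_true = 2.0 + 0.5 * x_true
    x = x_true + rng.normal(scale=0.5, size=x_true.size)  # noisy x
    y = y_true + rng.normal(scale=0.5, size=y_true.size)  # noisy y

    s_xy = np.cov(x, y, bias=True)[0, 1]

    # 1) y = f(x): OLS of y on x, minimizing squared vertical distances.
    b_yx = s_xy / np.var(x)

    # 2) x = f(y): OLS of x on y, re-expressed as a slope in the (x, y)
    #    plane, minimizing squared horizontal distances.
    b_xy = np.var(y) / s_xy

    # 3) Orthogonal (total least squares): minimizes squared perpendicular
    #    distances; the fitted direction is the leading eigenvector of the
    #    2x2 covariance matrix of the data.
    eigvals, eigvecs = np.linalg.eigh(np.cov(x, y, bias=True))
    vx, vy = eigvecs[:, np.argmax(eigvals)]
    b_tls = vy / vx

    print(b_yx)   # pulled toward zero by the noise in x (attenuation)
    print(b_xy)   # pushed away from zero
    print(b_tls)  # lies between the other two

The eigenvector route is equivalent to taking the first principal component of the centered data; an SVD of the centered data matrix gives the same direction.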