January 3rd, 2018, 5:07 pm
Hi @outrun. Thanks. Two questions.
1) As I understand this, in this example the matrix to be factored or inverted isn't square.
2) While the problem is linear with squared error, I may add weightings to the errors in the future, so keeping it generic is good.
I am trying to get around the size limitation with GPR. Specifically:
I have 100000 or more training patterns. There is a lot of noise in these. Instead of letting these be the model, I will choose 1000 points (I'll call these support points), perhaps using low-discrepancy numbers for their x vector coordinates. Their x positioning will be fixed; however, I'll try to optimise their y values to give me an optimal objective function (in this case squared error) over the 100000 or more training points.
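A quick sketch of the low-discrepancy idea, assuming a Halton sequence via SciPy's qmc module (the choice of Halton and the 3-D input space are my assumptions, not from the post):

```python
from scipy.stats import qmc

# Hypothetical: 1000 fixed support locations in an assumed 3-D input
# space, drawn from a Halton low-discrepancy sequence so they cover
# the unit cube more evenly than uniform random draws would.
sampler = qmc.Halton(d=3, seed=0)
support_x = sampler.random(1000)   # shape (1000, 3), coords in [0, 1)
```

Any other low-discrepancy generator (Sobol, etc.) would do just as well; the point is only that the support x's are fixed up front.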
With GPR, the forecasting step is, yhat = k* inv(K) y. Now, if I keep the x vectors of my support points fixed, and optimise for their y, then k* and K will be constant. In my case, the dimensions will be:
(100000x1) = (100000x1000) (1000x1000) (1000x1)
I intend to replace inv(K) y with M. M is an unknown column vector of 1000 entries. Given that k* is known and fixed, I can fit yhat to y by finding the optimal M. Once I'm happy with this, I can recover the y values of my 1000 support points by multiplying back: y = K M.
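The whole scheme can be sketched end to end. This is a toy-sized version under assumptions of mine: an RBF kernel, uniform random training inputs, and CG on the normal equations (k*' k*) M = k*' y, which are symmetric positive definite so ConjugateGradient applies, as the post suggests. None of the specific kernel or sizes here are from the post beyond the roles of k*, K, M.

```python
import numpy as np
from scipy.sparse.linalg import cg

rng = np.random.default_rng(0)

# Toy sizes (the post uses 100000 training points and 1000 supports).
n_train, n_support, d = 2000, 50, 2

X = rng.uniform(size=(n_train, d))                              # training inputs
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(n_train)   # noisy targets
Xs = rng.uniform(size=(n_support, d))                           # fixed support x's

def rbf(A, B, ell=0.3):
    # Squared-exponential kernel; an assumed choice of kernel.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

k_star = rbf(X, Xs)   # (n_train x n_support), constant once Xs is fixed
K = rbf(Xs, Xs)       # (n_support x n_support), also constant

# Fit M in yhat = k_star @ M by least squares via the normal equations;
# a small jitter keeps the SPD system well conditioned for CG.
A = k_star.T @ k_star + 1e-8 * np.eye(n_support)
b = k_star.T @ y
M, info = cg(A, b)

yhat = k_star @ M          # fitted predictions at the training x's
# M plays the role of inv(K) y_s, so the support targets come back as:
y_support = K @ M
```

The key saving is that K is only 1000x1000 and never needs inverting during the fit; the optimisation is over the 1000-vector M, and y_support is one matrix multiply at the end.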
I suppose this is a combination of SVM and GPR: use hundreds of thousands of data points to find the y values of a handful (a hundred or a thousand) of support points that fit the data best.
I know @outrun is across GPR. Think that would work? My initial prototyping looks ok. I am using ConjugateGradient so far.
Apologies for my non-Tex maths.