@"Concerning your comment over literature, I could not say: I did not read Finnegans Wake."

That's like saying that Finnegans Wake and Fifty Shades of Grey are the same thing, because they both contain English words.
Eq. 5.1 in Bishop's book concerns regression problems. In that case you're not optimising the x_i, because they are fixed - they are the data. You're optimising the weights, or changing the kernel itself.
Once you depart from the regression problem and start thinking about classification, for example, the kernel-NN analogy breaks down.
Kernel methods also enforce conditions on kernel functions (symmetry, positive semi-definiteness) that we don't require of the functions an NN represents. Kernel functions live in different functional spaces.
It's not enough for the symbols to be exchangeable!
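To make the regression point concrete, here is a minimal sketch of kernel ridge regression (a toy of my own with a Gaussian kernel, not anybody's production code): the sample points x_i never move, and the only thing solved for is the weight vector.

```python
# Toy kernel ridge regression: the x_i are fixed data, only weights move.
import numpy as np

def gaussian_kernel(x, y, h=1.0):
    """Fixed-shape kernel k(x, y) = exp(-(x - y)^2 / (2 h^2))."""
    return np.exp(-(x[:, None] - y[None, :]) ** 2 / (2.0 * h ** 2))

rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=50)            # fixed sample points x_i
y = np.sin(x) + 0.1 * rng.standard_normal(50)  # noisy targets

lam = 1e-3                                     # ridge regularisation
K = gaussian_kernel(x, x)
alpha = np.linalg.solve(K + lam * np.eye(x.size), y)  # optimise weights only

x_new = np.linspace(-3.0, 3.0, 200)
y_hat = gaussian_kernel(x_new, x) @ alpha      # predictions at new points
```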
@"Cuchullain, I am not developing the BET (Big Everything Theory). I am just saying that it probably exists, and that kernels could do the job."

I think this idée fixe on a grand unifying theory is leading nowhere. Give it up. It's a distraction. There are bigger challenges.
@"We did so many tests for this method... Concerning this very precise question, I already answered above: see this post for instance."

Bear in mind that people use the same set of MC paths to evaluate a whole portfolio, too. This is not just a performance optimisation; it also improves the accuracy of the hedging (less noise).

@"Yes. For instance, in this post you can see this effect: the sample points are computed within 20 secs (a 12-dimensional process, AFAIR). Then a portfolio of 10,000 complex products is evaluated in less than 10 secs. I understand your last paragraph as saying that I could price different derivative payoffs for the same underlying price process, right?"
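For concreteness, here is a minimal sketch of the path-sharing point (an illustrative Black-Scholes toy of mine, not anybody's production pricer): the paths are simulated once, and every payoff in the portfolio is evaluated on the same set, so the per-option cost is one vectorised payoff evaluation and common noise cancels across the book.

```python
# One set of Monte Carlo paths, reused for a whole strip of options.
import numpy as np

rng = np.random.default_rng(0)
S0, r, sigma, T = 100.0, 0.01, 0.2, 1.0
n_paths = 100_000

# Simulate terminal prices once (Black-Scholes, exact scheme).
Z = rng.standard_normal(n_paths)
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

# Price every call in the portfolio on the SAME paths.
strikes = np.linspace(80.0, 120.0, 41)
payoffs = np.maximum(ST[None, :] - strikes[:, None], 0.0)
prices = np.exp(-r * T) * payoffs.mean(axis=1)
```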
Did you run large-scale tests? Take a portfolio of 10,000 options. Price them to the same level of error using a) your method and b) standard Monte Carlo tricks. What are the computation times for a) and b)?
@"I put an equality sign because both approaches are equivalent. To be really clear, it would be better for AI guys to notice this equivalence right now, if they plan one day to compute proper error estimates, or to avoid bothering future generations with unnecessary complexity... Quoting your own words, PDE methods have 'shapeshifting' properties."
I don't think I wrote anything like that. I don't even know what it would mean. I wrote that parameterised functions (here NNs) have, metaphorically speaking, shapeshifting properties - as opposed to fixed-shape kernels. That's the essential property of NNs, which motivates many of their applications.
The way I see it, there are different approaches to solving optimisation problems, and they often determine the choice of techniques for their implementation. To state the obvious: since the problem being solved doesn't change, the approaches are equivalent up to a certain level of detail. I wouldn't put an equality sign between your method and an NN optimisation, though, e.g. because your method expects the positions of particles…
@"Ok. But we are both mathematicians: give me a problem that one method can solve, and the other cannot."

There's at least one person from the AI community here who's trying to tell you that you're mistaken about the equivalence and explaining clearly why (however difficult it is to pin down what you mean).
@"This is a LinkedIn post. The MC pricer is basically an encapsulation of the Boost libraries. Computation time is not very important there, as we were looking for a reference price."

Your LinkedIn post only gives the computation time for your method; it doesn't give the computation time for a Monte Carlo run that achieves the same accuracy. And it's not clear from it how you implemented the MC pricer.
This sounds naive: "The overall error on the entire portfolio of 1024 AutoCall is 0.2% (relative error on price), which corresponds to a convergence factor of rate 1 / N with N = 512". What about the constant factor? A convergence rate of O(1/N) describes how the error eps(N) scales with N; it does not mean eps = 1/N for one particular pair of eps and N.
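Spelled out (my notation, not from the post): a rate statement is an asymptotic scaling,

```latex
\varepsilon(N) \approx C\,N^{-p}
\quad\Longleftrightarrow\quad
\log \varepsilon(N) \approx \log C - p \log N ,
```

so a single pair (N, eps) gives one equation in the two unknowns C and p and cannot pin down the rate.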
And first you're saying that your method converges at rate 1/N^2, but then you report rate 1/N for an AutoCall portfolio.
And buried in a footnote is the shocking news that your exciting 1/N^2 convergence rate does not hold for bounded variation functions, a class which includes most reasonable payoff functions one can think of (e.g. (S(T) - K)+)! So for all practical purposes, your method has the same convergence properties as Sobol numbers...
I think the problem you have in selling your method comes from the fact that you're overselling it and not presenting its strengths and limitations clearly enough.
@"Could you develop?"

It seems you're in a catch-22.
@"your method has the same convergence properties as Sobol numbers..."

Interesting remark. Yes, you are correct: these methods also converge at rate (ln N)^{D-1}/N if you consider a kernel generating the functional space corresponding to the Koksma-Hlawka inequality. Don't expect a miracle: the Koksma-Hlawka estimates are already optimal; you can't expect a sampling method to beat them.
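For readers following along, the standard statement being invoked here, in textbook form (my notation): for a point set x_1, ..., x_N in [0,1]^D,

```latex
\left| \frac{1}{N} \sum_{i=1}^{N} f(x_i) - \int_{[0,1]^D} f(u)\,\mathrm{d}u \right|
\;\le\; V_{\mathrm{HK}}(f)\, D_N^{*}(x_1, \dots, x_N),
```

where V_HK(f) is the Hardy-Krause variation of f and D_N^* is the star discrepancy of the point set; well-chosen point sets achieve D_N^* = O((ln N)^{D-1}/N), which is where the quoted rate comes from.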
@"The answer is in the post: there are only 10, 11, 12 different reference prices in this test, but with 1024, 2048, 4096 combinations of underlyings for D = 10, 11, 12. Thus I ran the Monte Carlo pricer 10, 11, 12 times."

Computational time is very important for people who need to deliver risk numbers to the boss at 7am every morning. These are the people you're selling your stuff to.
And you didn't answer my question (again): did you run the MC pricer separately for every option, or once for the whole portfolio? It affects not only the computational time, but also the accuracy...
@"We did both: it is proven theoretically, and we also tested it numerically on a lot of examples. I need to publish all this."

Well, my criticism was even simpler: it's just not accurate or correct to say "0.2% (relative error on price), which corresponds to a convergence factor of rate 1 / N with N = 512", even if, in fact, 1/512 ~= 0.002. You CANNOT estimate a convergence rate from a single value of N. Either you prove it analytically (in the limit N -> infinity), or you run a series of N's and fit a linear function to log error as a function of log N.
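A minimal sketch of the fitting procedure I mean (the errors below are synthetic, generated with a known rate p = 1; none of these numbers come from the post):

```python
# Estimate an empirical convergence rate from a SERIES of N's:
# fit log(error) vs log(N); the slope estimates -p in eps(N) ~ C * N^-p.
import numpy as np

Ns = np.array([128, 256, 512, 1024, 2048, 4096])
rng = np.random.default_rng(0)
# Synthetic errors with C = 3, p = 1, plus multiplicative noise.
eps = 3.0 / Ns * np.exp(0.05 * rng.standard_normal(Ns.size))

slope, intercept = np.polyfit(np.log(Ns), np.log(eps), 1)
print(f"fitted rate p ~ {-slope:.2f}, constant C ~ {np.exp(intercept):.2f}")
```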
@"No: a call option is a function having a gradient of bounded variation. It is one order smoother than a barrier option. Rephrasing: the gradient of a call option is a barrier option."

You seem to be contradicting your LinkedIn post now (or I don't understand something). In it you wrote of "a bounded variation function, a function class for which we know that the convergence rates of a sampling method can not exceed 1 / N, not 1/N^2". The payoff of a call option, (S - K)+, is a bounded variation function, hence the convergence rate should be limited to 1/N.
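In formulas, the smoothness claim under discussion (my rendering, not a quote):

```latex
f(S) = (S - K)^{+}, \qquad f'(S) = \mathbf{1}_{\{S > K\}} .
```

The derivative is a step function - essentially a digital/barrier-type payoff - and is of bounded variation; f itself is Lipschitz, and also of bounded variation on any bounded range of S. Whether the binding class for the 1/N bound is "BV functions" or "functions with a BV gradient" is exactly what the disagreement above turns on.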