Here is an interesting paper for friends relevant to this thread: *An Artificial Neural Network Representation of the SABR Stochastic Volatility Model* by William McGhee.

- Cuchulainn
**Posts:** 58095 **Location:** Amsterdam

Unfortunately, the run-time performance of both the ANN (3 hours) and FDM (20 days!) is disappointing. Must be chalking up a yuge electricity bill?

With 'traditional' FDM, popular consensus says it can be done in 15 minutes.

IMO, I am becoming more and more convinced that NN and PDE don't mix, really. Maybe that 'Eureka moment' will come...

For this kind of problem, I would expect Lewis/Papadoupolis FDM to be a good baseline.

https://arxiv.org/abs/1801.06141

- FaridMoussaoui
**Posts:** 356

I just skimmed the paper.

Aren't the 475 hours for solving 300k FDM problems, since the goal was to generate 3 million volatilities (10 at a time)?
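A quick back-of-the-envelope check of these figures, as a sketch only. It takes the 475 hours, 300k solves and 10 volatilities per solve quoted in this thread at face value; the variable names are made up for illustration.

```python
# Sanity check of the run-time figures quoted in the thread.
# All inputs are the numbers mentioned in the posts, not re-derived from the paper.
fdm_hours = 475          # total FDM run time quoted above
n_solves = 300_000       # number of FDM solves
vols_per_solve = 10      # volatilities produced per solve

total_vols = n_solves * vols_per_solve            # total volatilities generated
days = fdm_hours / 24                             # total run time in days
sec_per_solve = fdm_hours * 3600 / n_solves       # average seconds per FDM solve

print(f"{total_vols} vols, {days:.1f} days, {sec_per_solve:.1f} s per solve")
```

So the "20 days" is the whole training-set generation, which works out to a few seconds per individual FDM solve.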

- Cuchulainn

Update

Section 4.3

1. The 1st and 2nd derivatives are not calculated from the NN directly (by AD, whatever)... no reason is given. I suspect it is difficult to implement etc.

2. Instead, the author uses cubic splines and their derivatives, something I did some work on a while back for FEM.

OK then, mathematically, each time you differentiate a spline it gets worser and worser (overshoot) as I discuss in mathematical detail and in numbers in the recent 2nd edition of my C++ book.
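To put a number on this, here is the standard textbook estimate for cubic spline interpolation. It is a sketch in notation introduced here, not taken from the paper or the book: [$]f[$] is assumed to have four continuous derivatives, [$]s[$] is its clamped cubic spline interpolant, [$]h[$] is the largest mesh width and [$]C_r[$] are constants independent of [$]h[$].

[$]\| (f - s)^{(r)} \|_\infty \le C_r \, h^{4-r} \, \| f^{(4)} \|_\infty, \quad r = 0, 1, 2[$]

Each differentiation costs one power of [$]h[$]: fourth order for the values, third for the first derivative, second for the second derivative. This bound describes the loss of convergence order only; overshoot between nodes is a separate effect.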

Two screenshots coming up.

Last edited by Cuchulainn on December 1st, 2018, 11:05 pm, edited 8 times in total.

- Cuchulainn

Thanks to @billy7 who asked me to clarify the small discrepancies (and my precise answer).

- FaridMoussaoui

You can have a look at their NN library on GitHub, built on top of TensorFlow.

https://github.com/deepmind/sonnet

Or visit their shared libraries at https://github.com/deepmind

I know their libraries pretty well

- Cuchulainn

Are they Bayesian or non-Bayesian? Probably black boxes...

Of course, genome data is not finance data.

- katastrofa
**Posts:** 7042 **Location:** Alpha Centauri

That's nothing impressive. They used the full power of Google data centres and solved it by brute force. That's why emails worked so slowly. (That was, obviously?, a joke. I don't know what resources they used compared to their competition, but that's something that should be taken into account.)

Last edited by katastrofa on December 4th, 2018, 6:55 am, edited 1 time in total.

- FaridMoussaoui

They are open source, written in Python. They moved from Torch7 (another open-source NN project) to TensorFlow (by Google).

- Cuchulainn

So, NNs can describe but cannot explain. If so, where's the AI part?

- katastrofa

The "Why?", a.k.a. the deductive-nomological model of reasoning from classical physics, doesn't apply to statistics. It's full of hindsight bias.

- Cuchulainn

Let's take a step backwards.

Gradient descent methods as we know and love them are nothing more than Euler's (ugh) method applied to ODEs (aka *gradient system*):

[$]dx/dt = -\mathrm{grad}\, f(x) = -\nabla f(x)[$] where [$]f[$] is the function to be minimised.

The local minima of [$]f[$] are the stable *critical points* of this ODE system (Poincaré-Lyapunov-Bendixson theory).

I have tried it on a number of benchmark unconstrained optimisation problems and solved them using C++ boost::odeint without any of the infamous learning rate fudges.

We use more robust ODE solvers than Euler which means that we don't get the well-documented issues with GD methods.
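The correspondence can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the boost::odeint experiments mentioned above: the quadratic test function, the step size and all function names are made up for the example, and a hand-written fixed-step RK4 stands in for a "more robust" solver. An adaptive solver would additionally pick the step size itself.

```python
# Sketch: gradient descent is explicit Euler applied to dx/dt = -grad f(x).
# The test function and every name below are illustrative assumptions.

def grad_f(x):
    """Gradient of f(x, y) = 0.5 * (x**2 + 25 * y**2), a mildly stiff quadratic."""
    return [x[0], 25.0 * x[1]]

def rhs(x):
    """Right-hand side of the gradient system dx/dt = -grad f(x)."""
    return [-g for g in grad_f(x)]

def gd_step(x, lr):
    """Textbook gradient descent update with learning rate lr."""
    return [xi - lr * gi for xi, gi in zip(x, grad_f(x))]

def euler_step(x, h):
    """One explicit Euler step of size h for the gradient system."""
    return [xi + h * ki for xi, ki in zip(x, rhs(x))]

def rk4_step(x, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(x)
    k2 = rhs([xi + 0.5 * h * k for xi, k in zip(x, k1)])
    k3 = rhs([xi + 0.5 * h * k for xi, k in zip(x, k2)])
    k4 = rhs([xi + h * k for xi, k in zip(x, k3)])
    return [xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def integrate(step, x0, h, n_steps):
    """Apply the given one-step method n_steps times starting from x0."""
    x = list(x0)
    for _ in range(n_steps):
        x = step(x, h)
    return x

x0 = [1.0, 1.0]
h = 0.1  # plays the role of the learning rate

# A GD step and an Euler step with the same h are the same update.
same_update = gd_step(x0, h) == euler_step(x0, h)

x_euler = integrate(euler_step, x0, h, 200)
x_rk4 = integrate(rk4_step, x0, h, 200)
print(same_update, x_euler, x_rk4)
```

With [$]h = 0.1[$] the Euler/GD iterate blows up in the stiff coordinate because [$]|1 - 25h| = 1.5 > 1[$], while RK4 with the same step contracts towards the minimum at the origin. RK4 still has its own stability limit; it is simply larger than Euler's for this problem.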

For SGD, you might ask how it is the discrete equivalent of an SDE.
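One common heuristic way to write this down, offered as a sketch rather than a rigorous statement. The symbols are introduced here for illustration: [$]\eta[$] is the learning rate, [$]\xi_k[$] is zero-mean gradient noise with covariance [$]\Sigma[$], [$]Z_k[$] is a standard normal vector and [$]W_t[$] is a Brownian motion.

[$]x_{k+1} = x_k - \eta \left( \nabla f(x_k) + \xi_k \right)[$]

[$]dX_t = -\nabla f(X_t)\, dt + \sqrt{\eta}\, \Sigma^{1/2}\, dW_t[$]

The Euler-Maruyama scheme for the second line with time step [$]\Delta t = \eta[$] gives [$]X_{k+1} = X_k - \eta \nabla f(X_k) + \eta\, \Sigma^{1/2} Z_k[$], which has the same drift and the same noise covariance [$]\eta^2 \Sigma[$] as the SGD update, assuming the noise is treated as Gaussian.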
