### Re: If you are bored with Deep Networks

https://forum.wilmott.com/

Posted: **September 11th, 2018, 5:50 pm**

ISayMoo wrote: Not really. I'm getting slightly tired of this ping-pong to be honest

More ping, less pong.

Posted: **September 12th, 2018, 6:45 am**

Cuchulainn wrote: ISayMoo wrote: Not really. I'm getting slightly tired of this ping-pong to be honest

Mostly pong to date, I'm waiting on some ping.

I'm waiting for your comments about probabilistic line search. And for the results of your and Paul's project of teaching the NNs the differential operator.

Posted: **September 12th, 2018, 9:59 am**

ISayMoo wrote: Cuchulainn wrote: ISayMoo wrote: Not really. I'm getting slightly tired of this ping-pong to be honest

Mostly pong to date, I'm waiting on some ping.

I'm waiting for your comments about probabilistic line search. And for the results of your and Paul's project of teaching the NNs the differential operator

I am waiting for your numerical experiments on the first issue. This thread is not a project and there is no project leader.

// I recall your post

OK, so this looks like a reasonably decent paper: Probabilistic Line Searches for Stochastic Optimization

They discuss convergence guarantees briefly in Sec. 3.4.1. Experiments look encouraging, but I'd like to test them on something more challenging.

//

What are your findings, ISM? I have some views, but first yours!

Posted: **September 12th, 2018, 3:15 pm**

I read the paper and I think it's a decent, well-tested idea. I didn't have the time to run numerical experiments.

Posted: **September 12th, 2018, 3:41 pm**

There's also Pang.

Posted: **September 13th, 2018, 6:33 am**

He's building self-driving cars now. Let's not talk about Pang.

Posted: **September 13th, 2018, 2:32 pm**

ISayMoo wrote: I read the paper and I think it's a decent, well-tested idea. I didn't have the time to run numerical experiments.

Assuming you had time, what is the main technical challenge?

1. Code for the cubic spline and the bivariate normal (BVN) distribution (I have them in my new C++ book)

2. Section 3.1 is not clear to me (what's a Gaussian process prior?)

3. Putting it all together

4. Other?

?

Posted: **September 13th, 2018, 9:04 pm**

"What's a Gaussian process prior?"

Bayesian prior for a Gaussian process.
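For the impatient, a minimal sketch of what that means in practice: a Gaussian process prior is a distribution over functions such that the values at any finite set of points are jointly Gaussian, with the covariance given by a kernel. The squared-exponential kernel, the zero mean, and the names `rbf_kernel` and `sample_gp_prior` are illustrative assumptions here, not taken from the paper (which uses its own kernel choice).

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance k(x, x') between two 1-D point sets (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def sample_gp_prior(x, n_samples=3, seed=0, jitter=1e-6):
    """Draw functions from a zero-mean GP prior, evaluated on the grid x."""
    rng = np.random.default_rng(seed)
    # Small diagonal jitter keeps the Cholesky factorisation numerically stable.
    cov = rbf_kernel(x, x) + jitter * np.eye(len(x))
    chol = np.linalg.cholesky(cov)
    # If z ~ N(0, I), then chol @ z ~ N(0, cov).
    z = rng.standard_normal((len(x), n_samples))
    return (chol @ z).T

x = np.linspace(0.0, 5.0, 50)
samples = sample_gp_prior(x)  # each row is one random function on the grid
print(samples.shape)
```

Each row of `samples` is one draw from the prior; a shorter `length_scale` gives wigglier functions, a longer one gives smoother functions.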

Posted: **September 14th, 2018, 5:47 pm**

ISayMoo wrote: "What's a Gaussian process prior?"

Bayesian prior for a Gaussian process.

Don't suppose you have an intro for the impatient? Gracias.

Posted: **September 15th, 2018, 9:04 am**

ISayMoo wrote: "What's a Gaussian process prior?"

Bayesian prior for a Gaussian process.

I'm not sure you're right. I think they use the standard machinery of statistical learning for regularisation problems and describe it in some twisted jargon (I cannot read it, sorry). The problem of minimising a penalised loss function in linear regression, splines, ML algos, etc., can generally be phrased as

min (f in H) [L(y, f(x)) + alpha * J(f)],

where x and y are the data, H is the space of candidate functions, L is a loss function, J is the penalty, and alpha is a constant that balances the smoothness of the fit f against its errors (under- vs. overfitting).

In machine learning, J is defined on functions f which live in a reproducing kernel Hilbert space (BTW, the concept was developed by Stanisław Zaremba, one of the greatest Polish mathematicians). The space has properties which enable reducing the infinite-dimensional minimisation problem to a finite-dimensional one.
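A rough sketch of that reduction, assuming squared-error loss and J(f) equal to the squared RKHS norm (i.e. kernel ridge regression, one particular instance of the general problem): the minimiser can be written as f(x) = sum_i c_i k(x, x_i) over the n data points, so only n coefficients have to be found. The kernel choice and the names `fit_kernel_ridge` and `predict` are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0):
    """Squared-exponential kernel between two 1-D point sets (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def fit_kernel_ridge(x_train, y_train, alpha, length_scale=1.0):
    """Minimise sum_i (y_i - f(x_i))^2 + alpha * ||f||_H^2 over the RKHS H.

    With f(x) = sum_i c_i k(x, x_i), the problem becomes finite-dimensional
    and the coefficients solve the linear system (K + alpha * I) c = y.
    """
    K = rbf_kernel(x_train, x_train, length_scale)
    return np.linalg.solve(K + alpha * np.eye(len(x_train)), y_train)

def predict(x_new, x_train, coeffs, length_scale=1.0):
    """Evaluate f(x_new) = sum_i c_i k(x_new, x_i)."""
    return rbf_kernel(x_new, x_train, length_scale) @ coeffs

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 2.0 * np.pi, 20)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(x_train.size)

for alpha in (1e-3, 10.0):
    c = fit_kernel_ridge(x_train, y_train, alpha)
    residual = np.linalg.norm(y_train - predict(x_train, x_train, c))
    print(f"alpha={alpha:g}: training residual norm = {residual:.3f}")
```

A small alpha follows the data closely (risking overfitting); a large alpha gives a smoother fit with a larger training residual.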

The above can be rephrased in the Bayesian framework (which is quite popular in ML from what I can see in Google searches) for f defined as a kernel integral with respect to some Borel measure. The measure can be interpreted as a stochastic process, and in this case it's usually generated by an alpha-stable distribution such as the Gaussian. Putting a prior on it corresponds to putting a prior on the space of f. You can treat it as a prior in posterior inference.
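A minimal sketch of the Gaussian special case only (not the general alpha-stable construction): with a zero-mean GP prior on f and Gaussian observation noise, conditioning on the data gives a posterior mean and covariance in closed form. For the objective sum_i (y_i - f(x_i))^2 + alpha * ||f||_H^2, that posterior mean is the same function as the penalised fit with alpha set to the noise variance. The kernel and the name `gp_posterior` are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0):
    """Squared-exponential kernel with unit prior variance (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_new, noise_var, length_scale=1.0):
    """Posterior mean and covariance of f(x_new) under a zero-mean GP prior."""
    K = rbf_kernel(x_train, x_train, length_scale) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_new, length_scale)   # train vs. new points
    K_ss = rbf_kernel(x_new, x_new, length_scale)    # prior covariance at new points
    # Standard Gaussian conditioning formulas.
    mean = K_s.T @ np.linalg.solve(K, y_train)
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

# Synthetic data, purely for illustration.
x_train = np.array([0.0, 1.0, 2.5, 4.0])
y_train = np.sin(x_train)
x_new = np.linspace(0.0, 5.0, 11)

mean, cov = gp_posterior(x_train, y_train, x_new, noise_var=0.01)
post_var = np.diag(cov)
for xi, m, v in zip(x_new, mean, post_var):
    print(f"x={xi:.1f}  mean={m:+.3f}  var={v:.3f}")
```

What the Bayesian phrasing adds over the penalised fit is the posterior variance: it is small near the observations and grows back towards the prior variance away from them.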

Since in such a condensed form it doesn't sound anything like it's supposed to sound, I can lend you a book on statistical learning.

Posted: **September 15th, 2018, 12:37 pm**

With some effort, the above description can be made into a precise mathematical explanation.

Posted: **September 15th, 2018, 12:53 pm**

Sure! It can even be made into a statistical learning course for quants, £1000 per person per day.

Posted: **September 15th, 2018, 1:19 pm**

Thanks Katastrofa, I didn't know about the connection to RKHS.

Posted: **September 15th, 2018, 1:41 pm**

ISayMoo wrote: Thanks Katastrofa, I didn't know about the connection to RKHS.

Functional Analysis is coming in from the cold. Useful representation theorems.

Posted: **September 15th, 2018, 1:42 pm**

katastrofa wrote: Sure! It can even be made into a statistical learning course for quants, £1000 per person per day.

Is it hands-on?