- Cuchulainn
**Posts:**56663**Joined:****Location:**Amsterdam-
**Contact:**

ISayMoo wrote: Not really. I'm getting slightly tired of this ping-pong to be honest

More ping, less pong.

Cuchulainn wrote: ISayMoo wrote: Not really. I'm getting slightly tired of this ping-pong to be honest

Mostly pong to date, I'm waiting on some ping.

I'm waiting for your comments about probabilistic line search, and for the results of your and Paul's project of teaching the NNs the differential operator.

- Cuchulainn

ISayMoo wrote: Cuchulainn wrote: ISayMoo wrote: Not really. I'm getting slightly tired of this ping-pong to be honest

Mostly pong to date, I'm waiting on some ping.

I'm waiting for your comments about probabilistic line search, and for the results of your and Paul's project of teaching the NNs the differential operator.

I am waiting for your numerical experiments on the first issue. This thread is not a project and there is no project leader.

// I recall your post

OK, so this looks like a reasonably decent paper: Probabilistic Line Searches for Stochastic Optimization

They discuss convergence guarantees briefly in Sec. 3.4.1. Experiments look encouraging, but I'd like to test them on something more challenging.

//

What are your findings, ISM? I have some views, but first yours!

I read the paper and I think it's a decent, well-tested idea. I didn't have the time to run numerical experiments.

- katastrofa
**Posts:**6113**Joined:****Location:**Alpha Centauri

There's also Pang.

He's building self-driving cars now. Let's not talk about Pang.

- Cuchulainn

ISayMoo wrote: I read the paper and I think it's a decent, well-tested idea. I didn't have the time to run numerical experiments.

Assuming you had time, what is the main technical challenge?

1. Code for Cubic spline and BVN (I have them in my new C++ book)

2. Section 3.1 is not clear to me (what's a Gaussian process prior?)

3. Putting it all together

4. Other?

?

"(what's a Gaussian process prior?)"

Bayesian prior for a Gaussian process.

- Cuchulainn

ISayMoo wrote: "(what's a Gaussian process prior?)"

Bayesian prior for a Gaussian process.

Don't suppose you have an intro for the impatient? Gracias.

- katastrofa

ISayMoo wrote: "(what's a Gaussian process prior?)"

Bayesian prior for a Gaussian process.

I'm not sure you're right. I think they use the standard machinery of statistical learning for regularisation problems and describe it in some twisted jargon (I cannot read it, sorry). The problem of minimising a penalised loss function in linear regression, splines, ML algos, etc., can generally be phrased as

min (f in H) [L(y, f(x)) + alpha * J(f)],

where x and y are the data, H is the space of candidate functions, L is a loss function, J is the penalty, and alpha is a constant which balances the smoothness of the fit f against its errors (under- vs. overfitting).
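To make the role of alpha concrete, here is a minimal sketch under assumptions that are not in the post: L is the squared error, f is a degree-9 polynomial, J is the sum of squared coefficients, and the data are synthetic. The function name `fit_penalised` is made up for illustration.

```python
import numpy as np

def fit_penalised(x, y, alpha, degree=9):
    """Minimise sum_i (y_i - f(x_i))^2 + alpha * sum_j w_j^2 over polynomials f."""
    Phi = np.vander(x, degree + 1, increasing=True)   # columns 1, x, ..., x^degree
    A = Phi.T @ Phi + alpha * np.eye(degree + 1)      # normal equations plus penalty
    return np.linalg.solve(A, Phi.T @ y)

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 20)
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal(x.size)  # synthetic noisy data

results = []
for alpha in (1e-6, 1e-2, 1.0):
    w = fit_penalised(x, y, alpha)
    resid = y - np.vander(x, w.size, increasing=True) @ w
    results.append((alpha, float(resid @ resid), float(w @ w)))
    print(f"alpha={alpha:g}  misfit L={results[-1][1]:.4f}  penalty J={results[-1][2]:.4f}")
```

As alpha grows, the misfit goes up and the penalty goes down: small alpha chases the noise (overfitting), large alpha flattens the fit (underfitting).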

In machine learning, J is defined on functions f which live in a reproducing kernel Hilbert space (BTW, the concept was developed by Stanisław Zaremba, one of the greatest Polish mathematicians). The space has properties which enable reducing the infinite-dimensional minimisation problem to a finite-dimensional one.
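The reduction to finite dimensions can be sketched as follows, assuming squared-error loss, J(f) equal to the squared RKHS norm, and a Gaussian (RBF) kernel; none of these choices come from the post. Under those assumptions the minimiser has the form f(x) = sum_i c_i k(x, x_i), so only n coefficients need to be found, one per data point.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Gaussian (RBF) kernel matrix k(a_i, b_j); the length scale is an arbitrary choice."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def objective(c, K, y, alpha):
    """Squared-error loss plus alpha * ||f||_H^2 for f = sum_i c_i k(., x_i)."""
    resid = y - K @ c
    return resid @ resid + alpha * (c @ K @ c)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 15)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(x.size)  # synthetic data
alpha = 0.1

K = rbf_kernel(x, x)
c = np.linalg.solve(K + alpha * np.eye(x.size), y)   # n unknowns, one per data point

x_new = np.array([0.25, 0.5, 0.75])
f_new = rbf_kernel(x_new, x) @ c                      # f(x*) = sum_i c_i k(x*, x_i)
print(f_new)
```

The search over an infinite-dimensional function space collapses to one n-by-n linear solve, which is the practical payoff of working in an RKHS.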

The above can be rephrased in the Bayesian framework (which is quite popular in ML from what I can see in Google searches) for f defined as a kernel integral with respect to some Borel measure. The measure can be interpreted as a stochastic process, and in this case it's usually generated by an alpha-stable distribution like the Gaussian. Putting a prior on it corresponds to putting a prior on the space of f. You can treat it as a prior in posterior inference.
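For the impatient, a rough sketch of what a Gaussian process prior looks like in practice. This is textbook GP regression with a zero mean, an RBF kernel and Gaussian noise, all of which are assumptions here and not necessarily the construction used in the paper. The prior says that function values on any finite grid are jointly Gaussian with covariance given by the kernel; conditioning on data gives the posterior.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Gaussian (RBF) kernel matrix k(a_i, b_j); the length scale is an arbitrary choice."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

rng = np.random.default_rng(1)
x_grid = np.linspace(0.0, 1.0, 50)
K_grid = rbf_kernel(x_grid, x_grid)

# Prior: draw three random functions f ~ N(0, K_grid) via a Cholesky factor.
# The small jitter keeps the factorisation numerically stable.
chol = np.linalg.cholesky(K_grid + 1e-6 * np.eye(x_grid.size))
prior_draws = (chol @ rng.standard_normal((x_grid.size, 3))).T

# Posterior: condition the prior on a few noisy observations (made-up data).
x_train = np.array([0.1, 0.4, 0.7, 0.9])
y_train = np.sin(2.0 * np.pi * x_train)
noise_var = 1e-2

K_train = rbf_kernel(x_train, x_train) + noise_var * np.eye(x_train.size)
K_cross = rbf_kernel(x_grid, x_train)
post_mean = K_cross @ np.linalg.solve(K_train, y_train)
post_cov = K_grid - K_cross @ np.linalg.solve(K_train, K_cross.T)
post_std = np.sqrt(np.clip(np.diag(post_cov), 0.0, None))

print(post_mean[:5], post_std[:5])
```

The posterior mean is a kernel expansion over the training points, with coefficients solved from the kernel matrix plus noise_var times the identity. That is the same shape of solution a kernel penalty with alpha set to noise_var would give, which is the link between the Bayesian and the regularisation views.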

Since in such a condensed form it doesn't sound anything like it's supposed to, I can lend you a book on statistical learning.

- Cuchulainn

With some effort, the above description can be made into a precise mathematical explanation.

- katastrofa

Sure! It can even be made into a statistical learning course for quants, £1000 per person per day.

Thanks Katastrofa, I didn't know about the connection to RKHS.

- Cuchulainn

ISayMoo wrote: Thanks Katastrofa, I didn't know about the connection to RKHS.

Functional Analysis is coming in from the cold. Useful representation theorems.

Last edited by Cuchulainn on September 15th, 2018, 1:52 pm

- Cuchulainn

katastrofa wrote: Sure! It can even be made into a statistical learning course for quants, £1000 per person per day.

Is it hands-on?