SERVING THE QUANTITATIVE FINANCE COMMUNITY


outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

### Re: If you are bored with Deep Networks

Cuchulainn wrote:
outrun wrote:
katastrofa wrote:
I would expect that there are very few minima because of the high dimension, but maybe my intuition is completely wrong...

I have the same feeling: the chance of having zero derivatives in all directions in high dimensions is nearly zero. There is also recent research on that (and older work in physics): https://stats.stackexchange.com/questio ... lue-to-the

Talk is cheap. Prove it.

You first need to prove that that function is a potential loss function of a deep neural network. I'm missing a lot of dimensions for a start! Your NN is stuck in an off-topic dogma. Every method has issues you can find with a quick Google search in 80s textbooks; the question is whether those issues are applicable... and whether you are capable of proving it!

So, prove that deep NNs have lots of relevant local minima that make them behave horribly. Then prove that the set of tools (stochastic gradient descent, dropout, activation functions like SELU, etc.) causes deep NNs to behave horribly. Also, really understand that we have high dimensions and a loss function spanned by a finite set of training examples, and that we want to generalize, not memorize.

This all reminds me of someone saying that B&S is all wrong. Exactly the same behaviour.
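
To make the high-dimension intuition above a bit more concrete, here is a minimal sketch. It reads "zero derivatives in all directions" as "a critical point that curves upward in every direction", and it assumes, purely for illustration, that the Hessian at a critical point looks like a random symmetric Gaussian matrix. That is a toy model from random matrix theory, not the loss surface of any real network, and the helper names are hypothetical.

```python
import random


def random_symmetric(n, rng):
    """Symmetric n x n matrix with Gaussian entries (toy stand-in for a Hessian)."""
    a = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    return [[0.5 * (a[i][j] + a[j][i]) for j in range(n)] for i in range(n)]


def is_positive_definite(m):
    """Attempt a Cholesky factorisation; it succeeds only if m is positive definite."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # a non-positive pivot means some direction curves down
                low[i][i] = s ** 0.5
            else:
                low[i][j] = s / low[j][j]
    return True


def fraction_of_minima(n, trials=5000, seed=0):
    """Fraction of sampled toy Hessians that are positive definite (i.e. minima)."""
    rng = random.Random(seed)
    hits = sum(is_positive_definite(random_symmetric(n, rng)) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 6):
        print(n, fraction_of_minima(n))
```

Under this assumption the fraction of critical points that are true minima drops quickly as the dimension grows, with most of the rest being saddles. Whether real deep-network losses behave like this toy model is exactly what is being argued about in this thread.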

ISayMoo
Topic Author
Posts: 944
Joined: September 30th, 2015, 8:30 pm

### Re: If you are bored with Deep Networks

And what the hell is your point, T4A?

Cuchulainn
Posts: 56690
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: If you are bored with Deep Networks

ISayMoo wrote:

Sure, my knowledge is mostly by osmosis.
Some time ago I received a bunch of books on NNs from the publisher as a nice gesture for reviewing some OO books. One is Freeman and Skapura (1992), which is mathematical and precise (in comparison, Goodfellow et al. feels like a history book), and it discusses NNs in continuous time using ODEs.
The other stuff was just from Wiki.
CTRNNs use systems of ODEs, which I have few issues with, either mathematically or numerically. The question is what their applicability is. BTW vector and matrix(!) ODEs can be solved in Boost.
They say a discrete-time NN is a special case of a CTNN in which the ODE has been replaced by a difference equation.
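
As a rough illustration of that last remark, here is a minimal sketch in Python. It assumes a simplified CTRNN of the form tau * dy/dt = -y + W tanh(y) + I, which is a common textbook shape and not necessarily the one used in Freeman and Skapura; the weights, inputs and function names are hypothetical. Taking one explicit Euler step with step size h = tau turns the ODE into the difference equation y_next = W tanh(y) + I.

```python
import math


def ctrnn_rhs(y, weights, inputs, tau):
    """dy/dt for the assumed toy model: tau * dy/dt = -y + W tanh(y) + I."""
    n = len(y)
    return [
        (-y[i] + sum(weights[i][j] * math.tanh(y[j]) for j in range(n)) + inputs[i]) / tau
        for i in range(n)
    ]


def euler_step(y, weights, inputs, tau, h):
    """One explicit Euler step of size h for the ODE above."""
    dy = ctrnn_rhs(y, weights, inputs, tau)
    return [y[i] + h * dy[i] for i in range(len(y))]


def discrete_rnn_step(y, weights, inputs):
    """Discrete-time update y_next = W tanh(y) + I (a difference equation)."""
    n = len(y)
    return [
        sum(weights[i][j] * math.tanh(y[j]) for j in range(n)) + inputs[i]
        for i in range(n)
    ]


if __name__ == "__main__":
    W = [[0.5, -1.0], [1.0, 0.5]]  # hypothetical weights
    I = [0.1, 0.0]                 # hypothetical constant input
    y0 = [0.2, -0.3]
    # With h == tau the Euler step and the discrete update coincide.
    print(euler_step(y0, W, I, tau=1.0, h=1.0))
    print(discrete_rnn_step(y0, W, I))
```

Smaller steps (h < tau) give a leaky version of the same update, so in this sketch the discrete network is one particular discretisation of the continuous one.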

I heard that JP Aubin (top-class mathematician) has a book (CUP) on continuous-time NNs.

There must be a reason why no one is talking about or publishing CT stuff???

Cuchulainn
Posts: 56690
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: If you are bored with Deep Networks

outrun wrote:
Cuchulainn wrote:
outrun wrote:
I have the same feeling: the chance of having zero derivatives in all directions in high dimensions is nearly zero. There is also recent research on that (and older work in physics): https://stats.stackexchange.com/questio ... lue-to-the

Talk is cheap. Prove it.

You first need to prove that that function is a potential loss function of a deep neural network. I'm missing a lot of dimensions for a start! Your NN is stuck in an off-topic dogma. Every method has issues you can find with a quick Google search in 80s textbooks; the question is whether those issues are applicable... and whether you are capable of proving it!

So, prove that deep NNs have lots of relevant local minima that make them behave horribly. Then prove that the set of tools (stochastic gradient descent, dropout, activation functions like SELU, etc.) causes deep NNs to behave horribly. Also, really understand that we have high dimensions and a loss function spanned by a finite set of training examples, and that we want to generalize, not memorize.

This all reminds me of someone saying that B&S is all wrong. Exactly the same behaviour.

So, I take it you cannot solve this problem using NN? I think there is a fundamental chasm between maths and engineering.

“...the source of all great mathematics is the special case, the concrete example. It is frequent in mathematics that every instance of a concept of seemingly great generality is, in essence, the same as a small and concrete special case.”

-- Paul Halmos

outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

### Re: If you are bored with Deep Networks

Cuchulainn wrote:
outrun wrote:
Cuchulainn wrote:
Talk is cheap. Prove it.

You first need to prove that that function is a potential loss function of a deep neural network. I'm missing a lot of dimensions for a start! Your NN is stuck in an off-topic dogma. Every method has issues you can find with a quick Google search in 80s textbooks; the question is whether those issues are applicable... and whether you are capable of proving it!

So, prove that deep NNs have lots of relevant local minima that make them behave horribly. Then prove that the set of tools (stochastic gradient descent, dropout, activation functions like SELU, etc.) causes deep NNs to behave horribly. Also, really understand that we have high dimensions and a loss function spanned by a finite set of training examples, and that we want to generalize, not memorize.

This all reminds me of someone saying that B&S is all wrong. Exactly the same behaviour.

So, I take it you cannot solve this problem using NN? I think there is a fundamental chasm between maths and engineering.

“...the source of all great mathematics is the special case, the concrete example. It is frequent in mathematics that every instance of a concept of seemingly great generality is, in essence, the same as a small and concrete special case.”

-- Paul Halmos

What do you mean "you can't solve it with NN"? Can you at least specify the problem more clearly? Where does NN come in?

Is your question "can you come up with a NN that has this specific loss function"?

The way I see it, you presented a textbook case that we have all seen at school and which is not applicable. It's an instance of a difficult problem for gradient descent. If I followed your reasoning, I would in return present a convex function and ask you to prove that GD always gets stuck in a local minimum! That's why you get all these reactions. If you want a sensible discussion then we can continue, but this is a waste of time.

E.g., do you know that in DL we change the loss function in every gradient step? Also that we don't know the true loss function(!) because it doesn't exist, and hence work with an approximation instead? We can't have a sensible discussion if you don't know these things.
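
For readers less familiar with the point about the loss changing at every step, here is a minimal sketch of minibatch SGD on a one-dimensional linear regression. The data, the model y = w*x + b and all names are hypothetical and chosen only for illustration. Each update descends the loss of a freshly sampled batch, so the surface being minimised is a different approximation of the (unknown) population loss every time.

```python
import random


def make_data(n=200, seed=0):
    """Synthetic data y = 2x + 1 + noise (hypothetical ground truth)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in xs]


def batch_loss(w, b, batch):
    """Mean squared error of the line w*x + b on one batch."""
    return sum((w * x + b - y) ** 2 for x, y in batch) / len(batch)


def batch_grad(w, b, batch):
    """Gradient of batch_loss with respect to (w, b)."""
    gw = sum(2.0 * (w * x + b - y) * x for x, y in batch) / len(batch)
    gb = sum(2.0 * (w * x + b - y) for x, y in batch) / len(batch)
    return gw, gb


def sgd(data, steps=2000, batch_size=8, lr=0.05, seed=1):
    """Plain minibatch SGD: every step uses the loss of a newly sampled batch."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        batch = rng.sample(data, batch_size)  # a different loss function each step
        gw, gb = batch_grad(w, b, batch)
        w, b = w - lr * gw, b - lr * gb
    return w, b


if __name__ == "__main__":
    data = make_data()
    # Same parameters, two batches, two different loss values.
    print(batch_loss(0.0, 0.0, data[:8]), batch_loss(0.0, 0.0, data[8:16]))
    print(sgd(data))
```

Evaluating `batch_loss` at the same `(w, b)` on two batches gives two different numbers, which is the sense in which the objective is never a single fixed function during training.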

ISayMoo
Topic Author
Posts: 944
Joined: September 30th, 2015, 8:30 pm

### Re: If you are bored with Deep Networks

Cuchulainn wrote:
Rule in mathematics: if you make an algo easy in one aspect, it will become difficult somewhere else. Essential difficulties remain.

No Free Lunch Theorems for Optimization

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: If you are bored with Deep Networks

ISayMoo wrote:
And what the hell is your point, T4A?
Math is not robust in the broader physical sense.

Math may be extremely, even perfectly, provably robust within its closed-world domain of logic, but to the extent that there are any discrepancies between the selected math and the selected physical system, the results of all that math may be logically correct and practically wrong.

Cuchulainn
Posts: 56690
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: If you are bored with Deep Networks

ISayMoo wrote:
And what the hell is your point, T4A?

Math is not robust in the broader physical sense.

Math may be extremely, even perfectly, provably robust within its closed-world domain of logic, but to the extent that there are any discrepancies between the selected math and the selected physical system, the results of all that math may be logically correct and practically wrong.

And that's where our roads diverge. Do you really believe what you have written?

You don't know what you are saying about what maths is and what it is not.

Math is not robust in the broader physical sense.
BS. A wee bit of humility is in order here, T4A.

Meshuggah

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: If you are bored with Deep Networks

Cuchulainn wrote:
ISayMoo wrote:
And what the hell is your point, T4A?

Math is not robust in the broader physical sense.

Math may be extremely, even perfectly, provably robust within its closed-world domain of logic, but to the extent that there are any discrepancies between the selected math and the selected physical system, the results of all that math may be logically correct and practically wrong.

And that's where our roads diverge. Do you really believe what you have written?

You don't know what you are saying about what maths is and what it is not.

Math is not robust in the broader physical sense.
BS. A wee bit of humility is in order here, T4A.

Meshuggah
Yes I do believe that.

It seems like pure arrogance to believe that today's theorized mathematical models of the physical world are correct in the same absolute deductive logical way that a mathematical theorem is correct within some axiomatic system.

We agree that in math there are no 'maybes.' But looking at the history and philosophy of science shows that science is full of 'maybes'. Sure, scientists can rule out certain theories as being inconsistent with the accumulating pot of data but unless one has constructed the entire set of all theories and empirically ruled out every theory but one, there's always more than one theory in the set of maybes. Currently, we don't even have the set of all theories let alone empirical confidence in rejecting every theory but one among the known theories.

There are too many unexplained phenomena and untested predictions to have absolute confidence that any given mathematical system does more than very closely approximate physical reality within the limited energy, time scales, and spatial scales that humans have so far observed.

Yes, a wee bit of humility is in order here. But it is those who have faith in the math du jour that need more humility.

Cuchulainn
Posts: 56690
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: If you are bored with Deep Networks

We agree that in math there are no 'maybes.'
We don't agree. Never mind.

Yes, a wee bit of humility is in order here. But it is those who have faith in the math du jour that need more humility.
The 'math du jour' is gradient descent etc. these days. Unfortunately, a discussion is not possible as the die has been cast.
What I notice is the range of maths in NN is quite limited. I am just the messenger.

AI has a checkered past; what's different now?

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: If you are bored with Deep Networks

Cuchulainn wrote:
We agree that in math there are no 'maybes.'
We don't agree. Never mind.

Yes, a wee bit of humility is in order here. But it is those who have faith in the math du jour that need more humility.
The 'math du jour' is gradient descent etc. these days. Unfortunately, a discussion is not possible as the die has been cast.
What I notice is the range of maths in NN is quite limited. I am just the messenger.

AI has a checkered past; what's different now?
I'm quite confused because not long ago you said 'may' has no place in math so I thought we agreed on that.

As for NNs, I agree with you that they are imperfect and that creating mathematical foundations for NN would be useful.

Where we seem to disagree is whether said foundations are necessary or even possible.

First, I would assert they are not necessary because provable robustness of NN is less important than comparative performance with other adaptive learning systems (including human brains). There's no mathematical proof that human brains are robust and plenty of empirical data that shows they aren't.
We seem to be rapidly reaching the point (if we're not there already) where AI makes a much better automobile driver than a human does. Some humans seem to have crashed many a car! I agree that said AI is not 100% safe and may fail under conditions that would not have fooled a human driver, but it will succeed under many more conditions that do fool human drivers.

But more importantly, I'm skeptical that adaptive learning algorithms (including NNs) can have tractable proven properties beyond some trivial findings. My suspicion is that it does not take much of an NN (or any other category of learning algorithm) to run into some variant of Turing's halting problem or Gödel's incompleteness theorems. Even if the bounded scale of a specific NN guarantees decidability, I look at the complexity of the proof of the four-color map theorem and wonder if the 1 million node NN problem is so complex that no human will ever be able to understand it.

Emerson M. Pugh wrote:
'If our brains were simple enough for us to understand them, we'd be so simple that we couldn't.'

Cuchulainn
Posts: 56690
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: If you are bored with Deep Networks

I'm quite confused because not long ago you said 'may' has no place in math so I thought we agreed on that.

That interpretation is not what is meant.
In maths replace 'may' by 'necessary and/or sufficient' conditions for something to be true. That's not the same thing.

Cuchulainn
Posts: 56690
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: If you are bored with Deep Networks

Reset: do you think we can get the OP back on track? The topic was

What is really impressive is that the new architecture is much more robust to adversarial examples - a bane of deep learning.

The goal is not to extol the virtues of DL but to look over the precipice, yes?

This thread is stuck in a local minimum. Some variation on the subject matter would do no harm. I can't imagine stochastic gradients are the only show in town.

If this is not possible, it might be an idea to start a new thread and keep it focused on the topic.

Cuchulainn
Posts: 56690
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: If you are bored with Deep Networks

ISayMoo wrote:
And what the hell is your point, T4A?

Math is not robust in the broader physical sense.

Math may be extremely, even perfectly, provably robust within its closed-world domain of logic, but to the extent that there are any discrepancies between the selected math and the selected physical system, the results of all that math may be logically correct and practically wrong.

Yeah, what did mathematicians ever do for us?

Cuchulainn
Posts: 56690
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: If you are bored with Deep Networks

Is your question "can you come up with a NN that has this specific loss function"?

No, that's not my question.

My question is how the (stochastic) gradient method performs when computing weights. It is a building block for DL. A valid question is how it performs on test (school??) cases.

https://en.wikipedia.org/wiki/Test_func ... timization

Maybe the question is wrong.

BTW loss function == objective function == error == cost (good to know). Lots of overloaded terms.
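
As one way to try the question above on a "school" test case, here is a minimal sketch of plain (non-stochastic) gradient descent on Himmelblau's function, one of the standard test functions for optimization. The choice of function, step size, iteration count and starting points are all assumptions made for this illustration and were not specified in the thread. The function has four global minima with value zero, so where the iteration ends up depends on where it starts.

```python
def himmelblau(x, y):
    """Himmelblau's function: four global minima, all with value 0."""
    return (x * x + y - 11.0) ** 2 + (x + y * y - 7.0) ** 2


def himmelblau_grad(x, y):
    """Analytic gradient of Himmelblau's function."""
    a = x * x + y - 11.0
    b = x + y * y - 7.0
    return 4.0 * x * a + 2.0 * b, 2.0 * a + 4.0 * y * b


def gradient_descent(x, y, lr=0.002, steps=20000):
    """Fixed-step gradient descent; lr and steps are illustrative choices."""
    for _ in range(steps):
        gx, gy = himmelblau_grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y


if __name__ == "__main__":
    # One hypothetical start in each quadrant.
    for start in [(2.0, 2.0), (-2.0, 2.0), (-2.0, -2.0), (2.0, -2.0)]:
        end = gradient_descent(*start)
        print(start, "->", end, "f =", himmelblau(*end))
```

This only says something about a low-dimensional objective with a fixed, known form. It does not settle how SGD behaves on a high-dimensional loss that is resampled at every step, which is the other half of the argument in this thread.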