User avatar
katastrofa
Posts: 8991
Joined: August 16th, 2007, 5:36 am
Location: Alpha Centauri

Re: If you are bored with Deep Networks

November 11th, 2017, 9:11 pm

Quantum annealers (e.g. D-Wave) find global minima. What deep networks may do, unlike classical algorithms, is find correlations between the positions of the minima (something more complex than e.g. momentum methods) - that's a capability of quantum algorithms. Introducing memory might possibly help too...
 
User avatar
Cuchulainn
Posts: 61624
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: If you are bored with Deep Networks

November 12th, 2017, 9:35 pm

Just one thing: NNs use discrete data structures, yet here and there you see ODEs being used.
Thoughts?

Another source of continuous-nonlinear RNNs arose through a study of adaptive behavior in real time, which led to the derivation of neural networks that form the foundation of most current biological neural network research (Grossberg, 1967, 1968b, 1968c). These laws were discovered in 1957-58 when Grossberg, then a college Freshman, introduced the paradigm of using nonlinear systems of differential equations to model how brain mechanisms can control behavioral functions. The laws were derived from an analysis of how psychological data about human and animal learning can arise in an individual learner adapting autonomously in real time. Apart from the Rockefeller Institute student monograph Grossberg (1964), it took a decade to get them published. 
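For concreteness, a minimal sketch of a continuous-time ("additive model") recurrent network in the spirit of the ODE-based networks described above, i.e. a nonlinear system of ODEs rather than a discrete data structure; the sizes, weights, inputs and the explicit Euler solver below are assumptions for illustration only:

import numpy as np

# dx_i/dt = -a_i * x_i + sum_j w_ij * f(x_j) + I_i
rng = np.random.default_rng(42)
n = 5                                  # number of units (arbitrary)
a = np.ones(n)                         # decay rates
W = 0.5 * rng.standard_normal((n, n))  # recurrent weights
I = rng.standard_normal(n)             # constant external input
f = np.tanh                            # bounded activation

def dxdt(x):
    return -a * x + W @ f(x) + I

# Integrate the continuous model with a simple explicit Euler scheme.
x, dt = np.zeros(n), 0.01
for _ in range(5000):
    x = x + dt * dxdt(x)

print("steady state:", x)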

I feel less nervous with ODEs (more robust) than with Hessians and gradients. But perhaps they are unavoidable?

and ?

The counterpropagation network is a hybrid network, consisting of an outstar network and a competitive filter network. It was developed in 1986 by Robert Hecht-Nielsen. It is guaranteed to find the correct weights, unlike regular backpropagation networks, which can become trapped in local minima during training.

This sounds reasonable but experts' opinions would be welcome.
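For the record, a minimal sketch of a counterpropagation-style network as described in the quote (a competitive/Kohonen layer feeding a Grossberg outstar layer); the toy regression data, unit count and learning rates are assumptions for illustration, not Hecht-Nielsen's exact formulation:

import numpy as np

rng = np.random.default_rng(0)

# Toy data: map 2-D inputs to a 1-D target.
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

n_units = 20             # number of competitive (Kohonen) units
alpha, beta = 0.1, 0.1   # learning rates (assumed)

W_in = rng.uniform(-1.0, 1.0, size=(n_units, 2))  # Kohonen prototypes
W_out = np.zeros(n_units)                         # outstar output weights

for epoch in range(50):
    for x, t in zip(X, y):
        # Competitive phase: the winner is the closest prototype.
        j = np.argmin(np.linalg.norm(W_in - x, axis=1))
        # Move the winning prototype towards the input (unsupervised).
        W_in[j] += alpha * (x - W_in[j])
        # Outstar phase: move the winner's output weight towards the target (supervised).
        W_out[j] += beta * (t - W_out[j])

def predict(x):
    # Piecewise-constant prediction: output weight of the winning unit.
    j = np.argmin(np.linalg.norm(W_in - x, axis=1))
    return W_out[j]

print(predict(np.array([0.3, -0.2])), np.sin(0.3) - 0.1)  # prediction vs true value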
http://www.datasimfinancial.com
http://www.datasim.nl

Every Time We Teach a Child Something, We Keep Him from Inventing It Himself
Jean Piaget
 
User avatar
ISayMoo
Topic Author
Posts: 2293
Joined: September 30th, 2015, 8:30 pm

Re: If you are bored with Deep Networks

November 12th, 2017, 10:32 pm

Links?
 
User avatar
outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

Re: If you are bored with Deep Networks

November 13th, 2017, 6:39 am

I had that book in the 90s; the Kohonen network is a one-hot encoder, very inefficient. At the time I worked on extending it to an m-over-n code and a full 2^n code, using a kernel trick to cast points into high dimensions. Learning was, however, very unstable.

If we have 100 samples drawn from some distribution and some flexible function, how would you fit that function to make it represent the distribution the samples came from? That's the basic statistical view you first need to understand before you can think about the local minima (which are not relevant in practice).

What would e.g. be the best fit in your opinion? Why would you want to fit it at all - why not just keep a lookup table of your samples?
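One way to make the question concrete, as a sketch: fit a kernel density estimate to the 100 samples and judge it by the likelihood it assigns to new samples from the same distribution - something a lookup table of the originals cannot do. The standard-normal data below is an assumption for illustration:

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
samples = rng.standard_normal(100)       # the 100 observed samples
new_samples = rng.standard_normal(100)   # fresh draws from the same distribution

kde = gaussian_kde(samples)              # bandwidth set by Scott's rule (SciPy default)
print("out-of-sample mean log-likelihood:", np.mean(np.log(kde(new_samples))))
print("in-sample mean log-likelihood:    ", np.mean(np.log(kde(samples))))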
 
User avatar
Cuchulainn
Posts: 61624
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: If you are bored with Deep Networks

November 13th, 2017, 3:29 pm

That's the basic statistical view you first need to understand before you can think about the local minima (which are not relevant in practice).

What's wrong with a global minimum? Or is this a defence of the fact that gradient descent only finds local minima?

I have asked this question about 5 times already, but no answer to date. The NN literature suggests local minima are sub-optimal.

regular backpropagation networks, which can become trapped in local minima during training.
What is this saying? That local minima are bad?
http://www.datasimfinancial.com
http://www.datasim.nl

Every Time We Teach a Child Something, We Keep Him from Inventing It Himself
Jean Piaget
 
User avatar
katastrofa
Posts: 8991
Joined: August 16th, 2007, 5:36 am
Location: Alpha Centauri

Re: If you are bored with Deep Networks

November 13th, 2017, 3:58 pm

What's the dimension of a typical NN, ISayMoo? I presume it's high. Ergo, how many minima of either kind can we expect to find there? My other question seems similar to Cuchulainn's: in model selection (which is in a way what NNs do), the "best" model is bad, because it always promotes overfitting. Don't we have the same problem with a global minimum?
 
User avatar
Cuchulainn
Posts: 61624
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: If you are bored with Deep Networks

November 13th, 2017, 4:20 pm

For example, an eggholder function? Not to mention yuge gradients.
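For reference, a minimal sketch of the eggholder test function (as listed on the SFU optimisation test-function pages), usually evaluated on [-512, 512]^2; its many local minima and steep gradients are exactly the difficulty mentioned above:

import numpy as np

def eggholder(x1, x2):
    return (-(x2 + 47.0) * np.sin(np.sqrt(abs(x2 + x1 / 2.0 + 47.0)))
            - x1 * np.sin(np.sqrt(abs(x1 - (x2 + 47.0)))))

# The global minimum is reported near (512, 404.2319), with value about -959.64.
print(eggholder(512.0, 404.2319))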
http://www.datasimfinancial.com
http://www.datasim.nl

Every Time We Teach a Child Something, We Keep Him from Inventing It Himself
Jean Piaget
 
User avatar
katastrofa
Posts: 8991
Joined: August 16th, 2007, 5:36 am
Location: Alpha Centauri

Re: If you are bored with Deep Networks

November 13th, 2017, 4:33 pm

I would expect that there are very few minima because of the high dimension, but maybe my intuition is completely wrong...
 
User avatar
Cuchulainn
Posts: 61624
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: If you are bored with Deep Networks

November 13th, 2017, 4:39 pm

I would expect that there are very few minima because of the high dimension, but maybe my intuition is completely wrong...
It's got lots of minima!
BTW, my DE algo fails on the Griewank function:
https://www.sfu.ca/~ssurjano/griewank.html
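As a point of comparison, a minimal sketch of the Griewank function together with SciPy's differential evolution; the dimension, bounds, seed and tolerance below are assumptions for illustration, not the DE settings referred to above:

import numpy as np
from scipy.optimize import differential_evolution

# Griewank function (see the link above): a quadratic bowl overlaid with
# many regularly spaced local minima; the global minimum is 0 at the origin.
def griewank(x):
    x = np.asarray(x)
    i = np.arange(1, x.size + 1)
    return np.sum(x**2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i))) + 1.0

bounds = [(-600.0, 600.0)] * 10
result = differential_evolution(griewank, bounds, seed=1, tol=1e-8, maxiter=2000)
print(result.x, result.fun)   # ideally close to the origin and to 0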
http://www.datasimfinancial.com
http://www.datasim.nl

Every Time We Teach a Child Something, We Keep Him from Inventing It Himself
Jean Piaget
 
User avatar
outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

Re: If you are bored with Deep Networks

November 13th, 2017, 5:15 pm

I would expect that there are very few minima because of the high dimension, but maybe my intuition is completely wrong...
I have the same feeling; the chance of having zero derivatives in all directions in high dimensions is nearly zero. Also recent research on that (and older work in physics): https://stats.stackexchange.com/questio ... lue-to-the
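A rough numerical illustration of that intuition, under the crude assumption that the Hessian at a critical point looks like a random symmetric matrix: the fraction of such matrices that are positive definite (i.e. the fraction of critical points that would be minima rather than saddles) collapses as the dimension grows:

import numpy as np

rng = np.random.default_rng(0)

def frac_positive_definite(n, trials=2000):
    """Fraction of random symmetric n x n matrices with all eigenvalues > 0."""
    hits = 0
    for _ in range(trials):
        B = rng.standard_normal((n, n))
        H = (B + B.T) / np.sqrt(2.0 * n)   # symmetrise and scale
        if np.all(np.linalg.eigvalsh(H) > 0):
            hits += 1
    return hits / trials

for n in (2, 4, 8, 16):
    print(n, frac_positive_definite(n))    # drops towards zero very quickly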
 
User avatar
outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

Re: If you are bored with Deep Networks

November 13th, 2017, 5:30 pm

The most widely used method is stochastic gradient descent, not full-batch gradient descent. A typical NN is between 10k and 10 million dimensional.

If you fit a mixture density to a small set of points, then the global optimum (of the maximum likelihood) will be a set of Dirac delta functions; it won't likely perform well when evaluated against new samples!
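A minimal sketch of that degeneracy, with an assumed two-component Gaussian mixture and made-up data: centring one component on a single training point and shrinking its width drives the training log-likelihood up without bound (the component tends towards a Dirac delta), while the held-out log-likelihood does not improve:

import numpy as np

rng = np.random.default_rng(0)
train = rng.standard_normal(20)
test = rng.standard_normal(20)

def mixture_loglik(x, mu0, s0, mu1, s1, w=0.5):
    """Log-likelihood of a two-component Gaussian mixture."""
    def pdf(x, mu, s):
        return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
    return np.sum(np.log(w * pdf(x, mu0, s0) + (1.0 - w) * pdf(x, mu1, s1)))

# Component 0: broad, roughly matching the data.
# Component 1: centred on the first training point, with shrinking width s1.
for s1 in (1.0, 1e-2, 1e-4, 1e-6):
    print(s1,
          mixture_loglik(train, 0.0, 1.0, train[0], s1),   # training: grows without bound
          mixture_loglik(test, 0.0, 1.0, train[0], s1))    # held-out: does not improve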
 
User avatar
Cuchulainn
Posts: 61624
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: If you are bored with Deep Networks

November 13th, 2017, 8:37 pm

I would expect that there are very few minima because of the high dimension, but maybe my intuition is completely wrong...
I have the same feeling; the chance of having zero derivatives in all directions in high dimensions is nearly zero. Also recent research on that (and older work in physics): https://stats.stackexchange.com/questio ... lue-to-the
Talk is cheap. Prove it.

Last edited by Cuchulainn on November 13th, 2017, 8:39 pm, edited 1 time in total.
http://www.datasimfinancial.com
http://www.datasim.nl

Every Time We Teach a Child Something, We Keep Him from Inventing It Himself
Jean Piaget
 
User avatar
ISayMoo
Topic Author
Posts: 2293
Joined: September 30th, 2015, 8:30 pm

Re: If you are bored with Deep Networks

November 13th, 2017, 8:37 pm

Quantum annealers (e.g. D-Wave) find global minima. What deep networks may do, unlike classical algorithms, is find correlations between the positions of the minima (something more complex than e.g. momentum methods) - that's a capability of quantum algorithms. Introducing memory might possibly help too...
Some people say it does:
"Our analysis suggests that the convergence issues may be fixed by endowing such algorithms with “long-term memory” of past gradients"
(The criticism I heard of their fix is that it fixes convergence in some cases, but makes it worse in others.)
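That quoted fix sounds like the AMSGrad-style modification of Adam, which keeps a running maximum of the second-moment estimate so the effective step size can never grow back; a minimal sketch, with the hyperparameters and the toy quadratic below assumed for illustration (bias correction omitted):

import numpy as np

def amsgrad_step(theta, grad, state, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad-style update: like Adam, but with v_hat = max(v_hat, v)."""
    m, v, v_hat = state
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad**2
    v_hat = np.maximum(v_hat, v)           # the "long-term memory" of past gradients
    theta = theta - lr * m / (np.sqrt(v_hat) + eps)
    return theta, (m, v, v_hat)

# Toy usage: minimise f(x) = x^2 starting from x = 5.
theta = np.array([5.0])
state = (np.zeros(1), np.zeros(1), np.zeros(1))
for _ in range(5000):
    theta, state = amsgrad_step(theta, 2.0 * theta, state)
print(theta)   # should be close to 0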
 
User avatar
Cuchulainn
Posts: 61624
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: If you are bored with Deep Networks

November 13th, 2017, 8:41 pm

Quantum annealers (e.g. D-Wave) find global minima. What deep networks may do, unlike classical algorithms, is find correlations between the positions of the minima (something more complex than e.g. momentum methods) - that's a capability of quantum algorithms. Introducing memory might possibly help too...
Some people say it does:
"Our analysis suggests that the convergence issues may be fixed by endowing such algorithms with “long-term memory” of past gradients"
(The criticism I heard of their fix is that it fixes convergence in some cases, but makes it worse in others.)
Ergo, the method is not robust.

A rule in mathematics: if you make an algo easy in one aspect, it will become difficult somewhere else. The essential difficulties remain.

Nothing wrong with throwing the kitchen sink at the problem, but it has to be more than fixes if you want to get into one of them robot cars.
http://www.datasimfinancial.com
http://www.datasim.nl

Every Time We Teach a Child Something, We Keep Him from Inventing It Himself
Jean Piaget
 
User avatar
Traden4Alpha
Posts: 23951
Joined: September 20th, 2002, 8:30 pm

Re: If you are bored with Deep Networks

November 14th, 2017, 12:05 am

tl&dr: we don't understand how deep networks learn
Ah, but does it matter?  If the goal is science, then "yes".  If the goal is practical solutions, then "no".
It matters a lot! If you don't understand the mathematical foundations, you're groping around in the dark and progress is very slow. People who want to push AI forward are very keen on understanding the mathematical foundations of NN learning, because only then will we be able to create systems which can "learn to learn", and create AGI (artificial general intelligence).
We don't have true mathematical foundations in any science, only a best-fit-so-far set of theories expressed in math. Admittedly, those best-fit-so-far maths are amazingly accurate. But are they correct? The entire history of science is just a series of discarded "foundations" with no guarantee that today's math is correct.

At best, the only foundation in science is the growing set of observations and experimental outcomes that must be explained by whatever theories du jour are bubbling up. The monotonic growth in data seems to suggest there's monotonic growth in confidence in the math du jour, but there's no guarantee that some observation tomorrow won't totally destroy today's foundation (except as a convenient approximation, like F = ma).
Humans seem perfectly comfortable using their brains despite having no clue how they work.
We do have some clues, and I wouldn't say that we are "perfectly comfortable" with the current state of knowledge about how our (and other animals') brains work - we don't know how to treat depression and other mental disorders, we don't know how to optimally teach and train people, etc.
We know less about brains than we know about neural nets, which isn't surprising in that artificial neural nets are modeled on natural neural nets. Certainly there's no solid mathematical foundation for the brain. We don't know why we must sleep (which seems like an extremely maladaptive thing to do). We don't know how anesthetics and pain killers actually work. And what mathematical foundation predicts the placebo effect? And what the hell are all those glial cells and astrocytes doing?