
 
User avatar
ISayMoo
Posts: 2332
Joined: September 30th, 2015, 8:30 pm

Re: DL and PDEs

November 25th, 2017, 3:02 am

Neural networks are very good at fitting noise. Have you read about these numerical experiments? This is from Zhang et al., "Understanding deep learning requires rethinking generalization" (arXiv:1611.03530):
"Conventional wisdom attributes small generalization error either to properties of the model family, or to the regularization techniques used during training. 

Through extensive systematic experiments, we show how these traditional approaches fail to explain why large neural networks generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points as it usually does in practice."
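A miniature of that experiment, for anyone who wants to see it happen (toy scale, scikit-learn; all the sizes below are made up):

import numpy as np
from sklearn.neural_network import MLPClassifier

# Zhang et al.'s observation in miniature: pure-noise inputs, random labels,
# and an over-parameterised net still drives training error to ~zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))        # unstructured random noise "images"
y = rng.integers(0, 2, size=200)      # random labels: nothing to generalise
net = MLPClassifier(hidden_layer_sizes=(512,), max_iter=5000, random_state=0)
net.fit(X, y)
print(net.score(X, y))                # ~1.0 on the training set: memorised

Generalisation is impossible here by construction; the point is that capacity alone lets the net fit anything.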

About saddle points:
https://arxiv.org/abs/1710.07406
http://www.offconvex.org/2017/07/19/saddle-efficiency/
 
User avatar
jclune
Posts: 1
Joined: July 10th, 2013, 2:43 pm

Re: DL and PDEs

November 27th, 2017, 9:08 pm

Softmax is used to map the elements of a vector to values that can be interpreted as class probabilities. E.g. v = [2.0, -1.0, 0.5] -> p = [0.786, 0.039, 0.175], i.e. all numbers are in the range [0, 1] and they sum to 1.

Just like you use S = e^X in the lognormal model to force your model to produce stock prices that are positive.

Softmax is non-linear, so it's not affine.
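A minimal sketch of the mapping (NumPy; the example vector is arbitrary):

import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))   # subtract the max for numerical stability
    return e / e.sum()

print(softmax(np.array([2.0, -1.0, 0.5])))   # -> [0.786 0.039 0.175]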
Hello! I have more experience with ML and DL than with trading, so excuse my beginner question, but softmax can be used in the last layer for class probabilities [P(up), P(down)] or [P(buy), P(hold), P(sell)], right? I currently build models without softmax, so the output is the predicted return at t+1. The loss function is trivial and it learns very accurate predictions. After that, I could simply choose static thresholds on the predicted return to map it to (buy, hold, sell) signals and run simulations, but that doesn't seem optimal...
Can anyone please share an example of an end-to-end model that outputs (buy, hold, sell)? I can imagine trivial labels for (buy, sell) based on profit, but I'm not sure how to define (buy, hold, sell) without being subjective. Rather than an end-to-end model, does anyone know a good way to map from predicted return to (buy, hold, sell) in general? Thanks so much!
 
User avatar
outrun
Posts: 4573
Joined: January 1st, 1970, 12:00 am

Re: DL and PDEs

November 27th, 2017, 9:34 pm

You can, but you shouldn't approach it as a class-probabilities-and-predictions type of problem. You need to factor in the risk of being wrong and the uncertainty about the state the market is in.

So instead, attack it with reinforcement learning (RL): learn optimal trading behaviour while being half-informed, without going through an intermediate prediction model. Realistic order-execution simulation during training is also an important aspect, and a challenge in itself.

If RL is new to you, then a good start is Wikipedia, https://en.wikipedia.org/wiki/Reinforcement_learning, where terminology like value function and policy is explained. That will help you work through papers like the famous DQN paper (about learning to play Atari games) and this later A3C one: https://arxiv.org/abs/1602.01783
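To make the terminology concrete, a toy value-learning loop on a made-up coin-flip market (one state, no edge by construction; purely illustrative, nothing like a realistic execution simulator):

import random

ACTIONS = ["buy", "hold", "sell"]

def reward(action, move):
    # buy profits when the market moves up (+1), sell when it moves down (-1)
    return {"buy": move, "hold": 0.0, "sell": -move}[action]

random.seed(0)
Q = {a: 0.0 for a in ACTIONS}   # action values (a single market state)
alpha, eps = 0.1, 0.1           # learning rate, exploration rate
for t in range(10000):
    a = random.choice(ACTIONS) if random.random() < eps else max(Q, key=Q.get)
    move = random.choice([1, -1])            # hypothetical fair coin flip
    Q[a] += alpha * (reward(a, move) - Q[a])
print(Q)  # all near zero: with no information there is nothing to exploit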
 
User avatar
ISayMoo
Posts: 2332
Joined: September 30th, 2015, 8:30 pm

Re: DL and PDEs

November 27th, 2017, 10:42 pm

I've seen (modestly) profitable trading models which were basically binary classifiers (for liquidity removal).
 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: DL and PDEs

November 28th, 2017, 10:18 am

About saddle points:
https://arxiv.org/abs/1710.07406
I am not sure what this blog is trying to say, but it looks like they are using material from other (unreferenced) sources. Anyway, I think the approach is similar to homotopy methods for optimisation (I used homotopy for nonlinear systems a while back in semiconductor simulation; it avoids many well-known and recurring problems with 'cookbook' methods).

http://www.sandia.gov/~dmdunla/publicat ... 5-7495.pdf
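For intuition, a toy continuation sketch (my own example, not the method in that report): deform an easy convex problem into the hard one and track the minimiser as you go.

import numpy as np

# minimise f(x) = x^4 - 4x^2 + x (non-convex, two wells) by following
# h(x,t) = (1-t)*g(x) + t*f(x) from the easy g(x) = x^2 as t: 0 -> 1
f_grad = lambda x: 4 * x**3 - 8 * x + 1
g_grad = lambda x: 2 * x

def descend(grad, x, steps=500, lr=0.01):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

x = 0.0
for t in np.linspace(0.0, 1.0, 11):
    x = descend(lambda z, t=t: (1 - t) * g_grad(z) + t * f_grad(z), x)
print(x)  # ends near the deeper well at x ~ -1.47, not the shallow one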

I tried an ever-popular search 'homotopy for neural...' and it throws up a yuge amount of stuff like this (as early as 1992):

http://lib.dr.iastate.edu/cgi/viewconte ... ntext=qnde

Thinking out loud: don't you really need interval arithmetic (to handle errors in the input measurements)?
 
User avatar
ISayMoo
Posts: 2332
Joined: September 30th, 2015, 8:30 pm

Re: DL and PDEs

November 28th, 2017, 10:44 am

 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: DL and PDEs

November 28th, 2017, 3:10 pm

I've seen (modestly) profitable trading models which were basically binary classifiers (for liquidity removal).
NFL (No Free Lunch) in action?
 
User avatar
ISayMoo
Posts: 2332
Joined: September 30th, 2015, 8:30 pm

Re: DL and PDEs

November 28th, 2017, 3:56 pm

No footballers were involved.
 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: DL and PDEs

November 28th, 2017, 7:02 pm

General question: I reckon you can build your own NN library to do things instead of, or as well as, going through a Python front end? These days you have yuge multi-core machines, so parallelisation is a non-issue.
And there is only a finite set of building blocks to assemble and integrate.
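Indeed, here is a minimal version of one such block, a dense layer with forward and backward passes (my own toy NumPy sketch, not any particular library's design):

import numpy as np

class Dense:
    # fully-connected layer: forward pass plus gradient/update in ~15 lines
    def __init__(self, n_in, n_out, rng):
        self.W = rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_out))
        self.b = np.zeros(n_out)
    def forward(self, x):
        self.x = x                       # cache the input for backprop
        return x @ self.W + self.b
    def backward(self, grad_out, lr=0.01):
        grad_in = grad_out @ self.W.T    # gradient w.r.t. this layer's input
        self.W -= lr * (self.x.T @ grad_out)
        self.b -= lr * grad_out.sum(axis=0)
        return grad_in

layer = Dense(4, 3, np.random.default_rng(0))
out = layer.forward(np.ones((2, 4)))     # batch of 2 inputs, 4 features each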

What about data storage?
 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: DL and PDEs

December 5th, 2017, 10:04 am

Computational Neuroscience: What is the difference between modelling a network of neurons using differential equations and using linear matrix algebra?

Paul King, fmr UC Berkeley Redwood Center for Theoretical Neuroscience

When people use differential equations, they are interested in dynamics. When they use linear algebra, they are interested in representational structure.

Differential equations are ideal for modeling things like the evolution of continuous variables over time and also systems with feedback. They are often used to model the internals of a single neuron.

Linear algebra is ideal at modeling probabilities and statistics across a large number of identical variables, like a bunch of neurons in a network, a set of connection weights, or abstract probabilities that may not correspond directly to neurons.

Unfortunately, the spikes generated by real neurons throw a wrench into both types of models. And feedback networks with spikes, such as are found in the brain, are extraordinarily unwieldy. A new type of math may be needed to handle these types of networks, but for now people make do with linear algebra.
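To make the ODE half concrete: the textbook leaky integrate-and-fire neuron, integrated with explicit Euler (the constants below are illustrative, not fitted to anything):

# tau * dV/dt = -(V - V_rest) + R*I(t); spike and reset at the threshold
tau, V_rest, V_th, V_reset, R = 10.0, -65.0, -50.0, -65.0, 1.0   # ms, mV
dt, T, I = 0.1, 100.0, 20.0
V, spike_times = V_rest, []
for n in range(int(T / dt)):
    V += (dt / tau) * (-(V - V_rest) + R * I)   # explicit Euler step
    if V >= V_th:
        spike_times.append(n * dt)
        V = V_reset                             # fire and reset
print(len(spike_times), "spikes in", T, "ms")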
 
User avatar
lovenatalya
Posts: 187
Joined: December 10th, 2013, 5:54 pm

Re: DL and PDEs

April 8th, 2018, 3:31 am

1. NNs are typically 10k- to 10M-dimensional functions. DE (differential evolution) would be very slow: you would need at least twice as many agents as dimensions, and in practice many more, or else you would be searching in a subspace. Each agent represents an instance of the network, so memory consumption and the computations on it are big.
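For concreteness, a minimal differential evolution loop (the classic DE/rand/1/bin scheme; population size and constants are illustrative). Each agent is a full parameter vector, which is exactly the memory point above:

import numpy as np

def de_minimize(f, dim, pop_size, iters=300, F=0.8, CR=0.9, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-2, 2, (pop_size, dim))   # one row per agent
    fit = np.array([f(x) for x in pop])
    for _ in range(iters):
        for i in range(pop_size):
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            trial = np.where(rng.random(dim) < CR, a + F * (b - c), pop[i])
            f_trial = f(trial)
            if f_trial < fit[i]:                # greedy selection
                pop[i], fit[i] = trial, f_trial
    return pop[fit.argmin()], fit.min()

# toy 20-d sphere; for a 1M-parameter net, pop alone is pop_size x 1e6 floats
best, best_f = de_minimize(lambda x: float(np.sum(x * x)), dim=20, pop_size=40)
print(best_f)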


I don't get this answer, not at all. Are you saying finding a local minimum is OK?

1. In every branch, we desire a global minimum >> local minimum, even in DL (have a look at Goodfellow, figure 4.3, page 81).
You don't want a global minimum on the training set, because that would be overfitting: hence early stopping (a toy illustration follows this exchange).
2. Gradient descent methods find local minima, not necessarily global ones. Is that serious?
Yes.
3. Overfitting is caused by high-order polynomials (yes?). I don't see what the relationship is with finding minima.
4. More evidence is needed on "how slow" DE is (the good news is that it always gives a global minimum).
5. The Cybenko universal approximation theorem seems to have little coupling to anything in mainstream numerical approximation. Maybe it is not necessary, but then say so. Borel measures and numerical accuracy are not a good mix IMO.

Mathematically, it feels like this approach is not even wrong..
Your points feel the same way to me too :) more precision would be welcome :)
Good. At the end of the day we still have the laws of (numerical) mathematics .. it's like gravity.

Well spotted, I took it on faith that they have a deep network somewhere, but apparently they don't. Lying rascals!
As (Alexander) Pope said, "A little learning is a dangerous thing". IMO it takes [3.6] years to really learn Galerkin++.
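The early-stopping point in miniature (my own toy: gradient descent on an over-parameterised polynomial fit; watch the validation loss, not the training loss):

import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 40)
y = 2 * x + rng.normal(0, 0.3, 40)     # noisy linear data
X = np.vander(x, 10)                   # degree-9 model: plenty of room to overfit
Xtr, ytr, Xva, yva = X[:30], y[:30], X[30:], y[30:]

w = np.zeros(10)
best, best_w, wait, patience = np.inf, w.copy(), 0, 500
for epoch in range(20000):
    w -= 0.01 * Xtr.T @ (Xtr @ w - ytr) / len(ytr)   # training step
    val = np.mean((Xva @ w - yva) ** 2)              # held-out loss
    if val < best - 1e-6:
        best, best_w, wait = val, w.copy(), 0        # improved: keep going
    else:
        wait += 1
        if wait >= patience:
            break            # stop well before the training-set minimum
print(epoch, best)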

History of FEM 

http://www.siam.org/books/fr26/FEMB-1-0 ... uction.pdf



BTW the first FEM maths book was by Strang + Fix, the former being my academic grandfather. We were exposed to FEM in second-year undergrad.
@Cuchulainn:

What is your opinion on this PDE and SPDE paper?
 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: DL and PDEs

April 8th, 2018, 12:50 pm

It's a very ambitious article..
 
User avatar
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

Re: DL and PDEs

April 8th, 2018, 1:13 pm

Computational Neuroscience: What is the difference between modelling a network of neurons using differential equations and using linear matrix algebra?

Paul King, fmr UC Berkeley Redwood Center for Theoretical Neuroscience

When people use differential equations, they are interested in dynamics. When they use linear algebra, they are interested in representational structure.

Differential equations are ideal for modeling things like the evolution of continuous variables over time and also systems with feedback. They are often used to model the internals of a single neuron.

Linear algebra is ideal at modeling probabilities and statistics across a large number of identical variables, like a bunch of neurons in a network, a set of connection weights, or abstract probabilities that may not correspond directly to neurons.

Unfortunately, the spikes generated by real neurons throw a wrench into both types of models. And feedback networks with spikes, such as are found in the brain, are extraordinarily unwieldy. A new type of math may be needed to handle these types of networks, but for now people make do with linear algebra.
Saying that one must accurately model the spikes in a neuron to understand/replicate intelligence may be like saying one must accurately model the flapping of birds wings to understand/replicate flight. In both cases, the short-term dynamics may be less important than the more static structural elements of the system.
 
User avatar
Cuchulainn
Posts: 20254
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: DL and PDEs

April 8th, 2018, 3:48 pm

Computational Neuroscience: What is the difference between modelling a network of neurons using differential equations and using linear matrix algebra?

Paul King, fmr UC Berkeley Redwood Center for Theoretical Neuroscience
Saying that one must accurately model the spikes in a neuron to understand/replicate intelligence may be like saying one must accurately model the flapping of birds wings to understand/replicate flight.  In both cases, the short-term dynamics may be less important than the more static structural elements of the system.
And sometimes the short-term behaviour in the boundary/initial layer (e.g. in fluid dynamics) needs to be modelled. These are called stiff ODEs. The problem determines which solution to adopt..
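A classic toy stiff ODE to illustrate (the numbers are just for show): y' = -1000 (y - cos t). The fast initial layer forces explicit Euler to take tiny steps for stability; implicit (backward) Euler does not care:

import math

lam, dt, T = 1000.0, 0.01, 1.0     # dt is far too big for explicit Euler
def integrate(implicit):
    y, t = 0.0, 0.0
    while t < T:
        if implicit:   # backward Euler; closed form since the ODE is linear
            y = (y + dt * lam * math.cos(t + dt)) / (1.0 + dt * lam)
        else:          # forward Euler: amplification factor 1 - lam*dt = -9
            y = y + dt * (-lam * (y - math.cos(t)))
        t += dt
    return y

print(integrate(True))    # ~cos(1) = 0.54: fine despite the huge step
print(integrate(False))   # astronomically large: instability, not inaccuracy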
 
User avatar
Traden4Alpha
Posts: 3300
Joined: September 20th, 2002, 8:30 pm

Re: DL and PDEs

April 8th, 2018, 5:13 pm

Computational Neuroscience: What is the difference between modelling a network of neurons using differential equations and using linear matrix algebra?

Paul King, fmr UC Berkeley Redwood Center for Theoretical Neuroscience
Saying that one must accurately model the spikes in a neuron to understand/replicate intelligence may be like saying one must accurately model the flapping of birds wings to understand/replicate flight.  In both cases, the short-term dynamics may be less important than the more static structural elements of the system.
And sometimes the short-term behaviour in the boundary/initial layer (e.g. in fluid dynamics) needs to be modelled. These are called stiff ODEs. The problem determines which solution to adopt..
Absolutely! But is it needed in this case? It's always a challenge to determine the correct level of abstraction and to model the relevant parts of the system.

(Note that even the spike model is an approximate high-level abstraction of the true physical phenomenon, which is driven by the dynamics of tens of thousands of individual ion channels embedded in the neuron's membrane, responding to neurotransmitters and to the voltage effects induced by the opening of neighboring ion channels. And then there's the next level: the model of each ion channel as a polypeptide molecule of a few hundred thousand daltons that changes shape under various influences.)