
JohnLeM
Posts: 376
Joined: September 16th, 2008, 7:15 pm

### Re: If you are bored with Deep Networks

Test code
// Interval strategy on (0,inf):
//
// 1. Truncation to a large finite T
// 2. Transform (0,inf) to (0,1)
//
// Option 2 tends to give more accurate results.
//
// (C) Datasim Education BV 2020
//

#include <iostream>

#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
namespace ublas = boost::numeric::ublas;

#include <boost/numeric/odeint.hpp>
namespace Bode = boost::numeric::odeint;

using value_type = double;
using state_type = ublas::vector<value_type>;

int main()
{

// 2X2 matrices
//
// A1 == symmetric and positive definite (pd); the variants
// A2 (symmetric, not pd), A3 (not symmetric, pd) and
// A4 (not symmetric, not pd) can be substituted for experiments
std::size_t r = 2; std::size_t c = 2;

ublas::matrix<double> A1(r, c);
A1(0, 0) = 2; A1(0, 1) = 1; A1(1, 0) = 1; A1(1, 1) = 2;

ublas::vector<double> b1(r);
b1[0] = 4;  b1[1] = 5;

// Gradient flow dx/dt = b1 - A1 x, whose steady state solves A1 x = b1.
// The time transform t = s/(1-s) maps t in (0,inf) onto s in (0,1):
// dx/ds = (b1 - A1 x) / (1-s)^2
auto ode = [&](const state_type& x, state_type& dxds, double s)
{
    double jacobian = 1.0 / ((1.0 - s) * (1.0 - s));
    dxds = jacobian * (b1 - ublas::prod(A1, x));
};

// ODE solver, x = (1 2) is the solution in all cases
state_type x(r); x[0] = x[1] = 0.0;
state_type x2(r); x2[0] = x2[1] = 0.0;

// Integrate on [L,T]; the transform is singular at s = 1, so take T < 1
// EXX. Try T = 0.1, 0.25, 0.5, 0.75, 0.95, 0.9999 etc.
double L = 0.0; double T = 0.99584959825795;
double dt = 1.0e-5;

// Cash Karp (Ford Cortina); note integrate() actually defaults to
// the controlled Dormand-Prince 5 stepper
std::size_t steps = Bode::integrate(ode, x, L, T, dt);

std::cout << "Number of steps " << steps << '\n';
std::cout << "Solution " << x << '\n';

// Bulirsch-Stoer stepper
Bode::bulirsch_stoer<state_type, value_type> bsStepper;

std::size_t steps3 = Bode::integrate_adaptive(bsStepper, ode, x2, L, T, dt);

std::cout << "Number of steps, Bulirsch-Stoer " << steps3 << '\n';
std::cout << "Solution II " << x2 << '\n';

return 0;
}
Thanks for sharing the boost::odeint example; I was not aware of this Boost library.
By the way, reading the posts, I am not sure I understand your point. Isn't SGD an optimization method, like all those in boost::odeint? Isn't this discussion equivalent to comparing, for instance, a Godunov scheme to a Glimm scheme in the numerical analysis of PDEs?

Cuchulainn
Posts: 61540
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: If you are bored with Deep Networks

By the way, reading the posts, I am not sure I understand your point. Isn't SGD an optimization method, like all those in boost::odeint?

Boost odeint is a generic ODE solver...(?) It can be used to solve optimisation problems via dynamical systems, as my example above shows.
The authors of odeint have examples of Hamiltonian systems, which are a special case of dynamical systems.

Isn't this discussion equivalent to comparing, for instance, a Godunov scheme to a Glimm scheme in the numerical analysis of PDEs?

In which aspects is it equivalent? AFAIK Godunov is for conservation laws(?) I don't see much overlap.
http://www.datasimfinancial.com
http://www.datasim.nl

Every Time We Teach a Child Something, We Keep Him from Inventing It Himself
Jean Piaget

JohnLeM
Posts: 376
Joined: September 16th, 2008, 7:15 pm

### Re: If you are bored with Deep Networks

By the way, reading the posts, I am not sure I understand your point. Isn't SGD an optimization method, like all those in boost::odeint?

Boost odeint is a generic ODE solver...(?) It can be used to solve optimisation problems via dynamical systems, as my example above shows.
The authors of odeint have examples of Hamiltonian systems, which are a special case of dynamical systems.

Isn't this discussion equivalent to comparing, for instance, a Godunov scheme to a Glimm scheme in the numerical analysis of PDEs?

In which aspects is it equivalent? AFAIK Godunov is for conservation laws(?) I don't see much overlap.
You are right: Godunov schemes, like Glimm schemes, are numerical methods designed for a particular class of problems, conservation laws, which can be described by Hamiltonians. For me, the connection with the ODE minimization problem is that both methods are basically gradient-descent algorithms, i.e. energy-minimization based (more precisely, entropy-based), used to design an ODE, that is a dynamical system, consistent with the conservation law under study. Both amount to solving optimization problems. The difference is that:

- Godunov schemes are gradient-based minimization methods, as are a number of other finite-difference-type methods; they try to minimize the total system entropy.
- Glimm schemes are also gradient-minimization methods, but they use random selection to achieve this.

To try to clarify the connection: Glimm schemes are stochastic-gradient-descent-based algorithms, while Godunov corresponds to a particular finite-difference-based minimization method. I guess I could use Boost odeint to implement both resulting numerical schemes?

Cuchulainn
Posts: 61540
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: If you are bored with Deep Networks

A Review on Neural Network Models of Schizophrenia and Autism Spectrum Disorder

https://arxiv.org/pdf/1906.10015.pdf

ISayMoo
Topic Author
Posts: 2292
Joined: September 30th, 2015, 8:30 pm

### Re: If you are bored with Deep Networks

Learning guarantees for Stochastic Gradient Descent

In a wide range of problems, SGD is superior to GD.

Cuchulainn
Posts: 61540
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: If you are bored with Deep Networks


Cuchulainn
Posts: 61540
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

### Re: If you are bored with Deep Networks

Despite some evidence for top-down connections in the brain, there does not appear to be a global objective that is optimized by backpropagating error signals. Instead, the biological brain is highly modular and learns predominantly based on local information.

https://arxiv.org/pdf/1905.11786.pdf

In addition to lacking a natural counterpart, the supervised training of neural networks with end-to-end backpropagation suffers from practical disadvantages as well. Supervised learning requires labeled inputs, which are expensive to obtain. As a result, it is not applicable to the majority of available data, and suffers from a higher risk of overfitting, as the number of parameters required for a deep model often exceeds the number of labeled datapoints at hand. At the same time, end-to-end backpropagation creates a substantial memory overhead in a naïve implementation, as the entire computational graph, including all parameters, activations and gradients, needs to fit in a processing unit’s working memory. Current approaches to prevent this require either the recomputation of intermediate outputs [Salimans and Bulatov, 2017] or expensive reversible layers [Jacobsen et al., 2018]. This inhibits the application of deep learning models to high-dimensional input data that surpass current memory constraints. This problem is perpetuated as end-to-end training does not allow for an exact way of asynchronously optimizing individual layers [Jaderberg et al., 2017]. In a globally optimized network, every layer needs to wait for its predecessors to provide its inputs, as well as for its successors to provide gradients.