As I said, all this stuff is Euler method (duh).

Here is the first step to solve an ODE (is AD the way to do it here??)

https://arxiv.org/pdf/1806.07366.pdf

Some weekend reading.
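A minimal sketch of one way to read the "Euler method" remark, assuming it refers to plain gradient descent: a gradient descent step with learning rate h is the same update as one explicit Euler step of size h applied to the gradient-flow ODE x'(t) = -grad f(x). The objective and the names `grad_f`, `euler_step` and `gradient_descent_step` are illustrative only and do not come from the linked paper.

```python
def grad_f(x):
    """Gradient of the illustrative objective f(x) = 1.5 * x**2."""
    return 3.0 * x

def euler_step(x, h):
    """One explicit Euler step of size h for the gradient flow x'(t) = -grad_f(x)."""
    return x + h * (-grad_f(x))

def gradient_descent_step(x, learning_rate):
    """One plain gradient descent step."""
    return x - learning_rate * grad_f(x)

# Both iterations produce the same sequence when h equals the learning rate.
x_euler = x_gd = 1.0
for _ in range(50):
    x_euler = euler_step(x_euler, 0.1)
    x_gd = gradient_descent_step(x_gd, 0.1)

# Explicit Euler is only stable for this problem when h < 2/3,
# so a larger "learning rate" makes the iterates grow instead of decay.
x_unstable = 1.0
for _ in range(50):
    x_unstable = gradient_descent_step(x_unstable, 0.7)
```

Seen this way, the learning rate inherits the stability restriction of the explicit Euler scheme.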

- Cuchulainn
**Posts:**59013**Joined:****Location:**Amsterdam-
**Contact:**


It only looks that way. You can prove convergence of FEM schemes in weighted Sobolev spaces but 1) it takes longer 2) someone has to pay.

Well, I thought it was a joke. And quite a good one. Isn’t much of numerical analysis and optimization like that? Rather arbitrary. One of the many reasons I don’t like all these subjects.

It's a mystery.

Just to be clear: I was joking. The point is that people choose a heuristic learning rate and some value ranges "usually work". But everyone knows that this is not how things should be done, and the theory of optimisation in statistical learning is an active research field.

The numerical methods used for AI are outdated to a greater or lesser extent.

- Cuchulainn

If you study the book by Nocedal and Wright (BTW I have) you will see that there are more nuanced approaches as well.

Part of the documentation of the routine LBFGS by J. Nocedal (one of the giants in the field of nonlinear numerical optimization):

C GTOL is a DOUBLE PRECISION variable with default value 0.9, which
C controls the accuracy of the line search routine MCSRCH. If the
C function and gradient evaluations are inexpensive with respect
C to the cost of the iteration (which is sometimes the case when
C solving very large problems) it may be advantageous to set GTOL
C to a small value. A typical small value is 0.1. Restriction:
C GTOL should be greater than 1.D-04.

Why 0.9? Why 0.1? Why greater than 1e-4? You can shoot off the same questions at LBFGS as the ones Cuch throws at SGD. SGD doesn't have an automated way of setting the learning rate, so it's "dumb". Methods like LBFGS contain an algorithm to automatically set the learning rate, but these algorithms in 99 cases out of 100 contain some hyper-parameters which you either adjust to every problem or set to some "typical" value. If you're lucky, there's a theorem which tells you the bounds in which you have to fit. But because this is hidden somewhere in the bowels of an ancient Fortran library, people naively think it "just works". Just like SGD, it works until it doesn't. There's no magical way around the problem that if you're optimising a function based on point estimates, you have a learning problem and the no-free-lunch theorem comes down on you like a ton of bricks.
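To make the GTOL discussion concrete, here is a hedged sketch of the strong Wolfe conditions that a line search of this kind typically enforces. It assumes that GTOL plays the role of the curvature constant c2 and that 1e-4 is the sufficient-decrease constant c1 (theory requires 0 < c1 < c2 < 1, which would explain the restriction). The function name `satisfies_strong_wolfe` and the test problem are made up for illustration; this is not the MCSRCH code.

```python
def satisfies_strong_wolfe(phi, dphi, alpha, c1=1e-4, c2=0.9):
    """Check the strong Wolfe conditions for a trial step length alpha.

    phi(a)  = f(x + a * p) is the objective along the search direction p,
    dphi(a) = its derivative with respect to a.
    c1 is the sufficient-decrease constant; c2 is the curvature constant
    (assumed here to correspond to GTOL).
    """
    sufficient_decrease = phi(alpha) <= phi(0.0) + c1 * alpha * dphi(0.0)
    curvature = abs(dphi(alpha)) <= c2 * abs(dphi(0.0))
    return sufficient_decrease and curvature

# Illustrative line function: f(x) = 0.5 * x**2 at x = 2 along p = -2,
# so phi(a) = 2 * (1 - a)**2 with exact minimiser a = 1.
phi = lambda a: 2.0 * (1.0 - a) ** 2
dphi = lambda a: -4.0 * (1.0 - a)

loose = satisfies_strong_wolfe(phi, dphi, alpha=0.5, c2=0.9)  # accepted
tight = satisfies_strong_wolfe(phi, dphi, alpha=0.5, c2=0.1)  # rejected
exact = satisfies_strong_wolfe(phi, dphi, alpha=1.0, c2=0.1)  # accepted
```

A smaller c2 forces the search closer to the exact minimiser along the direction, at the cost of more function and gradient evaluations. That is the trade-off the quoted comment describes.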

I know this book, I implemented some algorithms from it. Can you tell me what general purpose non-linear optimisation methods described in it are fully automatic and have no user-adjustable parameters?

If you study the book by Nocedal and Wright (BTW I have) you will see that there are more nuanced approaches as well.

- Cuchulainn

I would say the backtracking discrete methods (e.g. page 37 etc.). The continuous analogues (ODE solvers) have all this built in, so as 'user' you just give a tolerance and the solver does the rest instead of having to choose from a palette of learning rates. Backtracking is well-established in numerical analysis.

But if by 'user' you mean something else that's another discussion.

Last edited by Cuchulainn on December 17th, 2018, 3:18 pm, edited 1 time in total.

- Cuchulainn

“"Neural networks" are a sad misnomer. They're neither neural nor even networks. They're chains of differentiable, parameterized geometric functions, trained with gradient descent (with gradients obtained via the chain rule). A small set of highschool-level ideas put together.” — **François Chollet**

https://en.wikipedia.org/wiki/Keras


- Cuchulainn

Sounds defeatist.

In mathematics, conditions are stated under which an algorithm works for a given class of problems. It will not work when the problem does not satisfy the assumptions. For example,

[$]x_{k+1} = f(x_{k}), \quad k = 0, 1, 2, \dots[$] is only guaranteed to converge when [$]f[$] is a contraction.
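A small sketch of the fixed-point iteration above, for illustration only. The helper name `fixed_point`, the tolerance and the two example maps are assumptions: cos is a contraction on [-1, 1] (its derivative is bounded by sin 1 < 1 there), while x -> 2x is not a contraction anywhere.

```python
import math

def fixed_point(f, x0, tol=1e-12, max_iter=1000):
    """Iterate x_{k+1} = f(x_k) until successive iterates differ by less than tol."""
    x = x0
    for k in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, k + 1
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Contraction: the iterates settle on the unique solution of x = cos(x).
root, iterations = fixed_point(math.cos, 1.0)

# Not a contraction: the iterates double at every step and never settle.
diverged = False
try:
    fixed_point(lambda x: 2.0 * x, 1.0, max_iter=50)
except RuntimeError:
    diverged = True
```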

I see it as: for what class of problems does SGD work? I don't think this has been done, at least not in the spate of articles to date. For example, SGD only finds a local minimum. And then SGD for constrained optimisation is a Pandora's box, yes? It gets very fuzzy-fuzzy?

ODE solvers take into account the problem they are trying to solve and adjust parameters accordingly.
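As a rough illustration of what "adjust parameters accordingly" can mean, here is a toy adaptive integrator built on an embedded Euler/Heun pair. It is a sketch under stated assumptions, not any particular library's solver: the name `adaptive_heun_euler`, the safety factor 0.9 and the growth limits 0.2 and 5 are conventional but arbitrary choices. The caller supplies only a tolerance and the step size is chosen from a local error estimate.

```python
import math

def adaptive_heun_euler(f, t0, y0, t_end, tol=1e-6, h=0.1):
    """Integrate the scalar ODE y' = f(t, y) from t0 to t_end.

    The step size h is adapted from the difference between a first-order
    (Euler) and a second-order (Heun) estimate of the same step.
    """
    t, y, accepted = t0, y0, 0
    while t < t_end:
        h = min(h, t_end - t)                 # do not step past t_end
        k1 = f(t, y)
        k2 = f(t + h, y + h * k1)
        y_euler = y + h * k1                  # first-order estimate
        y_heun = y + 0.5 * h * (k1 + k2)      # second-order estimate
        err = abs(y_heun - y_euler)           # local error estimate
        if err <= tol:                        # accept the step
            t, y, accepted = t + h, y_heun, accepted + 1
        # Rescale h for the next attempt (also after a rejected step).
        factor = 0.9 * math.sqrt(tol / err) if err > 0.0 else 5.0
        h *= min(5.0, max(0.2, factor))
    return y, accepted

# Test problem y' = -y, y(0) = 1, whose exact solution is exp(-t).
y_end, n_steps = adaptive_heun_euler(lambda t, y: -y, 0.0, 1.0, 1.0)
```

Note that the controller still contains constants someone had to pick, which is the point being argued from the other side of this thread.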

Hmm. In my edition (2nd, Springer, 2006) page 37 has Algorithm 3.1, which has 2 hyperparameters (c and rho). What is the theory for choosing the best values of c and rho?

I would say the backtracking discrete methods (e.g. page 37 etc.). The continuous analogues (ODE solvers) have all this built in, so as 'user' you just give a tolerance and the solver does the rest instead of having to choose from a palette of learning rates. Backtracking is well-established in numerical analysis.
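For readers without the book to hand, a minimal sketch of a backtracking (Armijo) line search in the spirit of the algorithm being discussed, written for a scalar variable to keep it short. The defaults `alpha_bar=1.0`, `rho=0.5` and `c=1e-4` are common illustrative values, not recommendations from the book, and the test function is made up.

```python
def backtracking_line_search(f, grad_f, x, p, alpha_bar=1.0, rho=0.5, c=1e-4):
    """Shrink the step length until the sufficient-decrease condition holds.

    Starts from alpha_bar and multiplies by rho in (0, 1) until
    f(x + alpha * p) <= f(x) + c * alpha * grad_f(x) * p, with c in (0, 1).
    p must be a descent direction, i.e. grad_f(x) * p < 0.
    """
    fx = f(x)
    slope = grad_f(x) * p          # directional derivative at alpha = 0
    alpha = alpha_bar
    while f(x + alpha * p) > fx + c * alpha * slope:
        alpha *= rho
    return alpha

# Illustrative problem: f(x) = x**4 + x**2, steepest descent from x0 = 2.
f = lambda x: x ** 4 + x ** 2
grad_f = lambda x: 4.0 * x ** 3 + 2.0 * x

x0 = 2.0
p0 = -grad_f(x0)                   # steepest-descent direction, here -36
alpha = backtracking_line_search(f, grad_f, x0, p0)
x1 = x0 + alpha * p0
```

Besides c and rho, the initial trial step `alpha_bar` is a third quantity the user (or the calling method) has to supply.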

- Cuchulainn

https://www.theregister.co.uk/2018/07/2 ... ion_sucks/

ISM: how do you think the crooks feel about it.

It's not there yet.

Yes, still a way to go

But at least we now have an inexhaustible supply of really bad Tolkien fan fiction.

“Then we’ll keep it alive as long as we live,” added Legolas. “And we won’t forget the first great battle of the night, even if we may have forgotten the final defeat.”

“I agree,” Gandalf said, “but we will all remember it as the last battle in Middle-earth, and the first great battle of the new day.”

Aragorn drew his sword, and the Battle of Fangorn was won. As they marched out through the thicket the morning mist cleared, and the day turned to dusk.

- Cuchulainn

It won't be long before you can do a PhD in a liberal arts college on

"Finnegans Wake: a Hidden Markov Model and Bayesian Network Approach"

riverrun, past Eve and Adam's, from swerve of shore to bend of bay, brings us by a commodius vicus of recirculation back to Howth Castle and Environs. Sir Tristram, violer d'amores, fr'over the short sea, had passencore rearrived from North Armorica on this side the scraggy isthmus of Europe Minor to wielderfight his penisolate war: nor had topsawyer's rocks by the stream Oconee exaggerated themselse to Laurens County's gorgios while they went doublin their mumper all the time: nor avoice from afire bellowsed mishe mishe to tauftauf thuartpeatrick: not yet, though venissoon after, had a kidscad buttended a bland old isaac: not yet, though all's fair in vanessy, were sosie sesthers wroth with twone nathandjoe. Rot a peck of pa's malt had Jhem or Shen brewed by arclight and rory end to the regginbrow was to be seen ringsome on the aquaface. The fall (bababadalgharaghtakamminarronnkonnbronntonnerronntuonnthunntrovarrhounawnskawntoohoohoordenenthurnuk!) of a once wallstrait oldparr is retaled early in bed and later on life down through all christian minstrelsy


- Cuchulainn

In general, there are 3 layers between Joyce's *effects* and *causes*. Great PhD topic.

And domain annotations are essential to build the influence diagram.

BTW what's a good C++ open-source for HMMs and BNs?


- Cuchulainn

LOR is 1-dimensional (all live happily ever after) while PW is a vicus cycle.
