SERVING THE QUANTITATIVE FINANCE COMMUNITY

outrun
Posts: 4573
Joined: April 29th, 2016, 1:40 pm

### Re: If you are bored with Deep Networks

I think it's impossible to construct a NN that has a loss surface like that.

Your suggestion to test GD on that loss function has already been done; that's where you got it from, right? But GD != NN. Depending on the types of admissible loss functions (by a NN in this case) you would need to pick a loss minimisation algorithm. If the surface had little holes in it through which you would drop to a lower level, one would have to resort to a grid search. However, that's also not a type of surface you would see in a NN. So you need to test it differently. (And then later you need to understand that we don't want to find the global minimum, because the loss function we use is not the one we want to learn.)

So to test the performance of GD on learning the weights of a NN you would need to evaluate the performance of GD on a NN test case, e.g. a NN that has to learn to map inputs to some outputs.

The terminology is like this:

cost = loss_function(error)
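
A minimal sketch of such a test case, assuming the smallest possible "network" (a single linear neuron) and a mean-squared-error loss. The names `loss_function` and `train`, the toy data, the learning rate and the step count are all illustrative choices, not anything specified in this thread:

```python
def loss_function(errors):
    """Mean squared error over a list of per-sample errors."""
    return sum(e * e for e in errors) / len(errors)


def train(inputs, targets, lr=0.1, steps=2000):
    """Plain gradient descent on a single linear neuron y = w*x + b."""
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(steps):
        # error = prediction - target, one value per training sample
        errors = [w * x + b - t for x, t in zip(inputs, targets)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, inputs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    errors = [w * x + b - t for x, t in zip(inputs, targets)]
    cost = loss_function(errors)  # cost = loss_function(error)
    return w, b, cost


# Toy mapping to learn (an assumption for illustration): y = 2x + 1.
inputs = [0.0, 1.0, 2.0, 3.0]
targets = [2.0 * x + 1.0 for x in inputs]
w, b, cost = train(inputs, targets)
print(w, b, cost)
```

This particular loss surface is convex, so it only shows the mechanics of evaluating GD on an input-to-output task; it says nothing about harder surfaces.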

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: If you are bored with Deep Networks

And what the hell is your point, T4A?
Math is not robust in the broader physical sense.

Math may be extremely, even perfectly and provably, robust within its closed-world domain of logic, but to the extent that there are any discrepancies between the selected math and the selected physical system, the results of all that math may be logically correct and practically wrong.
Yeah, what did mathematicians ever do for us?
I love math and fully appreciate that mathematicians do a lot for us. It's just that math is a hammer -- very hard and powerful if one has nails but not always useful in every situation.

A lot of this discussion reminds me of the difference between using analytic or PDE methods in quant finance versus using brute force simulation methods. If the underlying system can be defined in nice analytic functions then maybe analytic/PDE methods have a chance of producing extremely robust and elegant proofs of results. But if the underlying system has complex nonlinearities in the dynamics of the underlying or in the pay-off, then simulation might be the only possible hope for some sort of estimate (but without the robustness of the math approach).
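
To make the analytic-versus-simulation contrast concrete, here is a rough sketch that prices one European call both ways. The Black-Scholes setting, the parameter values and the function names (`bs_call`, `mc_call`) are assumptions chosen for illustration; the post itself names no specific model:

```python
import math
import random


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, vol, maturity):
    """Closed-form Black-Scholes price of a European call."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def mc_call(spot, strike, rate, vol, maturity, n_paths, seed=0):
    """Brute-force Monte Carlo estimate of the same price."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol * vol) * maturity
    diffusion = vol * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        # Terminal price under risk-neutral geometric Brownian motion.
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        total += max(terminal - strike, 0.0)
    return math.exp(-rate * maturity) * total / n_paths


analytic = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
simulated = mc_call(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=50000)
print(analytic, simulated)
```

The closed form is exact under its assumptions, while the simulated value carries sampling error; the simulation, however, only needs the pay-off line changed to handle a pay-off with no closed form.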

Looking at the stochastic gradient descent (https://en.wikipedia.org/wiki/Stochasti ... nt_descent) from the standpoint of feasibility of using pure math, I can't help but think that math will only work for restricted subsets of loss functions and variants of the algorithm. For example, maybe if we restrict the loss function to be a polynomial and we restrict the stochastic gradient descent algorithm to one of the simpler variants (e.g., averaging), then theorems and proofs about performance might be possible. But what if most real-world loss functions are not polynomials and numerical testing finds that one of the more complex variants of stochastic gradient descent does a much better job at DL? We're faced with a choice between: A) an inapplicable and inferior stochastic gradient descent system for which we have proof of robustness versus B) an applicable and empirically superior stochastic gradient descent system that lacks a mathematical foundation. Picking A is like the drunk who looks for his keys near the lamp post because the light is better even if the keys are certainly not there.
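
As a hedged illustration of the "simple" end of that spectrum, the sketch below runs SGD with iterate averaging on a one-dimensional quadratic (hence polynomial) loss with noisy gradients. The loss, noise level, step size and the names `averaged_sgd` and `noisy_quadratic_grad` are all assumptions made up for this example:

```python
import random


def averaged_sgd(grad_sample, w0, lr, steps, seed=0):
    """Constant-step SGD that also tracks the running average of the iterates."""
    rng = random.Random(seed)
    w = w0
    w_avg = 0.0
    for t in range(1, steps + 1):
        w -= lr * grad_sample(w, rng)
        # Incremental mean of all iterates seen so far.
        w_avg += (w - w_avg) / t
    return w, w_avg


def noisy_quadratic_grad(w, rng):
    """Gradient of 0.5 * (w - 3)^2 plus unit-variance Gaussian noise."""
    return (w - 3.0) + rng.gauss(0.0, 1.0)


last_w, avg_w = averaged_sgd(noisy_quadratic_grad, w0=0.0, lr=0.1, steps=20000)
print(last_w, avg_w)
```

With a constant step the last iterate keeps jittering around the minimiser at 3, while the average settles much closer to it. This is the kind of restricted setting where clean statements are plausible; it makes no claim about the variants actually used in DL.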

Maybe neural nets are not nails and pure math is unlikely to create useful results even if math can often produce very elegant results in many other domains.

P.S. Evolution is full of adversarial examples. If you are a bee, your little neural net with only a 5000-pixel image needs to decide if this is a nice nectar snack or a death trap:

P.P.S. A second-order question is to determine whether a given DL application really contains true adversarial examples and whether the system has any hope of dealing with them. If the sensory input of an adversarial example can be statistically indistinguishable from the sensory input of the target of the adversarial example, then there's nothing DL can do to solve the problem.

ISayMoo
Topic Author
Posts: 2275
Joined: September 30th, 2015, 8:30 pm

### Re: If you are bored with Deep Networks

Maybe neural nets are not nails and pure math is unlikely to create useful results even if math can often produce very elegant results in many other domains.
Positive results may be artificial ("if you assume A, B and C then X follows"), but negative results can be surprisingly general ("X does not exist"). Regardless of whether you're drunk or sober, you will save a lot of time if you don't search for things which are proven not to exist.

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: If you are bored with Deep Networks

Maybe neural nets are not nails and pure math is unlikely to create useful results even if math can often produce very elegant results in many other domains.
Positive results may be artificial ("if you assume A, B and C then X follows"), but negative results can be surprisingly general ("X does not exist"). Regardless of whether you're drunk or sober, you will save a lot of time if you don't search for things which are proven not to exist.
Excellent point!

So is there a higher-order branch of math that can prove whether certain categories of proofs (e.g., solution existence/nonexistence/robustness) exist in certain categories of systems (e.g., NN or SGD)? Before attempting to build a mathematical foundation for NN or other DL algorithms it would be nice to know if such foundations exist.

ISayMoo
Topic Author
Posts: 2275
Joined: September 30th, 2015, 8:30 pm

### Re: If you are bored with Deep Networks

LOL

katastrofa
Posts: 8758
Joined: August 16th, 2007, 5:36 am
Location: Alpha Centauri

### Re: If you are bored with Deep Networks

OT, isn't one of the favourite questions of quantum information theory about the existence of unicorns?

ISayMoo
Topic Author
Posts: 2275
Joined: September 30th, 2015, 8:30 pm

### Re: If you are bored with Deep Networks

With or without the rainbow attached at either end?

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: If you are bored with Deep Networks

Presumably unicorn rainbows contain emission lines from unobtainium.

katastrofa
Posts: 8758
Joined: August 16th, 2007, 5:36 am
Location: Alpha Centauri

### Re: If you are bored with Deep Networks

No, that was a leprechaun with a pot of gold at rainbow's end. Seriously, it's good to first clarify whether you're speaking from the stance of scientific realism (the philosophy arguing that mathematical theories describe real objects and have logical sense regardless of our perception; and that we can asymptotically approach the absolute and ultimate truth through research) or anti-realism. IMHO, the first is still appealing only to pop-culture (Hawking or Weinberg's unified theory) and particle physicists (who think that they are touching the hand of god, and indeed only god knows what really are the resonances they called a Higgs boson and what is "5*sigma" in their gravitational waves measurements - I know what is 5, I just don't know what is sigma). The main body of science seems to have shifted to a more sceptical and modest antirealist philosophy: there is no absolute truth, no ultimate theory, and our models are mere useful tools.

BTW, imagine brain scientists discussing consciousness from the realist perspective: "our consciousness exists beyond our consciousness", what?

Cuchulainn
Posts: 61155
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam

### Re: If you are bored with Deep Networks

42

Do you carry out sanity checks (upstream and downstream) before jumping into code? Mathematicians do. Plan B is extensive numerical experimentation and/or pretending a solution exists if you have an idea of what kind of input you have.

http://www.math.ucla.edu/~lvese/273.1.06f/Summary.pdf

I have some related questions about Universal approximators later.
Last edited by Cuchulainn on November 15th, 2017, 7:07 pm, edited 1 time in total.
http://www.datasimfinancial.com
http://www.datasim.nl

Every Time We Teach a Child Something, We Keep Him from Inventing It Himself
Jean Piaget

Posts: 23951
Joined: September 20th, 2002, 8:30 pm

### Re: If you are bored with Deep Networks

As you've defined them, I'd put myself more in the anti-realist camp. I strongly suspect there's a physical world outside the human mind, I'm very certain there's a logical mathematical world inside the human mind, and I notice there's a growing body of empirical evidence that suggests that certain parts of the logical mathematical world do accurately predict certain parts of the physical world within some error bounds. Maybe there's a holy grail math that is perfectly congruent with the physical world but maybe the human mind is not sufficiently powerful or properly structured to find that congruence. I seriously doubt today's theories are those true, congruent ones although it's amusing that each generation of scientists arrogantly thinks they have found "the answer."

I've spent enough time with cognitive and neuroscientists to know that the human brain and senses are deeply flawed in an absolute sense even if they are extremely useful in a relative sense. Amusingly, those flaws in the brain nicely predict the arrogance of "successful" scientists, mathematicians, politicians, business executives, etc.

I'd like to think that human math is universal but can't help but wonder if some day we'll meet an extraterrestrial race that does not have integers in their mathematical repertoire and instead uses some crazy fuzzy-logic measure theory system that humans have never considered. Those extraterrestrials may have a powerful corpus of their math to explain the universe that bears little resemblance to the math humans use to explain the universe. Their math may be just as good as our math although maybe their error bars and sigmas occur in different places than ours.

My wife is currently writing a report about an extremely large company and the company is refusing to tell her how many factories they have. It seems like such a simple objective integer value. How can they not know? But in studying how they operate, I see how the number really can't be defined. They have some very large physical plant locations that are being operated as multiple factories, they have some multi-location systems that are being operated as one virtual factory, and they do a lot of acquisitions, sell-offs, and closures so there's uncertainty about which facilities are legally in or out of their network on any given date. The notion of discrete objects seems quite nice for a being with foveal sensors and only two hands but that does not mean the physical world has discrete objects in reality. In biology, the concept of discrete species and even discrete individuals falls apart no matter what definition one uses.

So do integers exist in the real world or are they just a convenient approximation for which the human mind has a strong subjective preference?

ISayMoo
Topic Author
Posts: 2275
Joined: September 30th, 2015, 8:30 pm

### Re: If you are bored with Deep Networks

I think it's impossible to construct a NN that has a loss surface like that.
Actually, the properties of NN loss surfaces depend quite a lot on the dataset: see this paper

"It is widely believed that training of deep models using gradient methods works so well because the error surface either has no local minima, or if they exist they need to be close in value to the global minimum. It is known that such results hold under very strong assumptions which are not satisfied by real models. In this paper we present examples showing that for such theorem to be true additional assumptions on the data, initialization schemes and/or the model classes have to be made. We look at the particular case of finite size datasets. We demonstrate that in this scenario one can construct counter-examples (datasets or initialization schemes) when the network does become susceptible to bad local minima over the weight space."

Note that their results also cast a slight shadow of doubt on the results of the paper I linked to today in my previous post. Other nice results were also proven under optimistic and, as admitted by the author, somewhat unrealistic assumptions.
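
A toy illustration in the same spirit (not the paper's construction; the one-neuron ReLU model, the three-point dataset and both initializations are assumptions invented here): on a finite dataset, the same gradient descent either fits the data or stays stuck at a bad flat minimum, depending only on where it starts.

```python
def relu_neuron_loss(w, b, xs, ys):
    """Mean squared error of the one-neuron model max(0, w*x + b)."""
    return sum((max(0.0, w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(w, b, xs, ys, lr=0.05, steps=5000):
    """Plain full-batch gradient descent on the loss above."""
    n = len(xs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            pre = w * x + b
            if pre > 0.0:  # the ReLU passes gradient only where it is active
                err = pre - y
                grad_w += 2.0 * err * x / n
                grad_b += 2.0 * err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

# Start with the neuron active on every sample: GD reaches w = 1, b = 0.
good_w, good_b = gradient_descent(0.5, 0.0, xs, ys)
good_loss = relu_neuron_loss(good_w, good_b, xs, ys)

# Start with the neuron inactive on every sample: the gradient is exactly
# zero, the weights never move, and the loss stays at mean(y^2).
bad_w, bad_b = gradient_descent(-1.0, -1.0, xs, ys)
bad_loss = relu_neuron_loss(bad_w, bad_b, xs, ys)

print(good_loss, bad_loss)
```

The second start sits in a region where the loss is locally constant, so it is a (non-strict) local minimum with a value far above the global one.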

Paul
Posts: 10073
Joined: July 20th, 2001, 3:28 pm

### Re: If you are bored with Deep Networks

Is there an equivalent of local and global minima in the training of actual neurons? I mean the ones in your head!

Wilmott.com has been "Serving the Quantitative Finance Community" since 2001. Continued...
