And what the hell is your point, T4A?
Math is not robust in the broader physical sense.
Math may be extremely, even perfectly and provably, robust within its closed-world domain of logic, but to the extent that there are any discrepancies between the selected math and the selected physical system, the results of all that math may be logically correct and practically wrong.
Yeah, what did mathematicians ever do for us?
I love math and fully appreciate that mathematicians do a lot for us. It's just that math is a hammer -- very hard and powerful if one has nails, but not useful in every situation.
A lot of this discussion reminds me of the difference between using analytic or PDE methods in quant finance versus using brute force simulation methods. If the underlying system can be defined in nice analytic functions then maybe analytic/PDE methods have a chance of producing extremely robust and elegant proofs of results. But if the underlying system has complex nonlinearities in the dynamics of the underlying or in the pay-off, then simulation might be the only possible hope for some sort of estimate (but without the robustness of the math approach).
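To make the contrast concrete, here is a minimal sketch (my own toy setup, not anything from a real pricing library) that prices a plain European call two ways: with the Black-Scholes closed form and with brute-force simulation of the terminal price. The function names `bs_call` and `mc_price`, the parameter values, and the lognormal dynamics are all assumptions chosen for illustration.

```python
import math
import random


def bs_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call (analytic route)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)

    def norm_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def mc_price(payoff, s0, r, sigma, t, n_paths=100_000, seed=0):
    """Monte Carlo price of any payoff on the terminal price (simulation route).

    Assumes lognormal (geometric Brownian motion) dynamics, purely for illustration.
    Returns (discounted mean payoff, standard error of that estimate).
    """
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        p = payoff(s_t)
        total += p
        total_sq += p * p
    mean = total / n_paths
    var = max(total_sq / n_paths - mean ** 2, 0.0)
    disc = math.exp(-r * t)
    return disc * mean, disc * math.sqrt(var / n_paths)


# Hypothetical contract: at-the-money call, 1 year, 5% rate, 20% volatility.
exact = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
estimate, std_error = mc_price(lambda s: max(s - 100.0, 0.0), 100.0, 0.05, 0.2, 1.0)
print(f"closed form: {exact:.4f}")
print(f"simulation:  {estimate:.4f} +/- {std_error:.4f}")
```

The closed form gives an exact number for this one payoff. The simulation only gives an estimate with an error bar, but swapping in an awkward nonlinear payoff is a one-line change to the `payoff` argument, whereas the analytic route would need a new derivation that may not exist.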
Looking at stochastic gradient descent (https://en.wikipedia.org/wiki/Stochasti ... nt_descent) from the standpoint of the feasibility of using pure math, I can't help but think that math will only work for restricted subsets of loss functions and variants of the algorithm. For example, maybe if we restrict the loss function to be a polynomial and we restrict the stochastic gradient descent algorithm to one of the simpler variants (e.g., averaging), then theorems and proofs about performance might be possible. But what if most real-world loss functions are not polynomials and numerical testing finds that one of the more complex variants of stochastic gradient descent does a much better job at DL? We're faced with a choice between A) an inapplicable and inferior stochastic gradient descent system for which we have proof of robustness and B) an applicable and empirically superior stochastic gradient descent system that lacks a mathematical foundation. Picking A is like the drunk who looks for his keys near the lamp post because the light is better there, even though the keys are certainly not there.
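For reference, the "nice" restricted case I have in mind looks roughly like the sketch below: a polynomial (here quadratic) loss and the averaging variant of stochastic gradient descent. Everything in it is a toy assumption of mine: the helper `averaged_sgd`, the target `MU`, the noise model, and the step size. It only illustrates the tractable end of the spectrum and says nothing about real DL loss surfaces.

```python
import random


def averaged_sgd(grad_sample, w0, lr, n_steps, rng):
    """Plain SGD with a running average of the iterates (the averaging variant).

    grad_sample(w, rng) must return one noisy gradient evaluated at w.
    Returns (last iterate, average of all iterates including w0).
    """
    w = w0
    w_avg = w0
    for step in range(1, n_steps + 1):
        w -= lr * grad_sample(w, rng)
        # Incremental mean over the step + 1 iterates seen so far.
        w_avg += (w - w_avg) / (step + 1)
    return w, w_avg


# Toy polynomial loss: L(w) = E[0.5 * (w - x)^2] with x ~ Normal(MU, 1).
# Its minimizer is w = MU, and one-sample gradients are simply w - x.
MU = 3.0


def noisy_quadratic_grad(w, rng):
    return w - rng.gauss(MU, 1.0)


rng = random.Random(0)
w_last, w_avg = averaged_sgd(noisy_quadratic_grad, w0=0.0, lr=0.1, n_steps=5000, rng=rng)
print(f"last iterate:     {w_last:.3f}")
print(f"averaged iterate: {w_avg:.3f}")
print(f"true minimizer:   {MU:.3f}")
```

With a constant step size the last iterate keeps jittering around the minimizer, while the averaged iterate settles much closer to it. That kind of behavior is provable for a loss this simple, which is exactly why it is the well-lit lamp post.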
Maybe neural nets are not nails and pure math is unlikely to create useful results even if math can often produce very elegant results in many other domains.
P.S. Evolution is full of adversarial examples. If you are a bee, your little neural net with only a 5000-pixel image needs to decide if this is a nice nectar snack or a death trap:
P.P.S. A second-order question is to determine whether a given DL application really contains true adversarial examples and whether the system has any hope of dealing with them. If the sensory input of an adversarial example can be statistically indistinguishable from the sensory input of the target of the adversarial example, then there's nothing DL can do to solve the problem.
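The indistinguishable case can be written down in one line. This is a sketch in my own notation (none of these symbols come from the thread): $x$ is the sensory input, $\pi$ is the prior probability that an input comes from the genuine target, and $p(x \mid \cdot)$ are the class-conditional input densities.

```latex
% Assumption: the adversarial and target inputs have the same distribution,
%   p(x | adv) = p(x | target) for every input x.
% Then Bayes' rule gives a posterior that ignores x entirely:
\[
  P(\text{target} \mid x)
  = \frac{\pi \, p(x \mid \text{target})}
         {\pi \, p(x \mid \text{target}) + (1 - \pi) \, p(x \mid \text{adv})}
  = \pi ,
\]
% so no classifier, however it is trained, can have an error rate below
\[
  \min(\pi, \; 1 - \pi).
\]
```

In other words, the input carries no information about which case you are in, so the best any system can do is guess the more common case every time.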