
 
User avatar
Cuchulainn
Topic Author
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: Universal Approximation theorem

November 6th, 2019, 7:59 pm

A loss surface of a neural network can approximate anything. Including cows.
This paper seems quite interesting, but I spent two hours trying, unsuccessfully, to understand it. But nice cows indeed!
Is the experience cathartic?
 
User avatar
JohnLeM
Posts: 379
Joined: September 16th, 2008, 7:15 pm

Re: Universal Approximation theorem

November 7th, 2019, 7:09 am

A loss surface of a neural network can approximate anything. Including cows.
This paper seems quite interesting, but I spent two hours trying, unsuccessfully, to understand it. But nice cows indeed!
Is the experience cathartic?
I am not sure if it is cathartic or not. I will try again and see if I can get out of my body. I'll let you know.
But I agree with you: a lot of authors out there should reconsider the link to their readers.
 
User avatar
Cuchulainn
Topic Author
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: Universal Approximation theorem

November 7th, 2019, 8:42 am

But I agree with you: a lot of authors out there should reconsider the link to their readers.

If both you and I have difficulty, what about freshly-minted data scientists? arXiv is a litany of unreadable/unrefereed papers.

Documentation and writing skills have never been of much interest to CS. Most articles take no account of the fact that 95% of people don't do AI/ML.
 
User avatar
JohnLeM
Posts: 379
Joined: September 16th, 2008, 7:15 pm

Re: Universal Approximation theorem

November 7th, 2019, 9:38 am

I tried again for an hour this morning, but it is really unclear. However, one of the included references seems more detailed and clearer; I'll fall back on it to understand their work.
 
User avatar
Cuchulainn
Topic Author
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: Universal Approximation theorem

November 8th, 2019, 11:31 am

In a sense, all of the ML articles to date have missed the mark because they are based on statistics and linear algebra (no metrics, no norms, and basically lacking in hard functional analysis). Sapir-Whorf in action (the language you use shapes how you think). The current methods might be barking up the wrong tree and may not even be wrong.
An alternative is to embed probability distributions in a reproducing kernel Hilbert space (RKHS); the analysis then takes place in the context of inner product spaces by defining feature maps and a kernel mean.

A special case is the kernel trick in SVM.
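
To make the feature map / kernel mean idea concrete, here is a minimal numpy sketch (the Gaussian kernel, the 1-d samples and all names are my own illustrative choices, not from a particular paper): a sample is mapped to the empirical mean of its feature maps, and by the reproducing property every computation reduces to averages of kernel evaluations, the same trick SVMs exploit.

[code]
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel k(x, y) = exp(-(x - y)^2 / (2 sigma^2)), 1-d samples."""
    return np.exp(-np.subtract.outer(x, y) ** 2 / (2.0 * sigma ** 2))

def kernel_mean(sample, sigma=1.0):
    """Empirical kernel mean embedding mu(t) = (1/n) sum_i k(x_i, t).
    The sample is represented as a *function* in the RKHS; inner products
    with it are plain averages of kernel evaluations."""
    def mu(t):
        return gaussian_kernel(sample, np.atleast_1d(t), sigma).mean(axis=0)
    return mu

rng = np.random.default_rng(0)
mu_p = kernel_mean(rng.normal(0.0, 1.0, 1000))  # embedding of an N(0,1) sample
mu_q = kernel_mean(rng.normal(0.5, 1.0, 1000))  # embedding of an N(0.5,1) sample

t = np.linspace(-3.0, 3.0, 7)
print(mu_p(t))  # the embedding, evaluated pointwise
print(mu_q(t))  # shifted, because the underlying distribution is shifted
[/code]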

Much of RKHS rests on the magical Riesz Representation Theorem.

https://en.wikipedia.org/wiki/Riesz_rep ... on_theorem

JohnLeM is the resident expert on RKHS; please feel free to correct any flaws.

// BTW kernels can be characterised as being universal, characteristic, translation-invariant, strictly positive-definite. What's that?
 
User avatar
Cuchulainn
Topic Author
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: Universal Approximation theorem

November 8th, 2019, 11:50 am

Here is a nice paper on kernel methods to test if two samples are from different distributions. If I didn't know any better, I would be inclined to say that there be a Cauchy sequence hiding in equation (2).

https://arxiv.org/abs/0805.2368
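
To give the flavour, here is a sketch of the kind of statistic that paper is built around: the (unbiased) squared MMD between two samples, which is exactly the RKHS distance between their kernel mean embeddings. The Gaussian kernel, bandwidth, and the naive permutation test below are my own choices, not the paper's setup.

[code]
import numpy as np

def mmd2_unbiased(x, y, sigma=1.0):
    """Unbiased estimator of squared MMD between 1-d samples x and y with a
    Gaussian kernel; the diagonal k(x_i, x_i) terms are excluded from the
    within-sample averages."""
    k = lambda a, b: np.exp(-np.subtract.outer(a, b) ** 2 / (2.0 * sigma ** 2))
    m, n = len(x), len(y)
    kxx, kyy = k(x, x), k(y, y)
    return ((kxx.sum() - np.trace(kxx)) / (m * (m - 1))
            + (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
            - 2.0 * k(x, y).mean())

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 400)
y = rng.normal(0.3, 1.0, 400)
stat = mmd2_unbiased(x, y)

# Crude permutation test: pool the samples, reshuffle, and see how often a
# random relabelling gives a statistic at least as large as the observed one.
pooled = np.concatenate([x, y])
exceed = 0
for _ in range(200):
    rng.shuffle(pooled)
    exceed += mmd2_unbiased(pooled[:400], pooled[400:]) >= stat
print(f"MMD^2 = {stat:.4f}, permutation p-value ~ {exceed / 200:.3f}")
[/code]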
 
User avatar
katastrofa
Posts: 7440
Joined: August 16th, 2007, 5:36 am
Location: Alpha Centauri

Re: Universal Approximation theorem

November 8th, 2019, 12:31 pm

Sounds like a standard PCA to me. You know, the non-parametric unsupervised learning technique developed by ML guys.
 
User avatar
Cuchulainn
Topic Author
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: Universal Approximation theorem

November 8th, 2019, 1:38 pm

Sounds like a standard PCA to me. You know, the non-parametric unsupervised learning technique developed by ML guys.
You mean this?
http://citeseerx.ist.psu.edu/viewdoc/su ... .1.29.1366

Almost. They applied kernel methods (from 1907) to PCA (Pearson, 1901). Kernel methods from Hilbert, Schmidt, Volterra etc. On giants' shoulders.
It contains kernel PCA as an instance/special case. There are many others.
But the main point IMO is that the methods of applied functional analysis are being used.

It's worth investigating, and you seem to be suggesting that it is well known and universal in ML, which it seems it is not. I could be wrong.

Q: people use the term ML instead of Statistical Learning. Is it sexier?
Why not call it multivariate statistics and be done with it (parsimony >> verbosity, but it may not sell well).
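
For the curious, a from-scratch numpy sketch of kernel PCA along the lines of the paper linked above (as I understand it; the Gaussian kernel and the toy data are my own choices, and this is illustrative rather than a reference implementation):

[code]
import numpy as np

def kernel_pca(X, n_components=2, sigma=1.0):
    """Kernel PCA with a Gaussian kernel:
    1. Gram matrix K_ij = k(x_i, x_j);
    2. double-centre it (PCA needs centred features, done implicitly here);
    3. eigendecompose; projections are sqrt(eigenvalue) * eigenvector."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    K = np.exp(-sq / (2.0 * sigma ** 2))
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                       # centring in feature space
    vals, vecs = np.linalg.eigh(Kc)                      # ascending eigenvalues
    vals = vals[::-1][:n_components]
    vecs = vecs[:, ::-1][:, :n_components]
    return vecs * np.sqrt(np.maximum(vals, 0.0))

# A noisy circle: linear PCA cannot 'unfold' it, kernel PCA can.
rng = np.random.default_rng(7)
theta = rng.uniform(0.0, 2.0 * np.pi, 200)
circle = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(200, 2))
print(kernel_pca(circle).shape)  # (200, 2)
[/code]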
 
User avatar
ISayMoo
Posts: 2332
Joined: September 30th, 2015, 8:30 pm

Re: Universal Approximation theorem

November 8th, 2019, 9:03 pm

For me it's perfectly clear what the article is saying.
 
User avatar
Cuchulainn
Topic Author
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: Universal Approximation theorem

November 8th, 2019, 9:46 pm

For me it's perfectly clear what the article is saying.
I would hope so.
Unfortunately, that's irrelevant to the other stakeholders.

Edit: Maybe the threshold can be lowered by more tutorial-style articles, e.g. "RKHS for the impatient".
 
User avatar
JohnLeM
Posts: 379
Joined: September 16th, 2008, 7:15 pm

Re: Universal Approximation theorem

November 12th, 2019, 10:22 am

For me it's perfectly clear what the article is saying.
After reading one of their included references, it is now clearer. I was unable to understand their log-entropy functional (3) without this reading. Frankly, they could have developed it a little bit more to make it understandable, or at least added a "see [] for details".
 
User avatar
JohnLeM
Posts: 379
Joined: September 16th, 2008, 7:15 pm

Re: Universal Approximation theorem

November 12th, 2019, 11:44 am

// BTW kernels can be characterised as being universal, characteristic, translation-invariant, strictly positive-definite. What's that?
Strictly positive-definite kernels on [$]\Omega[$] are functions [$]k(x,y)[$] such that [$](k(x^i,x^j))_{i,j \le N}[$] is a symmetric positive-definite matrix for any set of distinct points [$]x^i \in \Omega[$].
Translation-invariant kernels are kernels of the form [$]k(x,y) = \varphi(x-y)[$].
I found some definitions of universal kernels in https://arxiv.org/pdf/1003.0887.pdf. To me it might be a slightly obsolete definition, telling you that a kernel can reproduce any continuous function in some Banach space.
I found the definition of characteristic kernels in this reference: https://www.ism.ac.jp/~fukumizu/papers/fukumizu_etal_nips2007_extended.pdf. To me it is a slightly strange definition. As far as I understood: consider a kernel [$]k(x,y)[$] generating a space of functions [$]H_k[$]. It is said to be characteristic if, for any two probability measures [$]\mu, \nu[$], having [$]\int f(x) d\mu = \int f(x) d\nu[$] for all [$]f \in H_k[$] implies [$]\mu = \nu[$]. I would say that both definitions (characteristic and universal kernels) are almost surely equivalent ;)
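
A quick numerical illustration of the strictly positive-definite property above, using the Gaussian kernel (my own example, not from either reference):

[code]
import numpy as np

# For a strictly positive-definite kernel, the Gram matrix on any set of
# distinct points has strictly positive eigenvalues (up to round-off).
rng = np.random.default_rng(3)
x = np.sort(rng.uniform(-5.0, 5.0, 10))          # 10 distinct points in Omega = R
K = np.exp(-np.subtract.outer(x, x) ** 2 / 2.0)  # k(x,y) = exp(-(x-y)^2 / 2)
print(np.linalg.eigvalsh(K))                     # all eigenvalues > 0
[/code]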
 
User avatar
Cuchulainn
Topic Author
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: Universal Approximation theorem

December 13th, 2020, 5:10 pm

time for reflection...


Some updates on Cybenko's holy Universal Approximation Theorem (1989). Has it stood the test of time?

“Classical measure theory is fundamentally non-constructive, since the classical definition of Lebesgue measure does not describe any way to compute the measure of a set or the integral of a function. In fact, if one thinks of a function just as a rule that "inputs a real number and outputs a real number" then there cannot be any algorithm to compute the integral of a function, since any algorithm would only be able to call finitely many values of the function at a time, and finitely many values are not enough to compute the integral to any nontrivial accuracy.”



All fine, but where does measure theory help in approximation theory, functional and numerical analysis? How would you answer as a

. Data scientist
. Computer scientist
. Pure mathematician
. Physicist
. Philosopher (the full cycle)
 
Nice warm feelings, but we need a priori estimates.
It is now almost 2021.
 
https://en.wikipedia.org/wiki/Universal ... on_theorem
 
User avatar
Cuchulainn
Topic Author
Posts: 20252
Joined: July 16th, 2004, 7:38 am
Location: 20, 000

Re: Universal Approximation theorem

January 16th, 2021, 6:09 pm

Deep, Skinny Neural Networks are not Universal Approximators

https://arxiv.org/abs/1810.00393

trial and error?
 
User avatar
JohnLeM
Posts: 379
Joined: September 16th, 2008, 7:15 pm

Re: Universal Approximation theorem

January 29th, 2021, 5:49 pm

Deep, Skinny Neural Networks are not Universal Approximators

https://arxiv.org/abs/1810.00393

trial and error?
I just quickly read this paper. @Cuchulainn, thank you, it is very valuable!