
 
User avatar
Cuchulainn
Topic Author
Posts: 60256
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: Universal Approximation theorem

November 6th, 2019, 7:59 pm

A loss surface of a neural network can approximate anything. Including cows.
This paper seems quite interesting, but I spent two hours trying unsuccessfully to understand it. But nice cows indeed!
Is the experience cathartic?
 
User avatar
JohnLeM
Posts: 362
Joined: September 16th, 2008, 7:15 pm

Re: Universal Approximation theorem

November 7th, 2019, 7:09 am

A loss surface of a neural network can approximate anything. Including cows.
This paper seems quite interesting, but I spent two hours trying unsuccessfully to understand it. But nice cows indeed!
Is the experience cathartic?
I am not sure whether it is cathartic or not. I will try again and see if I can get out of my body. I'll let you know.
But I agree with you: a lot of authors out there should reconsider their relationship with their readers.
 
User avatar
Cuchulainn
Topic Author
Posts: 60256
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: Universal Approximation theorem

November 7th, 2019, 8:42 am

But I agree with you: a lot of authors out there should reconsider their relationship with their readers.

If both you and I have difficulty, what about freshly minted data scientists? arXiv is a litany of unreadable/unrefereed papers.

Documentation and writing skills have never been of much interest to CS. Most articles take no account of the fact that 95% of people don't do AI/ML.
 
User avatar
JohnLeM
Posts: 362
Joined: September 16th, 2008, 7:15 pm

Re: Universal Approximation theorem

November 7th, 2019, 9:38 am

I tried again for an hour this morning, but it is really unclear. However, one of the included references seems more detailed and clearer; I'll fall back to it to understand their work.
 
User avatar
Cuchulainn
Topic Author
Posts: 60256
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: Universal Approximation theorem

November 8th, 2019, 11:31 am

In a sense, all of the ML articles to date have missed the mark because they are based on statistics and linear algebra (no metrics, no norms, and basically lacking in hard functional analysis). Sapir-Whorf in action (the language you use shapes how you think). The current methods might be barking up the wrong tree and may not even be wrong.
An alternative is to embed probability distributions in a reproducing kernel Hilbert space (RKHS), and then the analysis takes place in the context of inner product spaces by defining feature maps and a kernel mean.
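Something like this, in spirit (a rough numpy sketch with a Gaussian kernel, my own toy illustration, not from any particular paper): each sample is represented by its empirical kernel mean [$]\hat\mu_X = \frac{1}{n}\sum_i k(x_i,\cdot)[$], and inner products between these means only ever need kernel evaluations.

[code]
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # pairwise k(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-d2 / (2.0 * sigma**2))

def mean_embedding_inner(X, Y, sigma=1.0):
    # <mu_X, mu_Y>_H = (1/(n*m)) sum_ij k(x_i, y_j):
    # inner product of the two empirical kernel means, no explicit feature map needed
    return gaussian_kernel(X, Y, sigma).mean()

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 1))   # sample from N(0, 1)
Y = rng.normal(0.5, 1.0, size=(200, 1))   # sample from N(0.5, 1)
print(mean_embedding_inner(X, X), mean_embedding_inner(X, Y))
[/code]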

A special case is the kernel trick in SVM.
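For concreteness, that special case looks like this (a toy scikit-learn sketch, assuming scikit-learn is available; the data and parameters are illustrative only): the optimiser only ever sees the Gram matrix [$]k(x_i,x_j)[$], never an explicit feature map.

[code]
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# toy data that is not linearly separable
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

# the kernel trick: the optimiser only sees k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)
clf = SVC(kernel="rbf", gamma=1.0, C=1.0).fit(X, y)
print("training accuracy:", clf.score(X, y))
[/code]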

Much of RKHS rests on the magical Riesz Representation Theorem.

https://en.wikipedia.org/wiki/Riesz_rep ... on_theorem

JohnLeM is the resident expert on RKHS. Please feel free to correct any flaws.

// BTW kernels can be characterised as being universal, characteristic, translation-invariant, strictly positive-definite. What's that?
 
User avatar
Cuchulainn
Topic Author
Posts: 60256
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: Universal Approximation theorem

November 8th, 2019, 11:50 am

Here is a nice paper on kernel methods to test whether two samples are from different distributions. If I didn't know any better, I would be inclined to say that there is a Cauchy sequence hiding in equation (2).

https://arxiv.org/abs/0805.2368
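Their statistic boils down to the squared distance between two kernel means. A bare-bones numpy sketch of the (biased) estimator, my own toy code rather than the paper's:

[code]
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2_biased(X, Y, sigma=1.0):
    # squared maximum mean discrepancy between the empirical kernel means:
    # ||mu_X - mu_Y||_H^2 = mean k(x, x') - 2 mean k(x, y) + mean k(y, y')
    return (gaussian_kernel(X, X, sigma).mean()
            - 2.0 * gaussian_kernel(X, Y, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean())

rng = np.random.default_rng(1)
same = mmd2_biased(rng.normal(size=(300, 2)), rng.normal(size=(300, 2)))
diff = mmd2_biased(rng.normal(size=(300, 2)), rng.normal(1.0, 1.0, (300, 2)))
print(same, diff)   # the second value should be noticeably larger
[/code]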
 
User avatar
katastrofa
Posts: 8376
Joined: August 16th, 2007, 5:36 am
Location: Alpha Centauri

Re: Universal Approximation theorem

November 8th, 2019, 12:31 pm

Sounds like a standard PCA to me. You know, the non-parametric unsupervised learning technique developed by ML guys.
 
User avatar
Cuchulainn
Topic Author
Posts: 60256
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: Universal Approximation theorem

November 8th, 2019, 1:38 pm

Sounds like a standard PCA to me. You know, the non-parametric unsupervised learning technique developed by ML guys.
You mean this?
http://citeseerx.ist.psu.edu/viewdoc/su ... .1.29.1366

Almost. They applied kernel methods (from 1907) to PCA (Pearson 1901). Kernel methods from Hilbert, Schmidt, Volterra etc. On giants' shoulders.
It contains kernel PCA as an instance/sub-case. There are many others.
But the main point IMO is that the methods of applied functional analysis are being used.
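For the record, kernel PCA as a sub-case is only a few lines (a numpy sketch with a Gaussian kernel, my own toy code): ordinary PCA carried out on the centred Gram matrix instead of the covariance matrix.

[code]
import numpy as np

def gaussian_gram(X, sigma=1.0):
    sq = np.sum(X**2, 1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma**2))

def kernel_pca(X, n_components=2, sigma=1.0):
    n = X.shape[0]
    K = gaussian_gram(X, sigma)
    # centre the Gram matrix in feature space: Kc = (I - 11^T/n) K (I - 11^T/n)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    vals, vecs = np.linalg.eigh(Kc)            # eigenvalues in ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]     # largest first
    # projections onto the leading principal directions in feature space
    return vecs[:, :n_components] * np.sqrt(np.maximum(vals[:n_components], 0.0))

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
print(kernel_pca(X).shape)   # (100, 2)
[/code]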

It is worth investigating, and you seem to be suggesting that it is well known and universal in ML, which it seems it is not. I could be wrong.

Q: people use the term ML instead of Statistical Learning. Is it sexier?
Why not call it multivariate statistics and be done with it (parsimony >> verbosity, but it may not sell well)?
 
User avatar
ISayMoo
Posts: 2143
Joined: September 30th, 2015, 8:30 pm

Re: Universal Approximation theorem

November 8th, 2019, 9:03 pm

For me it's perfectly clear what the article is saying.
 
User avatar
Cuchulainn
Topic Author
Posts: 60256
Joined: July 16th, 2004, 7:38 am
Location: Amsterdam
Contact:

Re: Universal Approximation theorem

November 8th, 2019, 9:46 pm

For me it's perfectly clear what the article is saying.
I would hope so.
Unfortunately, that's irrelevant to the other stakeholders.

Edit: Maybe the threshold can be lowered by more tutorial-style articles, e.g. "RKHS for the impatient".
 
User avatar
JohnLeM
Posts: 362
Joined: September 16th, 2008, 7:15 pm

Re: Universal Approximation theorem

November 12th, 2019, 10:22 am

For me it's perfectly clear what the article is saying.
After reading one of their included references, it is now clearer. I was unable to understand their log-entropy functional (3) without that reading. Frankly, they could have developed it a little bit more to make it understandable, or at least added a "see [] for details".
 
User avatar
JohnLeM
Posts: 362
Joined: September 16th, 2008, 7:15 pm

Re: Universal Approximation theorem

November 12th, 2019, 11:44 am

// BTW kernels can be characterised as being universal, characteristic, translation-invariant, strictly positive-definite. What's that?
Strictly positive-definite kernels on [$]\Omega[$] are functions [$]k(x,y)[$] such that [$](k(x^i,x^j))_{i,j \le N}[$] is a symmetric positive-definite matrix for any set of distinct points [$]x^i \in \Omega[$].
Translation-invariant kernels are kernels of the form [$]k(x,y) = \varphi(x-y)[$].
I found some definitions of universal kernels in https://arxiv.org/pdf/1003.0887.pdf. To me it might be a slightly obsolete definition: it tells you that the kernel can reproduce any continuous function in some Banach space.
I found the definition of characteristic kernels in this reference: https://www.ism.ac.jp/~fukumizu/papers/fukumizu_etal_nips2007_extended.pdf. To me it is a slightly strange definition. As far as I understood: consider a kernel [$]k(x,y)[$] generating a space of functions [$]H_k[$]. It is said to be characteristic if, for any two probability measures [$]\mu, \nu[$], having [$]\int f(x) d\mu = \int f(x) d\nu[$] for all [$]f \in H_k[$] implies [$]\mu = \nu[$]. I would say that both definitions (characteristic and universal kernels) are almost surely equivalent ;)
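A quick numerical sanity check of the first definition (my own toy snippet, Gaussian kernel): the Gram matrix on a set of distinct points has strictly positive eigenvalues.

[code]
import numpy as np

def gaussian_gram(X, sigma=1.0):
    sq = np.sum(X**2, 1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma**2))

rng = np.random.default_rng(3)
X = rng.uniform(-1.0, 1.0, size=(10, 2))    # 10 distinct points in Omega = [-1, 1]^2
eigvals = np.linalg.eigvalsh(gaussian_gram(X, sigma=0.5))
print(eigvals.min())   # strictly positive, so the Gram matrix is s.p.d.
[/code]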