- Cuchulainn
**Posts:** 62167 **Location:** Amsterdam

> A loss surface of a neural network can approximate anything. Including cows.

> This paper seems quite interesting, but I spent two hours unsuccessfully trying to understand it. But nice cows indeed!

Is the experience cathartic?

I am not sure if it is cathartic or not. I will try again to see if I can get out of my body. I'll let you know.

But I agree with you: a lot of authors out there should reconsider their connection to their readers.

- Cuchulainn

If both you and I have difficulty, what about freshly minted data scientists? arXiv is a litany of unreadable/unrefereed papers.

Documentation and writing skills have never had the interest of CS. Most articles take no account of the fact that 95% of people don't do AI/ML.

I tried again for an hour this morning, but it is really unclear. However, one of the included references seems more detailed and clearer; I'll try to fall back on it to understand their work.

- Cuchulainn

In a sense, all of the ML articles to date have missed the mark because they are based on statistics and linear algebra (no metrics, no norms, and basically lacking in hard functional analysis). Sapir-Whorf in action (the language you use shapes how you think). The current methods might be barking up the wrong tree and may not even be wrong.

An alternative is to *embed* probability distributions in a reproducing kernel Hilbert space (RKHS); the analysis then takes place in the context of inner product spaces, by defining feature maps and a kernel mean.

A special case is the kernel trick in SVM.

Much of RKHS rests on the magical Riesz Representation Theorem.

https://en.wikipedia.org/wiki/Riesz_rep ... on_theorem
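
For anyone who wants the short version of why Riesz does the heavy lifting, here is a sketch in standard textbook notation. The symbols `\mathcal{H}`, `k` and `\mu_P` are illustrative labels, not taken from any paper in this thread, and the integrability condition in the comment is an assumption.

```latex
% Assumption: \mathcal{H} is a Hilbert space of functions on \Omega in which
% point evaluation f \mapsto f(x) is a bounded linear functional.
% Riesz then gives a unique representer k(\cdot, x) \in \mathcal{H} with
f(x) = \langle f, k(\cdot, x) \rangle_{\mathcal{H}}
\quad \text{for all } f \in \mathcal{H}, \; x \in \Omega .

% Kernel mean of a probability measure P
% (assuming \mathbb{E}_{X \sim P} \sqrt{k(X, X)} < \infty):
\mu_P = \mathbb{E}_{X \sim P}\,[\, k(\cdot, X) \,] \in \mathcal{H},
\qquad
\mathbb{E}_{X \sim P}[f(X)] = \langle f, \mu_P \rangle_{\mathcal{H}}
\quad \text{for all } f \in \mathcal{H} .
```

So expectations of RKHS functions become inner products with a single element [$]\mu_P[$], which is what "embedding a distribution" means here.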

JohnLeM is the resident expert on RKHS. Please feel free to correct any flaws.

// BTW kernels can be characterised as being universal, characteristic, translation-invariant, strictly positive-definite. What's that?

- Cuchulainn

Here is a nice paper on kernel methods to test if two samples are from different distributions. If I didn't know any better I would be inclined to say that there be a Cauchy sequence hiding in equation (2).

https://arxiv.org/abs/0805.2368
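
A statistic commonly used for this kind of kernel two-sample test is the maximum mean discrepancy (MMD), the RKHS distance between the two kernel means. Below is a minimal NumPy sketch of its unbiased squared estimate. The Gaussian kernel, the bandwidth `sigma=1.0`, the sample sizes and the function names are all assumptions made for illustration; whether this matches equation (2) of the paper term for term is not checked here.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gram matrix k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 sigma^2)).

    a has shape (m, d) and b has shape (n, d).
    """
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

def mmd2_unbiased(x, y, sigma=1.0):
    """Unbiased estimate of squared MMD between samples x (m, d) and y (n, d)."""
    m, n = len(x), len(y)
    k_xx = gaussian_kernel(x, x, sigma)
    k_yy = gaussian_kernel(y, y, sigma)
    k_xy = gaussian_kernel(x, y, sigma)
    # Drop the diagonal terms k(x_i, x_i) and k(y_j, y_j) to remove the bias.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * k_xy.mean()

rng = np.random.default_rng(0)
# Same distribution: the estimate should hover around zero.
same = mmd2_unbiased(rng.normal(size=(500, 1)), rng.normal(size=(500, 1)))
# Shifted mean: the estimate should be clearly positive.
diff = mmd2_unbiased(rng.normal(size=(500, 1)), rng.normal(loc=1.0, size=(500, 1)))
print(same, diff)
```

The unbiased estimate can come out slightly negative when the two samples share a distribution; that is expected and not a bug. A real test would also need a threshold, for example from a permutation procedure, which this sketch leaves out.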

- katastrofa
**Posts:** 9224 **Location:** Alpha Centauri

Sounds like a standard PCA to me. You know, the non-parametric unsupervised learning technique developed by ML guys.

- Cuchulainn

You mean this?

http://citeseerx.ist.psu.edu/viewdoc/su ... .1.29.1366

Almost. They applied kernel methods (from 1907) to PCA (Pearson 1901). Kernel methods come from Hilbert, Schmidt, Volterra etc. On giants' shoulders.

It contains kernel PCA as an instance/sub-case. There are many others.
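
To make the kernel PCA instance concrete, here is a minimal NumPy sketch. It assumes a Gaussian kernel with bandwidth `sigma`; the function name `kernel_pca` and its parameters are illustrative and not taken from the linked paper.

```python
import numpy as np

def kernel_pca(x, n_components=2, sigma=1.0):
    """Project the rows of x (n, d) onto the leading kernel principal components."""
    sq_norms = np.sum(x**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    gram = np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

    # Centre the feature map implicitly: K_c = H K H with H = I - (1/n) 1 1^T.
    n = len(x)
    h = np.eye(n) - np.ones((n, n)) / n
    gram_c = h @ gram @ h

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram_c)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Score of training point i on component j is sqrt(lambda_j) * v_j[i].
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

rng = np.random.default_rng(0)
points = rng.normal(size=(50, 3))
scores = kernel_pca(points, n_components=2)
print(scores.shape)
```

With the linear kernel in place of the Gaussian one, the same steps reduce to ordinary PCA scores, which is the sense in which PCA is a special case.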

But the

It is worth investigating. You seem to be suggesting that it is well known and universal in ML, which, it seems, it is not. I could be wrong.

Q: people use the term ML instead of Statistical Learning. Is it sexier?

Why not call it multivariate statistics and be done with it (parsimony >> verbosity but it may not sell well).

For me it's perfectly clear what the article is saying.

- Cuchulainn

I would hope so.

Unfortunately, that's irrelevant to the other stakeholders.

Edit: Maybe the threshold can be lowered by more tutorial-style articles, e.g. "RKHS for the impatient".

After reading one of their included references, it is now clearer. I was unable to understand their log-entropy functional (3) without this reading. Frankly, they could have developed it a little more to make it understandable, or at least added a "see [] for details".

> // BTW kernels can be characterised as being universal, characteristic, translation-invariant, strictly positive-definite. What's that?

Strictly positive kernels on [$]\Omega[$] are functions [$]k(x,y)[$] such that the matrix [$]\big(k(x^i,x^j)\big)_{i,j \le N}[$] is symmetric positive definite for any set of distinct points [$]x^i \in \Omega[$].

Translation-invariant kernels are kernels of the form [$]k(x,y) = \varphi(x-y)[$].
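
A quick numerical illustration of the two definitions above, as a sketch only. The point set, the choice [$]\Omega = \mathbb{R}[$] and the helper names are assumptions, and checking one set of points can illustrate strict positivity but never prove it.

```python
import numpy as np

def gram_matrix(phi, points):
    """Gram matrix of the translation-invariant kernel k(x, y) = phi(x - y)."""
    diffs = points[:, None] - points[None, :]
    return phi(diffs)

def gaussian_phi(t):
    """phi for the Gaussian kernel with unit bandwidth."""
    return np.exp(-t**2 / 2.0)

points = np.array([-1.0, 0.0, 0.5, 2.0])  # distinct points in Omega = R

# Gaussian kernel: the smallest eigenvalue of the Gram matrix is strictly positive.
gram = gram_matrix(gaussian_phi, points)
min_eig = np.linalg.eigvalsh(gram).min()

# Contrast: k(x, y) = cos(x - y) is positive semi-definite only. Its Gram
# matrix has rank 2, so with four points the smallest eigenvalue is
# zero up to rounding.
cos_min = np.linalg.eigvalsh(gram_matrix(np.cos, points)).min()

print(min_eig, cos_min)
```

The cosine case shows why the word "strictly" matters: the kernel is still translation-invariant and positive, but its Gram matrices become singular as soon as there are more than two points.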

I found some definitions of universal kernels in https://arxiv.org/pdf/1003.0887.pdf. To me it looks like a slightly obsolete definition, telling you that a kernel can reproduce any continuous function in some Banach space.

I found the definition of characteristic kernels in this reference: https://www.ism.ac.jp/~fukumizu/papers/fukumizu_etal_nips2007_extended.pdf. To me it is a slightly strange definition. As far as I understood: consider a kernel [$]k(x,y)[$], generating a space of functions [$]H_k[$]. It is said to be characteristic if, for any two probability measures [$]\mu, \nu[$], the equality [$]\int f(x) d\mu = \int f(x) d\nu[$] for all [$]f \in H_k[$] implies [$]\mu = \nu[$]. I would say that both definitions (characteristic and universal kernels) are almost surely equivalent.
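
A toy example may make the characteristic property less strange. The sketch below compares two different empirical measures with the same mean. With the linear kernel [$]k(x,y) = xy[$], integrating against functions in [$]H_k[$] only sees the mean, so the two kernel means coincide and the kernel is not characteristic. The Gaussian kernel, which is known to be characteristic on [$]\mathbb{R}^d[$], separates them. The sample points and function names are assumptions chosen for illustration.

```python
import numpy as np

def mmd2_biased(x, y, kernel):
    """Squared RKHS distance between the empirical kernel means of x and y."""
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

def linear_kernel(a, b):
    """k(x, y) = x * y on the real line."""
    return a[:, None] * b[None, :]

def gaussian_kernel(a, b):
    """k(x, y) = exp(-(x - y)^2 / 2) on the real line."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / 2.0)

# Two different empirical measures on R with the same mean (zero).
x = np.array([-1.0, 1.0])
y = np.array([-2.0, 2.0])

linear_gap = mmd2_biased(x, y, linear_kernel)      # vanishes: means agree
gaussian_gap = mmd2_biased(x, y, gaussian_kernel)  # strictly positive
print(linear_gap, gaussian_gap)
```

In other words, a characteristic kernel is one for which this gap is zero only when the two measures are actually equal.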
