This is a nice reply, JohnLeM. I have not seen any head-on comparison of KS versus RKHS methods for 1-sample, 2-sample, independence tests, etc. My cursory reading of all this is that the latter approach is more accurate (uniform convergence bounds, distribution-independent), robust, mathematically mature (Functional Analysis) and extendable.
The DS statistic based on the empirical distribution function [$]F_n[$] comes across as rough-and-ready from a mathematical and numerical point of view (I am not a specialist on KS). The kernel method, on the other hand (e.g. Gretton et al., equation (4)), allows us to define the MMD statistic ("MMD >> DS"). Indeed, it seems to work with low sample sizes. MMD(p,q) == 0 iff p == q (at least when the kernel is characteristic, e.g. Gaussian). And it works on structured data.
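To make the comparison concrete, here is a minimal sketch that computes both quantities on the same two samples: the two-sample KS distance built from the empirical distribution functions, and a biased (V-statistic) estimate of the squared MMD with a Gaussian kernel. The function names, the bandwidth `sigma=1.0`, the sample sizes and the test distributions are my own assumptions for illustration. I have not checked that this estimator is term-for-term the one in the equation cited above, so take it as a standard textbook form rather than the paper's exact definition.

```python
import numpy as np


def ks_statistic(x, y):
    """Two-sample KS distance: max |F_n(t) - G_m(t)| over the pooled sample points."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    pooled = np.concatenate([x, y])
    # Empirical CDFs evaluated at every pooled point (fraction of values <= t).
    f = np.searchsorted(x, pooled, side="right") / x.size
    g = np.searchsorted(y, pooled, side="right") / y.size
    return float(np.max(np.abs(f - g)))


def gaussian_kernel(a, b, sigma=1.0):
    """Gram matrix k(a_i, b_j) = exp(-|a_i - b_j|^2 / (2 sigma^2))."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat 1-D input as n points in dimension 1
    if b.ndim == 1:
        b = b[:, None]
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma**2))


def mmd2_biased(x, y, sigma=1.0):
    """Biased estimate of MMD^2: mean k(x,x') + mean k(y,y') - 2 mean k(x,y)."""
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    return float(kxx.mean() + kyy.mean() - 2.0 * kxy.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)       # fixed seed, illustrative only
    p1 = rng.normal(0.0, 1.0, size=200)  # sample from p
    p2 = rng.normal(0.0, 1.0, size=200)  # second sample from p
    q = rng.normal(1.0, 1.0, size=200)   # sample from q (mean shifted by 1)
    print("KS   same / different:", ks_statistic(p1, p2), ks_statistic(p1, q))
    print("MMD2 same / different:", mmd2_biased(p1, p2), mmd2_biased(p1, q))
```

Note that `gaussian_kernel` accepts points in any dimension, whereas `ks_statistic` relies on sorting and so only makes sense for scalar data. This says nothing about which test has more power; that would need a proper study with thresholds (e.g. permutation) and many repetitions.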
A tongue-in-cheek remark: "statistics don't do metrics, norms, Hilbert space, Cauchy-like sequences". I might be very wrong.
JohnLeM has done kernel methods for PDEs.