Here come S-values! https://arxiv.org/abs/1906.07801

It looks interesting - I will give it a read and comment further.


- katastrofa
**Posts:**8039**Joined:****Location:**Alpha Centauri

If one considers Tukey's celebrated works of the 1960s recent... I don't want to be mean, but if they write in the "Motivation" paragraph about how great their method is, where's the "Our great method" paragraph, which must contain the accidentally switched motivation?

Motivation: p-values and standard null hypothesis testing have come under intense scrutiny in recent years (Wasserstein et al., 2016, Benjamin et al., 2018); s-values and safe tests offer several advantages. Most importantly, in contrast to p-values, s-values behave excellently under optional continuation, the highly common practice in which the decision to perform additional tests partly depends on the outcome of previous tests. A second reason is their enhanced interpretability, and a third is their flexibility: s-values based on Fisherian, Neyman-Pearsonian and Bayes-Jeffreys’ testing philosophies all can be accommodated for. These three types of s-values can be freely combined, while preserving Type I error guarantees; at the same time, they keep a clear (monetary) interpretation even if one dismisses ‘significance’ altogether, as recently advocated by Amrhein et al. (2019).

- Cuchulainn
**Posts:**59685**Joined:****Location:**Amsterdam-

BTW which of the 300 authors do we write to if we have queries?

The middle one. I'm almost done with my "peer review" of the manuscript. I have mixed feelings.

Finished. My feelings about this work have improved. It describes a method of combining the results of multiple tests of the same null hypothesis which a) supports "optional continuation" (deciding whether to carry out another test after seeing the results of the previous ones) while controlling the false positive rate, and b) preserves more statistical power than the Bonferroni correction (although the S-values themselves, in order to be combinable under optional continuation, have to be more conservative than "standard" p-values). This is definitely a useful thing, and the results in the paper seem rigorous. They also show some simple examples. In many cases, the S-value you need is just the Bayes factor between the null and some particular alternative hypothesis. The downside of the method is that it's quite complex to use, and finding the appropriate S-value can be difficult. I wonder how robust this method is to model mis-specification (e.g. you assume the data are Gaussian but they aren't - the standard t-test is quite robust to deviations from normality).

Overall it's a piece of good work attacking an important problem. Read it if you're interested in rigorous statistics. I think the manuscript is a draft of a PhD thesis, which would explain its length and sometimes less formal narrative.
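
To make the "optional continuation" idea above concrete, here is a minimal simulation sketch. It is not the paper's construction: it assumes the simplest case, where the null is N(0, 1), the alternative is a fixed N(mu_alt, 1), and the S-value of each batch is the plain likelihood ratio. S-values from successive batches are multiplied, and H0 is rejected once the product reaches 1/alpha. The names `mu_alt`, `batch_size` and `max_batches`, and their values, are illustrative assumptions only.

```python
import math
import random


def likelihood_ratio(xs, mu_alt):
    """S-value of one batch: likelihood ratio of N(mu_alt, 1) against N(0, 1)."""
    # For unit-variance Gaussians, log[p_alt(x) / p_0(x)] = mu_alt * x - mu_alt**2 / 2.
    log_lr = sum(mu_alt * x - 0.5 * mu_alt ** 2 for x in xs)
    return math.exp(log_lr)


def run_study(true_mu, rng, alpha=0.05, mu_alt=0.5, batch_size=20, max_batches=10):
    """Collect batches one at a time, deciding after each whether to continue.

    The S-values of the batches are multiplied; H0 is rejected as soon as
    the running product reaches 1 / alpha.
    """
    product = 1.0
    for _ in range(max_batches):
        batch = [rng.gauss(true_mu, 1.0) for _ in range(batch_size)]
        product *= likelihood_ratio(batch, mu_alt)
        if product >= 1.0 / alpha:
            return True  # reject H0 and stop collecting data
    return False  # never reached the threshold


def rejection_rate(true_mu, n_studies=2000, seed=0):
    """Fraction of simulated studies that end up rejecting H0."""
    rng = random.Random(seed)
    return sum(run_study(true_mu, rng) for _ in range(n_studies)) / n_studies


if __name__ == "__main__":
    print("rejection rate when H0 is true :", rejection_rate(0.0))
    print("rejection rate when H0 is false:", rejection_rate(0.5))
```

Calling `rejection_rate(0.0)` estimates the Type I error under this data-dependent stopping rule, and `rejection_rate(0.5)` estimates the power when the assumed alternative happens to be the truth. The same peek-and-continue procedure applied to ordinary p-values with a fixed 0.05 threshold would not keep the false positive rate at 0.05.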


- katastrofa
**Posts:**8039**Joined:****Location:**Alpha Centauri

I have two remarks:

Obvious: the proposed method (the concept of "optional continuation") applies only to systems in equilibrium. In practice, everything changes in time because of drift, mutations, etc. (even changes change!). Ergo, you can rarely go "let's make one more experiment". There are also all sorts of practical errors which accumulate: from measurement errors to the researchers getting "overfitted to themselves" (vide your last paper). Well, science is by design one big overfitting contest...

Technical: why not simply use the existing false discovery rate method? The Bayesian version which suits here is common in some disciplines (in which the chance of a discovery is very low - p-values weren't made for such cases).


- katastrofa
**Posts:**8039**Joined:****Location:**Alpha Centauri

Am I right: their contribution is that they calculate the s-value distribution for the whole family of priors, while - in a standard Bayesian approach - optimising the test for power and controlling the type I error rate? It may be too complex to use in real, fuzzy practice (that's why researchers often resort to simplistic tests like p-values).

That's more or less how I understand this work.

- katastrofa
**Posts:**8039**Joined:****Location:**Alpha Centauri

If you want more: q-value

q-values look more Bayesian than p-values. Don't you need a prior probability that H0 is true to calculate a q-value?

- katastrofa
**Posts:**8039**Joined:****Location:**Alpha Centauri

Yes. They are often tested for different assumed probabilities of H0 and compared against p-values. This tells the researcher the expected false discovery rate for a discovery that is significant according to the p-value. All those tests are tailored to specific problems (q-values are used in genome analysis).
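
A rough sketch of that comparison, under stated assumptions: it is not a q-value estimator, just the textbook Bayes-rule calculation of the expected false discovery rate among results that pass a p-value threshold, for several assumed prior probabilities of H0. The threshold `alpha=0.05` and the `power=0.8` figure are illustrative assumptions, not values from this thread.

```python
def expected_fdr(prior_h0, alpha=0.05, power=0.8):
    """Expected share of 'significant' results for which H0 is actually true.

    prior_h0: assumed prior probability that H0 is true
    alpha:    significance threshold applied to the p-value
    power:    assumed probability of a significant result when H0 is false
    """
    false_positives = prior_h0 * alpha          # H0 true, yet p < alpha
    true_positives = (1.0 - prior_h0) * power   # H0 false and detected
    return false_positives / (false_positives + true_positives)


if __name__ == "__main__":
    for prior_h0 in (0.5, 0.9, 0.99):
        fdr = expected_fdr(prior_h0)
        print(f"P(H0) = {prior_h0:.2f} -> expected FDR = {fdr:.3f}")
```

The point it illustrates is the one made above about low-discovery-rate fields: the rarer a true effect is assumed to be, the larger the share of p < 0.05 results that are false discoveries.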
