Significance versus hypothesis testing

Datamethods Discussion Forum [Unofficial] February 14, 2026

Well put, except for “why not just agree with Fisher”. Though I believe the Fisherian approach is better, I still think it’s awful. Null hypotheses are artificial straw-man constructs that do not serve most research goals well. That’s because answering questions suits scientific endeavor better than testing hypotheses, except when an existence hypothesis is the central issue, as in particle physics or ESP research. A common and very relevant type of question is “how much risk reduction will patients get if they take a statin rather than placebo?”. This leads to estimation and to Bayesian evidence quantification about the veracity of every possible level of risk reduction.
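
To make “evidence for every possible level of risk reduction” concrete, here is a minimal sketch rather than a recommended analysis. The trial counts are invented for illustration, and the flat Beta(1, 1) priors and the choice of absolute risk reduction as the estimand are assumptions of the sketch, not something taken from the post. It draws from the posterior of each arm's event risk and reports the posterior probability that the risk reduction exceeds each of several thresholds.

```python
import random

# Hypothetical trial counts, invented purely for illustration.
PLACEBO_EVENTS, PLACEBO_N = 100, 1000
STATIN_EVENTS, STATIN_N = 75, 1000


def posterior_arr_draws(n_draws=20000, seed=1):
    """Draw from the posterior of absolute risk reduction (placebo risk minus statin risk).

    Assumes independent flat Beta(1, 1) priors on each arm's event risk,
    so each arm's posterior is Beta(events + 1, non_events + 1).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        p_placebo = rng.betavariate(PLACEBO_EVENTS + 1, PLACEBO_N - PLACEBO_EVENTS + 1)
        p_statin = rng.betavariate(STATIN_EVENTS + 1, STATIN_N - STATIN_EVENTS + 1)
        draws.append(p_placebo - p_statin)
    return draws


def prob_arr_exceeds(draws, threshold):
    """Posterior probability that the risk reduction is larger than `threshold`."""
    return sum(d > threshold for d in draws) / len(draws)


arr_draws = posterior_arr_draws()
thresholds = [0.0, 0.01, 0.02, 0.03, 0.04]
arr_probs = [prob_arr_exceeds(arr_draws, t) for t in thresholds]
for t, p in zip(thresholds, arr_probs):
    print(f"P(risk reduction > {t:.2f} | data) ~ {p:.3f}")
```

The output is a whole curve of probabilities, one per candidate level of benefit, instead of a single verdict about a zero-effect null.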

The biggest defects in Fisher’s approach are how poorly p-values deal with questions of real interest and how they must take investigator intentions into account, because p-values are probabilities about data and not about unknowns. For example, two investigators can analyze the same dataset and get different results when one analyzed the data only at the planned study end, while the other also took an interim look that turned out to be inconsequential.
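
A small simulation sketch shows why the interim look matters to a p-value even when nothing came of it. The setup is an assumption made for illustration: standard normal observations under H0, a two-sided z-test, one interim look at 50 of 100 observations, and an unadjusted 0.05 threshold at each look. Under H0 the chance of crossing 0.05 at some look is higher than the chance of crossing it at the final analysis alone, so the second investigator's sampling distribution differs and the p-value has to be adjusted for the look.

```python
import random
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate_looks(n_sims=20000, n_interim=50, n_final=100, alpha=0.05, seed=2):
    """Estimate rejection rates under H0 for one final look vs. interim plus final look."""
    rng = random.Random(seed)
    final_only = 0
    either_look = 0
    for _ in range(n_sims):
        # Under H0 each observation is N(0, 1), so a sum of k observations is N(0, k).
        s1 = rng.gauss(0.0, n_interim ** 0.5)
        s2 = rng.gauss(0.0, (n_final - n_interim) ** 0.5)
        z_interim = s1 / n_interim ** 0.5
        z_final = (s1 + s2) / n_final ** 0.5
        reject_interim = two_sided_p(z_interim) < alpha
        reject_final = two_sided_p(z_final) < alpha
        final_only += reject_final
        either_look += reject_final or reject_interim
    return final_only / n_sims, either_look / n_sims


rate_one_look, rate_two_looks = simulate_looks()
print(f"P(nominal p < 0.05 at the final look only | H0)  ~ {rate_one_look:.3f}")
print(f"P(nominal p < 0.05 at either look | H0)          ~ {rate_two_looks:.3f}")
```

The data are identical in both scenarios; only the intended analysis plan changes the probability statement.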

By computing P(data as or more extreme than the observed data | H0), Fisher thought that the exercise was fully objective and scientific. For a moment he pushed a more relevant quantity, P(getting the observed data | H0), but soon realized that all such probabilities are tiny or zero, and hence he had to pool the question of interest with other possibilities to have any chance of getting a large p-value. The best contribution from Fisher at this point was his statement that a large p-value should only be interpreted as “get more data” and should not be used as evidence for H0.
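
The gap between the two quantities is easy to see in a toy case. The sketch below assumes a fair-coin null hypothesis and an invented outcome of 60 heads in 100 flips; it compares the probability of exactly the observed count with the one-sided tail probability, then shows that even the single most likely outcome under H0 becomes improbable as n grows.

```python
from math import comb


def prob_exact(k, n):
    """P(X = k | H0) for X ~ Binomial(n, 0.5), computed with exact integers."""
    return comb(n, k) / 2 ** n


def prob_at_least(k, n):
    """One-sided tail P(X >= k | H0) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


# Hypothetical observation: 60 heads in 100 flips.
point_prob = prob_exact(60, 100)
tail_prob = prob_at_least(60, 100)
print(f"P(exactly 60 heads | H0)  = {point_prob:.4f}")
print(f"P(60 or more heads | H0)  = {tail_prob:.4f}")

# Probability of the most likely outcome (n/2 heads) shrinks as n grows.
mode_probs = {n: prob_exact(n // 2, n) for n in (100, 1000, 10000)}
for n, p in mode_probs.items():
    print(f"n = {n:>5}: P(exactly n/2 heads | H0) = {p:.5f}")
```

Since every individual outcome is improbable under H0 once n is large, the point probability cannot separate surprising data from unsurprising data, which is why the tail area gets pooled in.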
