Ninety years ago, Ronald Fisher changed science forever. With his book, Statistical Methods for Research Workers, the eminent English statistician popularized P values for measuring the statistical significance of a scientific result, noting, almost as an afterthought, "Personally, the writer prefers to set a low standard of significance at the 5 percent point..." P = 0.05 was born, and, as Robert Matthews complained years later, scientists were bestowed with a "mathematical machine for turning baloney into breakthroughs, and flukes into funding."
Researchers in a great many disciplines now treat Fisher's personal preference as the threshold for significance. If a single finding attains a P value of 0.05 or lower, it's published as a noteworthy discovery and a scientific "truth." But that is not what Fisher intended. "A scientific fact should be regarded as experimentally established only if a properly designed experiment rarely fails to give this level of significance," he wrote.
"In other words, the operational meaning of a P value less than .05 was merely that one should repeat the experiment," Johns Hopkins biostatistician Steven Goodman has explained. "If subsequent studies also yielded significant P values, one could conclude that the observed effects were unlikely to be the result of chance alone. So 'significance' is merely that: worthy of attention in the form of meriting more experimentation, but not proof in itself."
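Goodman's point is easy to see in a simulation. The sketch below (illustrative only, not from the original article) runs thousands of hypothetical studies in a world where the true effect is exactly zero, testing each with an ordinary two-sided z-test: roughly one study in twenty clears P < 0.05 by luck alone, while the odds of two independent replications both doing so collapse to roughly one in four hundred.

```python
import math
import random

random.seed(42)

def z_test_p(sample):
    """Two-sided p-value for H0: true mean = 0, with known sd = 1 (z-test)."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    # standard normal CDF via the error function
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - phi)

N_STUDIES, N_OBS = 20_000, 30

# Every "study" draws its data from N(0, 1): there is no real effect to find.
significant = sum(
    z_test_p([random.gauss(0, 1) for _ in range(N_OBS)]) < 0.05
    for _ in range(N_STUDIES)
)

rate = significant / N_STUDIES
print(f"single studies reaching P < 0.05 by chance: {rate:.3f}")
print(f"two independent replications both doing so:  {rate**2:.4f}")
```

The first number hovers near 0.05, which is the false-positive rate Fisher's threshold tolerates for any one experiment; squaring it for an independent replication is what makes "rarely fails to give this level of significance" a meaningful standard.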