It’s a strange way of trying to limit false positives that has taken on nearly sacred value.
And its intellectual history is a mess. Like: All research is built around this 5% level... which RA Fisher just plucked from thin air.
That’s it!
It doesn’t say anything about how often you make false negative errors (when there IS an effect, but you say “nah”).
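(A quick simulation sketch of that asymmetry, with made-up effect size, sample size, and sim count, not numbers from any real paper: alpha pins the false positive rate at 5%, and the false negative rate is just whatever your design leaves you with.)

```python
# Sketch only: simulate how a fixed alpha = 0.05 caps false positives
# but says nothing about false negatives. Effect size, n, and sim count
# are made-up illustration values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, n_sims = 0.05, 30, 5000

def reject_rate(true_effect):
    """Share of simulated two-sample t-tests that reject the no-difference null."""
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n)
        treated = rng.normal(true_effect, 1.0, n)
        _, p = stats.ttest_ind(treated, control)
        rejections += p < alpha
    return rejections / n_sims

print("false positive rate (true effect = 0):  ", reject_rate(0.0))      # ~0.05, pinned by alpha
print("false negative rate (true effect = 0.3):", 1 - reject_rate(0.3))  # large; alpha is silent about it
```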
And that’s before you get to the flagrant corruption that shows how little we get it:
If a paper has “* means 10%, ** means 5%, *** means 1%” (and they all do!), that. Is. Wrong.
You set the level in advance (it’s your career-long error rate) and hold to it. p = 0.000001 and p = 0.049 are IDENTICAL if you set your level at 5%.
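(Minimal sketch of what a pre-set level actually does to those two p-values:)

```python
# Sketch only: with the level fixed in advance, the output is a binary decision.
alpha = 0.05  # chosen BEFORE seeing the data, and held to

def decision(p):
    return "reject the null" if p < alpha else "fail to reject the null"

print(decision(0.000001))  # "reject the null"
print(decision(0.049))     # "reject the null" -- same decision, no extra stars earned
```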
Literally flipping Popper on his head.
“I’m testing the impact of police on crime. My null is no effect.”
NO IT ISN’T.
So now you’re seeking confirmation, not falsification. We dishonor Popper’s idea COMPLETELY.
Here’s the tell: rejecting it DOESN’T PROVIDE ANY INFORMATION.
All you’ve done is reject the null (that you didn’t believe).
But what comes next in every paper? “My estimate of 0.4 means....”
No. Nope. No.
But we NEVER test the null we actually care about (e.g., “the effect is 0.4”), bc a no-effect null (a null we rarely believe) is the default in EVERY stats program.
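(For the record, testing a null you might actually believe is one line of code. A minimal sketch with simulated data and hypothetical variable names, riffing on the police/crime example above, using statsmodels:)

```python
# Sketch only: simulated data, hypothetical variable names. The software default
# tests H0: coefficient = 0; nothing stops you from testing the null you care about.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"police": rng.normal(size=500)})
df["crime"] = 0.4 * df["police"] + rng.normal(size=500)  # true effect set to 0.4 for illustration

res = smf.ols("crime ~ police", data=df).fit()
print(res.summary().tables[1])     # the default output: stars and p-values against H0: police = 0
print(res.t_test("police = 0.4"))  # the test you'd run if "the effect is 0.4" were the actual claim
```

Same regression either way; only the second test speaks to the “0.4” claim.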
It has its uses, in certain places, but it def does NOT deserve the gate-keeping power we give it.