1/ A student just asked me why, if the p-value for a study is 0.04, we can't say the study has a 4% chance of being a false positive. First off, we definitely can't, even under idealized conditions - here's a brief thread as to why.

2/ (Apologies to the great John Ioannidis, who does this better than I ever could, and to @VinayPrasadMD, whose "Tweetorials" are an amazing epiphenomenon in and of themselves.)

3/ Imagine a world of scientific hypotheses - all those hypotheses out there, floating in the ether. "Atorvastatin reduces nose bleeds" is out there. So is "marijuana use increases the chance of graduating college". Some of these are true hypotheses, some are not.

4/ Let's say there are 100,000 of these hypotheses, and 50% of them are true (they correctly state how things actually work). Now we need to study them...

5/ 50,000 hypotheses are true, 50,000 are false, but that's not what our studies will show.

6/ When we test the true hypotheses, assuming our studies have a standard 80% power, we will only capture 40,000 as properly positive. We miss 10,000 true hypotheses. Too bad.

7/ We'll correctly identify that most of the false hypotheses are false, but we'll incorrectly find that 5% of them (2,500 of the 50,000) are true (thanks to our p-value threshold of 0.05).

8/ So, we have 42,500 "positive" studies, of which 40,000 are "true positives" - that gives a positive predictive value of 94%, meaning about 6% of positive studies are wrong. Not bad! Not the 5% chance of being wrong we might intuit from the p-value threshold, but pretty close!
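
The arithmetic in this world can be sketched in a few lines of Python. The function name `expected_counts` and its parameters are made up for illustration; the 100,000 hypotheses, 80% power, and 0.05 threshold come from the thread, and the sketch assumes every hypothesis is tested exactly once with no bias of any kind.

```python
def expected_counts(n_hypotheses, prior_true, power=0.80, alpha=0.05):
    """Expected outcomes when each hypothesis is tested once (idealized)."""
    n_true = n_hypotheses * prior_true    # hypotheses that are actually true
    n_false = n_hypotheses - n_true       # hypotheses that are actually false
    true_pos = n_true * power             # true hypotheses we detect
    false_pos = n_false * alpha           # false hypotheses that cross p < alpha
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv


tp, fp, ppv = expected_counts(100_000, prior_true=0.50)
print(f"true positives:  {tp:,.0f}")
print(f"false positives: {fp:,.0f}")
print(f"positive predictive value: {ppv:.1%}")
```

Changing `prior_true` is all it takes to move to a world where fewer hypotheses are true.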

9/ But what if fewer than 50% of the hypotheses in the ether are true? What if only, like, 10% are true...

10/ Again, with our 80% power, we'll find 8,000 of the 10,000 true hypotheses to be true - missing a bunch, but such is life.

11/ But of the 90,000 false hypotheses, we'll incorrectly find 4,500 that we deem as "true" (thanks to our p-value threshold of 0.05).

12/ Now we have 12,500 positive studies, but only 8,000 of them are true positives. Meaning in this world, the positive predictive value of a positive study is just 8,000/12,500 or 64%. Meaning 36% of the positive studies are FALSE POSITIVES. That's a far cry from 5%.
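
Both worlds follow the same formula. Writing \pi for the share of tested hypotheses that are true, 1 - \beta for power, and \alpha for the p-value threshold (symbols chosen here for convenience, not taken from the thread), the 10% world works out as:

```latex
\mathrm{PPV}
  = \frac{(1-\beta)\,\pi}{(1-\beta)\,\pi + \alpha\,(1-\pi)}
  = \frac{0.80 \times 0.10}{0.80 \times 0.10 + 0.05 \times 0.90}
  = \frac{0.08}{0.125}
  = 0.64
```

The false-positive share among positive studies is 1 - PPV = 0.36.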

13/ The realization here is that the larger the share of tested hypotheses that are untrue, the larger the share of positive studies that are false positives. Here's how this looks graphically:
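
A rough way to tabulate the same relationship, assuming the thread's 80% power and 0.05 threshold hold at every prior. The function name `ppv` and the list of priors are illustrative choices, and the text bars are only a stand-in, not a reproduction of the original figure.

```python
def ppv(prior_true, power=0.80, alpha=0.05):
    """Share of positive studies that are true positives."""
    true_pos = power * prior_true
    false_pos = alpha * (1 - prior_true)
    return true_pos / (true_pos + false_pos)


# Walk from a world where half the hypotheses are true down to 1 in 100.
for prior in (0.50, 0.25, 0.10, 0.05, 0.01):
    share_false = 1 - ppv(prior)
    bar = "#" * round(share_false * 50)  # one '#' per 2 percentage points
    print(f"prior {prior:>4.0%}  false-positive share {share_false:6.1%}  {bar}")
```

The bars grow as the prior shrinks, even though power and the threshold never change.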

14/ Also realize that I didn't talk about publication bias, p-hacking, confounding, or any of the "cheating" ways that increase the risk of false-positive studies - this analysis assumes everything is on the up and up.

15/ Interpret a SINGLE study in the exact same way. Given a p-value of, say, 0.04, you need to decide how likely the hypothesis being tested was to be true before the study was run. This tool, from Held et al, BMC Med Res Methodol 2010, can help.
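
One hedged way to put numbers on this for a single p-value is the minimum Bayes factor bound -e · p · ln(p), a commonly cited calibration. Using it here is an assumption: it may not be the exact calculation behind the tool mentioned above, and it only gives a best case for the hypothesis, not a precise probability. The function names are hypothetical.

```python
import math


def min_bayes_factor(p):
    """Lower bound on the Bayes factor for the null: -e * p * ln(p), for p < 1/e."""
    if not 0 < p < 1 / math.e:
        raise ValueError("bound only applies for 0 < p < 1/e")
    return -math.e * p * math.log(p)


def max_posterior_true(p, prior_true):
    """Upper bound on P(hypothesis true | data) under this calibration."""
    prior_odds_null = (1 - prior_true) / prior_true
    posterior_odds_null = prior_odds_null * min_bayes_factor(p)
    return 1 / (1 + posterior_odds_null)


for prior in (0.50, 0.10):
    bound = max_posterior_true(0.04, prior)
    print(f"prior {prior:.0%}: hypothesis is true with probability at most {bound:.0%}")
```

Under this assumed calibration, the same p = 0.04 leaves very different conclusions depending on where you started.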

16/ In clinical terms, think of a randomized trial like you think of a test for a disease. If the patient is at extremely low risk, you view a "positive" test skeptically. If the patient has all the symptoms, a positive test is much more trustworthy.

17/ And the opposite is also true! If it is very likely a hypothesis is true, a p-value of 0.06 is quite reassuring! It doesn't disprove anything.

18/ Yes, this is Bayesianism. But I didn't want to say that 'til the end. I'm not wading into that debate. I just wanted to talk a bit about why p=0.04 does NOT mean there's a 4% chance this study is wrong.

19/ I hope this helps someone! Let me know.

