If you saw this and you’re not an avid follower of #epitwitter, you may have followed the link to the paper and thought, “Wow! Why is she so devoted to something so boring?” Well, I'm here to tell you why *SELECTION BIAS* could keep me awake through the mid-chemo Benadryl haze. 1/
First, what do I mean by selection bias? The term is used differently between and even within disciplines, so let me clarify. 2/
Often "selection bias" is used to describe the fact that studies aren’t representative of a population: they select on factors like age or income, whether accidentally or on purpose. A trial may have inclusion/exclusion criteria (e.g., < 50 years) that limit generalizability. 3/
But this is a matter of external validity (does the causal effect apply to people outside of my study, e.g., 70 year-olds) rather than internal validity (does the causal effect even apply to people *in* my study, e.g., 40 year-olds). 4/
This type of selection doesn’t affect causal inference about the study population. Sometimes we can’t even precisely characterize the people we can make inference about, but at least we know if we find an effect, there's an effect for someone. See e.g., ncbi.nlm.nih.gov/pubmed/28535177 5/
Others call it "selection bias" when people are selected into treatment based on factors that are also related to their outcome. If your prognosis is worse, you may get more intense treatment. We could fix that with randomization. Epidemiologists just call this confounding. 6/
The selection bias I’m interested in can occur even when treatment has been randomized, and it means that we can’t validly estimate a causal effect even for the specific type of people we have in our study. Not so boring now, huh?!🤣 7/
A silly example:

You’ve been doing some online dating and have made a strange observation. People without dogs are twice as fun on dates as people with dogs! What?!?!🤯🤯🤯 8/
In order to be a fun date, should the people with dogs get rid of them? 9/
Of course not! You're only observing people you’ve "swiped right" on, which is inducing selection into your sample. When you swipe, you choose people for 1 of 2 reasons: either they have a cute dog, or they look like a fun person. (Some people fit both criteria, of course!) 10/
You swipe left on anyone who fits neither. So if you ever go on a date with a person who’s not fun, they’re guaranteed to have a dog. And if you go on a date with someone without a dog, you’re guaranteed to have fun. This will make it look like dogs *make* people less fun. 11/
This will be the case even if dogs are assigned randomly in the population (pick me! pick me!), and they have no effect on fun for anybody, whether you swiped right or left on them (though that’s clearly not true!). 12/
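The dating example is easy to check with a quick simulation. This is only a sketch under made-up assumptions: dogs and fun are each assigned independently by a coin flip (the 50% figures and the names `simulate_dates` and `fun_ratio` are illustrative, not from the paper), and you swipe right on anyone with a dog or who is fun.

```python
import random

def simulate_dates(n=100_000, p_dog=0.5, p_fun=0.5, seed=0):
    """Return (everyone, people you dated) as lists of (has_dog, is_fun)."""
    rng = random.Random(seed)
    # Dogs and fun are assigned independently: dogs have no effect on fun.
    people = [(rng.random() < p_dog, rng.random() < p_fun) for _ in range(n)]
    # You swipe right on anyone with a dog OR who looks fun.
    dates = [(dog, fun) for dog, fun in people if dog or fun]
    return people, dates

def fun_ratio(group):
    """P(fun | no dog) divided by P(fun | dog) within a group."""
    no_dog = [fun for dog, fun in group if not dog]
    with_dog = [fun for dog, fun in group if dog]
    return (sum(no_dog) / len(no_dog)) / (sum(with_dog) / len(with_dog))

people, dates = simulate_dates()
print("whole population:", round(fun_ratio(people), 2))  # close to 1
print("people you dated:", round(fun_ratio(dates), 2))   # close to 2
```

In the whole population the ratio sits near 1, because dogs do nothing. Among your dates, everyone without a dog is fun by construction, while only about half the dog owners are, which is where the 2-fold difference comes from.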
Epidemiologists don’t sample via Tinder, but sometimes do by convenience or logistical necessity. We can’t find everyone who’s lost to follow-up. We can’t genotype people who don’t give us their consent. We (often) can’t study outcomes in fetuses that don’t make it to birth. 13/
If the factors that determine who’s in our study have to do with both the exposure and outcome we’re studying (e.g., the cute dog pics and the fun quotient both affected swipe direction), we could end up with selection bias. But what do I mean by “have to do with”? 14/
It helps to draw a DAG to determine how selection relates to exposure and outcome and whether that could result in selection bias. See e.g. ncbi.nlm.nih.gov/pubmed/15308962, ncbi.nlm.nih.gov/pubmed/31033691 15/
In our paper we said ok, you have selection bias — so how much could it have affected your conclusions? If there’s really no effect of dogs on fun, how strong would the selection (your subconscious swiping tendencies) have to be to see that 2-fold fun increase? 16/
Specifically, how much more likely to be fun are the people you swiped right on than the people you swiped left on, both among the dog owners and among the non-owners? 17/
And how much more likely are you to have swiped right on someone if you went on a date with them than if you didn’t, both among owners and non-owners? (The latter factor would come into play if you also went on some other dates, say with people you met at a bar.) 18/
If you think you have good estimates of those parameters, you could use them to adjust the estimate directly. Or just plug in lots of different values and explore what different strengths of selection would do to your result. 19/
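To make the "plug in lots of different values" idea concrete, here is a minimal sketch. It assumes the bound is a product of two factors of the form RR1 × RR2 / (RR1 + RR2 − 1), one for the exposed and one for the unexposed, with each parameter a risk ratio of at least 1. That form is an assumption of this sketch, so check the paper for the exact definitions. The function names are hypothetical and are not taken from the web app or any package.

```python
def one_arm_factor(rr_uy, rr_su):
    """Factor for one exposure group; both risk ratios are assumed >= 1.

    rr_uy: how much more likely the outcome is (e.g., being fun) across
           levels of the selection-related factor (e.g., swipe direction).
    rr_su: how much more common that factor is among the selected
           (e.g., dated) than the non-selected.
    """
    return (rr_uy * rr_su) / (rr_uy + rr_su - 1)

def bounding_factor(rr_uy_a1, rr_su_a1, rr_uy_a0, rr_su_a0):
    """Assumed maximum factor by which selection could inflate the risk ratio."""
    return one_arm_factor(rr_uy_a1, rr_su_a1) * one_arm_factor(rr_uy_a0, rr_su_a0)

def smallest_compatible_rr(rr_obs, rr_uy_a1, rr_su_a1, rr_uy_a0, rr_su_a0):
    """Lowest true risk ratio consistent with rr_obs under this much selection."""
    return rr_obs / bounding_factor(rr_uy_a1, rr_su_a1, rr_uy_a0, rr_su_a0)

# Explore a few selection strengths for the observed 2-fold difference,
# setting all four parameters to the same value for simplicity.
for strength in (1.0, 1.5, 2.0, 3.0):
    bound = smallest_compatible_rr(2.0, strength, strength, strength, strength)
    print(f"all parameters = {strength}: true RR could be as low as {bound:.2f}")
```

If the lowest compatible risk ratio drops to 1 or below for selection strengths you find plausible, the observed association could be entirely due to selection.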
If I lost you there, don't worry. You can try out both options using this web app: selection-bias.louisahsmith.com (as well as check out some more serious examples 😜) 20/
So if you’re worried about selection bias in your online dating study or otherwise, you can find all the details here: journals.lww.com/epidem/pages/a… 21/21