It's been coming up a lot lately, so I thought I'd do a bit of a thread on CONVENIENCE SAMPLES and why they aren't great for assessing POPULATION PREVALENCE of a disease
In other words - how many people have had COVID-19?
1/n
2/n So, the basic idea here is simple. We want to know about people who have (or in this case, have had) a disease
How do we find that out?
3/n The traditional method is to do a large, randomly-sampled study: dialing up tens of thousands of people across a population, surveying them, and doing lots of blood tests
But this is EXPENSIVE
4/n Running a proper statistically representative process, getting all the people to answer their phones and give you bloods...even if the cost per person is low, multiply that by 10,000-100,000 and the cost can be prohibitive
5/n Which brings us to the idea of a CONVENIENCE SAMPLE
Why is it called a convenience sample? (hint: answer is in the name)
6/n Yes, convenience samples are just that - convenient
Usually, they are groups of people that you are ALREADY TESTING for some other reason, so you can either add another test on or survey them
7/n I have used this method in the past to look at the burden of diabetes in hospitals and GP clinics - we looked at people who were already getting blood tests, and added one extra test for diabetes (and science!) sciencedirect.com/science/articl…
8/n But there's an issue here
We have selected these people very specifically. They are not a random, representative sample - they were people ALREADY GETTING blood tests which means they are probably different in LOTS OF WAYS to the general population
9/n So in our lovely study of a convenience sample of diabetes tests, we can't say anything about how much diabetes there is in the community (population prevalence)!
All we can talk about is diabetes IN THE PATIENTS TESTED
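To make that concrete, here's a toy simulation in Python. It's a sketch, not the method from the diabetes study, and every number in it is made up: a 5% true prevalence, and the assumption that people with the condition are three times as likely to already be getting blood tests. The names (`simulate_prevalence`, `P_TESTED_IF_SICK` and so on) are purely illustrative.

```python
import random

# All values below are hypothetical, chosen only to illustrate the bias.
TRUE_PREVALENCE = 0.05      # assumed share of the population with the condition
P_TESTED_IF_SICK = 0.30     # assumed chance of already getting bloods if you have it
P_TESTED_IF_WELL = 0.10     # assumed chance of already getting bloods if you don't


def simulate_prevalence(n_population=100_000, n_random=5_000, seed=0):
    """Return (true, convenience-sample, random-sample) prevalence estimates."""
    rng = random.Random(seed)
    population = []
    for _ in range(n_population):
        has_condition = rng.random() < TRUE_PREVALENCE
        p_tested = P_TESTED_IF_SICK if has_condition else P_TESTED_IF_WELL
        already_tested = rng.random() < p_tested
        population.append((has_condition, already_tested))

    true_prev = sum(sick for sick, _ in population) / n_population

    # Convenience sample: only the people who were already getting blood tests.
    convenience = [sick for sick, tested in population if tested]
    convenience_prev = sum(convenience) / len(convenience)

    # Random sample: everyone has the same chance of being picked.
    random_sample = rng.sample(population, n_random)
    random_prev = sum(sick for sick, _ in random_sample) / n_random

    return true_prev, convenience_prev, random_prev


true_prev, convenience_prev, random_prev = simulate_prevalence()
print(f"true: {true_prev:.1%}  convenience: {convenience_prev:.1%}  random: {random_prev:.1%}")
```

Under these made-up assumptions the convenience sample lands well above the true figure, while the random sample sits close to it. In real life you don't know the testing probabilities, which is exactly why you can't just correct for the bias.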
10/n "But God", you ask, with a common autocorrect mistake, "what does this have to do with COVID-19?"
Well, reader, this is where we get to antibody testing
11/n You see, when you get sick with a new disease, your body produces antibodies*
We can then test for these antibodies to see if you've had the disease before*
*oversimplified, plz don't murder me immunologists
12/n If you run an antibody test on a large group of people, it's called a serosurvey (because antibody tests are also known as serology in sciency terms)
13/n Now, a lot of places (countries, states, colleges) have run serosurveys and had a grand old time of it. This is why you keep seeing those news articles saying that x% of people in a place have had COVID-19 already
14/n The problem is, some of these serosurveys used CONVENIENCE SAMPLES
Just like we discussed earlier, that makes them a bit problematic
15/n My co-authors and I, in our systematic review of age-stratified IFRs for COVID-19, looked into just how problematic
16/n For example, one study in Tokyo that used a CONVENIENCE SAMPLE found that 3.8% of the people tested had had COVID-19
But a proper randomly-selected sample found just 0.1% - 38 times lower!
17/n In England, a CONVENIENCE SAMPLE of blood donors implied that 1 in 12 people had had COVID-19, but a large representative sample found it was just 1 in 20
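If you want to check the arithmetic on those two examples, here's a quick Python sketch. The four figures come straight from the tweets above; the helper name `fold_difference` is just something I've made up for illustration.

```python
def fold_difference(convenience_estimate, representative_estimate):
    """How many times higher the convenience estimate is than the representative one."""
    return convenience_estimate / representative_estimate


# Tokyo: 3.8% in the convenience sample vs 0.1% in the random sample
tokyo_fold = fold_difference(3.8, 0.1)

# England: 1 in 12 blood donors vs 1 in 20 in the representative sample
donors_pct = 100 / 12          # about 8.3%
representative_pct = 100 / 20  # 5%
england_fold = fold_difference(donors_pct, representative_pct)

print(f"Tokyo: convenience sample about {tokyo_fold:.0f}x higher")
print(f"England: {donors_pct:.1f}% vs {representative_pct:.1f}%, about {england_fold:.1f}x higher")
```

Same direction of error in both places, but very different sizes, so there's no single fudge factor you could apply to fix a convenience sample.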
18/n The problem is, these CONVENIENCE SAMPLES are systematically biased. They are of people who are different to the general population in ways that can be very difficult to measure and/or understand
19/n Blood donors, for example, are young and healthy by design. But the people who have been (generously) giving blood during the pandemic might also be...well, a bit odd
20/n They're going to great personal lengths to sacrifice for the rest of us ungrateful buggers, which might indicate that they're more likely to socialize, more likely to mingle, and thus more likely to get infected
We JUST DON'T KNOW
21/n And this is the problem with convenience samples, generally
We cannot use them to estimate population prevalence (how many people have had COVID-19), because they aren't representative of society as a whole
22/n So if you see a headline that says "x% of people infected with COVID-19!" take a leaf out of my mentor's book and ask:
"WHAT'S THE DENOMINATOR?"
It's a vitally important question
23/n THIS DOESN'T MEAN THAT CONVENIENCE SAMPLES ARE USELESS
I use them in my research. They are brilliant for quick, cheap tracking of rates of infection IN SELECT GROUPS
They also provide a brilliant window into change OVER TIME
24/n For example, if you sample blood donors every week for a year, you've got an amazing insight into the changing nature of the pandemic
THIS IS MASSIVELY IMPORTANT AND VERY CHEAP
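Here's a small Python sketch of why the trend can still be useful even when the level is wrong. The weekly numbers are invented, and it leans on one big assumption: that donors are over-represented among the infected by the same (hypothetical) factor every week. If that factor drifts over time, the trend gets distorted too.

```python
# Hypothetical true share of the population infected so far, week by week
true_prevalence = [0.010, 0.015, 0.025, 0.040, 0.050]

# Hypothetical, constant bias: donors assumed 1.6x as likely to have been infected
DONOR_BIAS = 1.6
donor_prevalence = [p * DONOR_BIAS for p in true_prevalence]


def relative_changes(series):
    """Week-on-week growth factors (this week divided by last week)."""
    return [later / earlier for earlier, later in zip(series, series[1:])]


for week, (truth, donors) in enumerate(zip(true_prevalence, donor_prevalence), start=1):
    print(f"week {week}: true {truth:.1%}, donors {donors:.1%}")

print("true growth: ", [round(g, 2) for g in relative_changes(true_prevalence)])
print("donor growth:", [round(g, 2) for g in relative_changes(donor_prevalence)])
```

The donor figures are too high every single week, but the week-on-week growth matches, which is the bit that tells you which way the pandemic is heading.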
25/n You just can't use those results to tell how many people in the rest of society have gotten COVID-19
But that doesn't mean the results aren't helpful at all