It's so easy nowadays to throw together a statistical test on a computer. But software doesn't help much with study design and interpretation.

In today's post I look at what can go wrong. 🧵
fast.ai/2021/10/17/con…
A recent review paper about Long Covid in kids was widely discussed in the press. The headlines were reassuring...
journals.lww.com/pidj/Abstract/…
...but as it turns out, the paper didn't really say what the headlines claimed.

(Although I did see an author of the paper making the headline claims on social media and in videos.)
The review separates those studies that use a “control group” from those that don't. It says we should focus on the control group studies: “in the absence of a control group, it is impossible to distinguish symptoms of long COVID from symptoms attributable to the pandemic.”
Most doctors know control groups best from their use as part of a randomized controlled trial (RCT).

This is the case that simple computer-based statistical tests are ideally suited for, because the causal relationships are straightforward.
Sometimes “it is not feasible to use controlled experimentation”, but we still want to investigate a causal relationship between variables. In this case we can use an *observational study* instead of an RCT. One example is studying “the relationship between smoking and health”.
The control group studies in the Long Covid review paper are generally of this form. For each group comparison, the review reports the result of a statistical test as a p-value.
The idea for these statistical tests is that we take one group that has (or had) COVID, and one group that didn’t, and then see if they have Long Covid symptoms a few weeks or months later. Then we can infer whether Long Covid symptoms are caused by a COVID infection.
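As a rough illustration of the kind of test involved, here is a minimal sketch of a two-sided two-proportion z-test. The choice of test, the function name `two_proportion_z_test`, and the counts are all assumptions made for illustration; they are not taken from the review or from any of the studies it covers.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two proportions.

    x1/n1: symptomatic count and size of the test-positive group.
    x2/n2: symptomatic count and size of the control group.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 40 of 1000 test-positive children with symptoms,
# versus 20 of 1000 test-negative children.
z, p = two_proportion_z_test(40, 1000, 20, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The calculation itself takes a few lines, which is the point: the hard part is whether the two groups really are "infected" and "not infected".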
However, it’s not quite this simple. We don’t directly know who has a COVID infection, but instead we have to infer it using a diagnostic test.

We might hope that we could make a minor modification to our diagram, and still directly infer the relationship.
Researchers have noted that "results from observational studies can confuse the effect of interest with other variables’ effects, leading to an association that is not causal", & suggest using a diagram to visualize the structure of biases.
journal.chestnet.org/article/S0012-…
A more realistic diagram shows that the link between test results and infection is imperfect, & that false negative test results may be more common in children. It also accounts for research that shows that “Long-COVID is associated with weak anti-SARS-CoV-2 antibody response.”
We now can’t directly infer the relationship between COVID infection & Long Covid symptoms. We would first need to fully understand and account for the confounders and uncertainties. Simply reporting the results of a statistical test does not give meaningful information here.
In particular, we can see that the issues we have identified all bias the data in the same direction: they result in infected cases being incorrectly placed in the control group.

For more details about control group problems, see this by @Dr2NisreenAlwan: nisreenalwan.wordpress.com/2021/10/16/lon…
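To make the direction of this bias concrete, here is a minimal sketch that computes the expected symptom rates in the test-positive and test-negative ("control") groups when some infected children test negative. Every number below is hypothetical, the function name `observed_rates` is made up for this example, and the model assumes no false positives. It is an illustration of the mechanism, not an estimate of its real size.

```python
def observed_rates(n_infected, n_uninfected, rate_infected, rate_uninfected,
                   sens_symptomatic, sens_asymptomatic):
    """Expected symptom rates in the test-positive and test-negative groups.

    sens_symptomatic / sens_asymptomatic: probability that an infected child
    tests positive, split by whether they go on to have long-term symptoms.
    Assumes perfect specificity (uninfected children never test positive).
    """
    inf_sym = n_infected * rate_infected
    inf_nosym = n_infected - inf_sym

    # Infected children who are correctly detected
    pos_sym = inf_sym * sens_symptomatic
    pos_nosym = inf_nosym * sens_asymptomatic
    rate_pos = pos_sym / (pos_sym + pos_nosym)

    # "Control" group: uninfected children plus infected children who tested negative
    neg_sym = (inf_sym - pos_sym) + n_uninfected * rate_uninfected
    neg_total = (n_infected - pos_sym - pos_nosym) + n_uninfected
    rate_neg = neg_sym / neg_total
    return rate_pos, rate_neg

# Hypothetical scenario: 10% of infected and 2% of uninfected children have symptoms.
scenarios = {
    "perfect test": (1.0, 1.0),
    "40% false negatives": (0.6, 0.6),
    "more false negatives among symptomatic": (0.4, 0.6),
}
for label, (s_sym, s_nosym) in scenarios.items():
    pos, neg = observed_rates(1000, 1000, 0.10, 0.02, s_sym, s_nosym)
    print(f"{label}: positives {pos:.3f}, controls {neg:.3f}, gap {pos - neg:.3f}")
```

In this toy setup the true gap of 8 percentage points shrinks as soon as infected children leak into the control group, and it shrinks much further when the children who develop symptoms are also the ones more likely to test negative.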
Furthermore, we should not look at p-values out of context, but instead need to also consider the likelihood of alternative hypotheses. The alternative hypothesis provided in the review is that the symptoms may be due to “lockdown measures, including school closures”.
One of the included control group studies stood out as an outlier: 10% of Swiss children with negative tests were found to have Long Covid symptoms, many times higher than in other similar studies.
jamanetwork.com/journals/jama/…
Was this because of the confounding effects, or due to lockdowns and school closures? Switzerland did not have a full lockdown, and schools were only briefly closed.

On the other hand, Switzerland may have had a very high number of cases.
The assumption that the symptoms found in the control group are due to pandemic factors other than infection is not fully supported by the data in the study, and should not be part of the null hypothesis for a statistical test.
It is often possible, mathematically, to infer an association even in complex causal relationships. But doing so requires a full understanding of all of the relationships in the causal structure.

In this article we’ve only scratched the surface of one aspect: control group bias.
Whatever the solution turns out to be, it seems that for a while at least, the prevalence of Long Covid in children will remain uncertain. How parents, doctors, and policy makers respond to this risk and uncertainty will be a critical issue for children around the world.
The best way to ensure we get this right is by drawing on diverse expertise, from paediatricians, infectious disease specialists, epidemiologists, patients... and yes, even data scientists and statisticians!
Grateful thanks to @ahandvanish, @Dr2NisreenAlwan, @math_rachel, @DrZoeHyde, and @dgurdasani1 for all their help putting this article together!

PS: If you read the thread this far, you may as well read the whole article!: fast.ai/2021/10/17/con…
