Maarten van Smeden
Statistician • associate prof • statistics/epidemiology/AI methodology, Julius Center @umcutrecht • own views
May 9, 2023 11 tweets 1 min read
People often ask what it takes to develop a successful (clinical) prediction model

Here are TEN important things to avoid

1) Make sure you do not talk to domain experts. They only slow things down
Dec 25, 2022 13 tweets 7 min read
This is my *top 10* favorite methods papers of 2022

Appearing in a single thread and in random order

Disclaimer: this top 10 is personal opinion. I am biased towards explanatory methods and statistics articles relevant to health research

Shameless plugs alert. 3 papers I co-authored but did not lead made the top 10
Apr 11, 2022 4 tweets 2 min read
Yeah, that is not how it works

Apparently I need to update our myths about measurement error paper
doi.org/10.1093/ije/dy…
Mar 17, 2022 19 tweets 12 min read
Medical Research Bingo (image)

🧵 with an explanation for each 👇
(posting this primarily as a note to self)
Mar 12, 2022 5 tweets 1 min read
The main problem with badly designed medical prediction models is not research waste or hampering scientific progress. It is the risk that someone takes the model seriously and uses it to inform a medical decision, which can ruin someone’s life

There is no “hypothesis generating” or “too small, but maybe useful for a meta-analysis”. It’s building a tool, probably more than it is a scientific endeavor
Dec 16, 2021 15 tweets 7 min read
This is my *top 10* favorite methods papers of 2021

Disclaimer: this top 10 is just personal opinion. I’m biased towards explanatory methods and statistics articles relevant to health research, particularly those relating to prediction models.

Shameless plugs alert. Two papers I co-authored (but did not lead) made the top 10
Nov 13, 2021 5 tweets 2 min read
The fact that our government has decided to close the sport stadiums again in reaction to the surge of new C19 infections must mean they do not trust the results of their own expensive Fieldlab “experiments”. This is telling

These were “experiments” that were explicitly *not* done to measure new infections, yet they were later used to claim (and still are by the event industry) that there is no infection risk in open sport stadiums
Sep 23, 2021 4 tweets 1 min read
This seems to be dominating the news... but what is this?
"exposure misclassification is most likely to be non-differential and result in underestimation of the true effect"
I know what that is! A nice example of Myth 2 academic.oup.com/ije/article/49…

Reading through this paper, uhm, no risk of bias assessments, no meta-analyses, just highlighting and counting significant effects (publication bias, anyone?), few of the referenced studies have considered confounding, misclassification error issues downplayed
Sep 23, 2021 4 tweets 1 min read
Why random training-test splits usually don't work for prediction model development

Option 1. You do not have a very large N
Prediction models are data hungry: use all the data for model development and some efficient internal validation procedure (e.g. nested cross-validation)
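
A minimal sketch of the idea, under loud assumptions: this is plain k-fold cross-validation (not the nested variant mentioned above, which adds an inner loop for tuning), the model is a one-predictor least-squares line, and the data are simulated. The helper names `k_fold_indices`, `fit_line` and `cross_validated_mse` are made up for illustration. The point it shows is that every observation is used for both fitting and testing, and the final model is still fitted on all the data.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (train, test) index lists; each index is in exactly one test fold."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j in range(k) if j != i for idx in folds[j]]
        yield train, test


def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_mse(x, y, k=5, seed=0):
    """Out-of-fold mean squared error, pooled over all k folds."""
    squared_errors = []
    for train, test in k_fold_indices(len(x), k, seed):
        intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
        squared_errors += [(y[i] - (intercept + slope * x[i])) ** 2 for i in test]
    return sum(squared_errors) / len(squared_errors)


# Simulated toy data (an assumption for illustration): y = 2x + 1 plus noise
rng = random.Random(1)
x = [rng.uniform(0, 10) for _ in range(40)]
y = [2 * xi + 1 + rng.gauss(0, 1) for xi in x]

# The final model is developed on ALL 40 observations...
final_intercept, final_slope = fit_line(x, y)
# ...and its performance is estimated by repeating the fitting within each fold.
cv_mse = cross_validated_mse(x, y, k=5)
print(f"slope = {final_slope:.2f}, cross-validated MSE = {cv_mse:.2f}")
```

With a single random split, a chunk of the 40 observations would never be used to develop the model, and the performance estimate would rest on only a handful of test cases.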
Sep 22, 2021 5 tweets 3 min read
Some thoughts on the provocative recent article by @_MiguelHernan "Causal analyses of existing databases: no power calculations required"
doi.org/10.1016/j.jcli…

The link to the original article is here: doi.org/10.1016/j.jcli…
Sep 19, 2021 4 tweets 1 min read
If your science gets critiqued on Twitter: GET AT IT IMMEDIATELY
Do not take a deep breath
Do not get a coffee and consider whether there is some truth in the critique
NO!
Get that anger out and defend yourself

You are going to win the argument by telling critics they don't have enough experience.

Look at that publication record LOL!!
Aug 31, 2021 7 tweets 1 min read
Five simple ways to write a better scientific paper than your colleagues

1. Carefully formulate a specific research question that you can answer with data

Much research is done on questions that are too broad, too unspecific or not formulated at all
Aug 28, 2021 5 tweets 1 min read
Please remember that with over 200k Covid-related scientific publications it’s easy to cherry pick evidence to support “expert opinion” no matter how insane the narrative

And to get those sweet, sweet likes and retweets let your narrative be extreme. Choose your favorite p-hacker wisely
May 18, 2021 5 tweets 3 min read
If you are new to epidemiology and looking for a way to ruin your week, read about:
- post selection inference
- colliders
- table 2 fallacy
- well defined interventions
- measurement error
- missing data
- standard errors (ignore random confounding)

If this isn't enough there is still consistency, positivity and model misspecification to worry about. And then there is literature on why multivariable adjustment is bad, why propensity scores are bad and why weighting is bad. Enjoy
May 11, 2021 4 tweets 1 min read
How to write an introduction for a scientific manuscript:
- say this topic is really important
- show you read the literature
- tell the readers all the earlier studies have ignored the most important topic
- until now
- done.

How to write a methods section for a scientific manuscript:
- copy/paste from earlier paper
- make small adjustments that make it look like it's been written from scratch
- done.
Mar 20, 2021 7 tweets 2 min read
Statistical things to worry about *less*
1) significance of univariable associations
2) significant model goodness-of-fit tests
3) imbalance in randomized trials
4) non-normality of observations
5) multicollinearity

1) As a precursor to multivariable analyses, the associations between each individual covariate and outcome are often "screened" for significance. This often does more harm than good, so don't bother doing it or worrying about it onlinelibrary.wiley.com/doi/full/10.11…
Feb 9, 2021 6 tweets 8 min read
🚨 NEW UPDATE 🚨 living systematic review of diagnosis and prognosis models related to COVID-19 @bmj_latest

Now 232 models reviewed and appraised. Reporting and conduct remain poor, with only a few exceptions

bmj.com/content/369/bm…

New for this update: a visualization of the risk of bias assessment
Feb 6, 2021 6 tweets 1 min read
Statistical terms: what they really mean

Multicollinearity— they all look the same
Heteroscedasticity— the variation varies
Attenuation— being too modest
Overfitting— too good to be true
Confounding— nothing is what it seems
P-value— it’s complicated

Sensitivity analysis— tried a bunch of stuff
Post-hoc— main analysis not sexy enough
Multivariate— oops, meant to say multivariable
Normality— a very rare shape for data
Dichotomized— data was tortured
Extrapolation— just guessing
Feb 1, 2021 11 tweets 4 min read
Personal top 10 fallacies and paradoxes in statistics
1. Absence of evidence fallacy
2. Ecological fallacy
3. Stein’s paradox
4. Lord’s paradox
5. Simpson’s paradox
6. Berkson’s paradox
7. Prosecutor’s fallacy
8. Gambler’s fallacy
9. Lindley’s paradox
10. Low birthweight paradox

1. Absence of evidence fallacy

Absence of evidence is not the same as evidence of absence. Wouldn't it be great if not statistically significant would just mean "no effect"? bmj.com/content/311/70…
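
A small worked example of why "not significant" does not mean "no effect", using made-up numbers (an estimate of 5.0 with a standard error of 3.0, both hypothetical) and a normal approximation. The helper `z_test_summary` is a name invented for this sketch; it returns the two-sided p-value and the confidence interval.

```python
from statistics import NormalDist


def z_test_summary(estimate, standard_error, level=0.95):
    """Two-sided p-value and confidence interval under a normal approximation."""
    standard_normal = NormalDist()
    z = estimate / standard_error
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    critical = standard_normal.inv_cdf(1 - (1 - level) / 2)
    interval = (estimate - critical * standard_error,
                estimate + critical * standard_error)
    return p_value, interval


# Hypothetical study result: effect estimate 5.0, standard error 3.0
p_value, (lower, upper) = z_test_summary(estimate=5.0, standard_error=3.0)
print(f"p = {p_value:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Here the p-value is above 0.05, so the result is "not statistically significant", yet the interval runs from just below zero to more than twice the point estimate. The data are compatible with no effect and with a large effect alike, which is absence of evidence, not evidence of absence.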
Jan 16, 2021 13 tweets 2 min read
How to become a SUCCESSFUL academic: a guide 1/n

How do I know how to become a successful academic? I don't, but I have received plenty of advice. As a good academic, I will just summarize what I have learned from listening
Jan 15, 2021 4 tweets 2 min read
The infamous retracted Hydroxychloroquine Lancet article?
Cited.... 883 TIMES

Only referenced as a joke or warning you say? Think again.. (screenshot from a 2021 paper)