Maarten van Smeden
Statistician • asst prof in epidemiological methods • prediction models, measurement, epidemiology, medical statistics • @UMCUtrecht @juliuscenter
3 Aug
The BMJ just published an editorial about living systematic reviews, which are new territory for just about everyone. Worth a read…

ICYI, I have a few thoughts to share
We were fortunate to have produced @bmj_latest's first living review…
The aim of our review is (and always was) to give an overview and appraisal of currently available diagnosis and prognosis models related to COVID-19

But this is a fast-moving field: from 31 models reviewed in April to 145 models reviewed in our 2nd update published in July
Read 14 tweets
11 Jul
Used to get annoyed by stats consult clients who insisted they needed machine learning for their very large dataset (N of 100s or few 1000s). Now I tell them logistic regression *is* machine learning and everything is great again
And since machine learning is a subfield of AI, logistic regression is also AI. I should have understood this sooner
Logistic regression as statistical model
- prepare data
- estimate model
- evaluate performance
- report

Logistic regression as machine learning
- prepare data
- estimate model
- evaluate performance
- report
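
To make the four steps concrete, here is a minimal sketch in plain Python. It is not from the thread: the simulated data, the true coefficients (-0.5 and 1.0), the gradient-ascent settings and all function names are assumptions chosen for illustration. The same code runs whether one calls it a statistical model or machine learning.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, intercept, slope, rng):
    # Step 1, prepare data: one standard-normal predictor x and a binary outcome y.
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < sigmoid(intercept + slope * x) else 0
        rows.append((x, y))
    return rows


def fit_logistic(rows, steps=500, lr=0.5):
    # Step 2, estimate model: maximum likelihood via plain gradient ascent
    # on the mean log-likelihood (an assumed, deliberately simple optimizer).
    b0, b1 = 0.0, 0.0
    n = len(rows)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in rows:
            resid = y - sigmoid(b0 + b1 * x)
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def c_statistic(rows, b0, b1):
    # Step 3, evaluate performance: share of event/non-event pairs in which
    # the event has the higher linear predictor (ties count as one half).
    events = [b0 + b1 * x for x, y in rows if y == 1]
    nonevents = [b0 + b1 * x for x, y in rows if y == 0]
    score = 0.0
    for e in events:
        for ne in nonevents:
            if e > ne:
                score += 1.0
            elif e == ne:
                score += 0.5
    return score / (len(events) * len(nonevents))


rng = random.Random(2020)
data = simulate(500, -0.5, 1.0, rng)
b0, b1 = fit_logistic(data)
auc = c_statistic(data, b0, b1)

# Step 4, report.
print(f"intercept={b0:.2f} slope={b1:.2f} c-statistic={auc:.2f}")
```

The c-statistic here is computed on the same data used for fitting, so it is an apparent (optimistic) performance measure; a real analysis would also look at calibration and at validation in new data.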
Read 4 tweets
9 Jun
Was asked for personal favorite resources for improving methods and statistics skills. I promised to make it a thread, so here it is

I work in medical research, so that is going to be my focus here too. But I’d like to think the resources are relevant to a wider audience

This list should not be taken as a guide to becoming a statistician, nor is it a must-read list for all academics (obviously)

My personal view is that medical research would benefit from involving trained statisticians earlier and more frequently; not from everyone trying to become one

Here are some good arguments by @statsepi:…

And some more:…

Read 20 tweets
19 May
The definitive guide to COVID-19 prognosis modeling success

1) Do not explain where the data come from (country) or when (study dates) they were obtained. Do not specify inclusion or exclusion criteria
2) Do not define a target group. Talk generically about COVID-19 patients, do not define how they were recruited
3) Do not provide a table with patient characteristics. In particular, do not mention use of medication or co-morbidities
Read 18 tweets
13 Apr
Let's talk about the "risk factors" for COVID-19 for a moment

We talk about risk factors all the time. Not just in the medical scientific literature: you will find risk factors being discussed in the popular media and on social media too

Exhibit A:…

The term "risk factor" is popular in medical research. It has been used in literature since at least the 1950s

BUT definitions of what a risk factor really is or should be vary. As this article argues quite convincingly…

Read 15 tweets
7 Apr
Our NEW systematic review article about diagnosis and prognosis models related to COVID-19 is out now in @bmj_latest…

📢 This review will be regularly updated in the coming months. Watch this space📢

The evidence base for COVID-19 related diagnosis and prognosis models is weak and reporting quality is generally poor: we can and should do better

We will continue our critical appraisals of new models when they appear in the coming months

What we did
Systematically reviewed and critically appraised articles (including preprints) of COVID-19 related diagnosis and prognosis models developed for individual level predictions

Models to forecast the spread of the COVID-19 infection are not part of this review

Read 12 tweets
17 Mar
@michelleskeller As requested, a breakdown of every point below.
@michelleskeller Unnecessary dichotomization. Analyzing data with continuous variables by dichotomizing or categorizing is generally a bad idea. The literature on this is massive
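
A small simulation can illustrate one cost of dichotomizing. This sketch is not from the thread: the data-generating model (a standard-normal predictor with a linear effect of 0.5 plus unit-variance noise), the median split and the helper `pearson` are all assumptions made for illustration. It compares how strongly the outcome correlates with the predictor before and after the split.

```python
import math
import random
import statistics


def pearson(u, v):
    # Pearson correlation computed from scratch to keep the sketch dependency-free.
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


rng = random.Random(17)
n = 5000

# Assumed data-generating model: continuous predictor with a linear effect on y.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Dichotomize the predictor at its sample median ("high" versus "low").
cut = statistics.median(x)
x_split = [1.0 if xi > cut else 0.0 for xi in x]

r_continuous = pearson(x, y)
r_dichotomized = pearson(x_split, y)

print(f"correlation with continuous x:   {r_continuous:.3f}")
print(f"correlation with dichotomized x: {r_dichotomized:.3f}")
```

For a normally distributed predictor, a median split is expected to shrink the correlation by a factor of roughly 0.8, which is information thrown away before the analysis even starts. Loss of information is only one of the problems discussed in the literature on this topic.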
@michelleskeller Table 2 fallacy. A problem one sees so often in epi data-analyses: after a multivariable model is constructed to control for "confounding", all parameter estimates in the mv model are interpreted as causal effect estimates. This is generally wrong:
Read 5 tweets
23 Dec 19
This is my *top 10* of favorite methods papers of 2019

Appearing in a single thread and in random order
Disclaimer: this top 10 is just personal opinion; I’m biased towards explanatory methods and statistics articles relevant to health research
#1: A plea against dichotomization of study results as statistically significant or not. With three authors, 800+ signatories and #2 Altmetric score of the year, this article belongs on the list. Agree with the message or not, or only a little
Read 12 tweets
13 Nov 19
Interesting paper: a guide to reading machine learning articles in @JAMA_current:…

ICYI, have a couple of thoughts to share about this paper

TL;DR: overall, I think this article is quite a useful beginner's guide

To start, I like the explanation of the terminology and concepts. Nice use of text boxes, imo

I particularly like the attention to calibration in addition to discrimination performance, and attention to importance of continued testing and updating of algorithms; algorithms are indeed high maintenance, a single “validation study” generally won’t do

Read 15 tweets
6 Aug 19
If art were like scientific manuscripts

Artist: worked some months on this painting that would fit your gallery I believe. Would you consider?

Gallery: fill out these forms

A: okay

G: please remove the frame and attach it to the bottom

A: what? Okay...
G: congratulations, we’ll consider it

A: shall I wait?

G: *lol* no, we’ll contact you

A: okay, can I share it with other galleries while you consider?

G: NO!!

A: okay

*eight months later*

G: we cannot accept it in current form

A: why not?
G: we asked 3 other artists and the 2nd doesn’t like your colour scheme

A: now what?

G: we might reconsider if you replace all blue with lime green

A: bleh, okay...

G: it is too large. Make it 1/2 the size
Read 7 tweets
31 May 19
Our new open-access paper is online at…

The non-technical story is this:
Performance of a clinical prediction model in a setting where it is tested/validated is often worse than in the setting in which it was derived. Suspects no 1 and 2: overfitting and differences in patient characteristics (“case-mix”) between derivation and validation settings
So we wondered: how about measurement error in the predictors?
Read 19 tweets
11 Feb 19
I'm preparing a talk about clinical prediction models, wondering: what should be considered the first clinical prediction model or rule?
#epitwitter input would be much appreciated
Read 2 tweets
4 Nov 18
About confidence interval interpretation. Motivated by the results of this poll

Most of us have probably been taught about 95% confidence intervals and their interpretation in rather vague terms. Such as "the limits reflecting the uncertainty in the parameter estimate" or "95% confident about the true value of the parameter"
You may also remember the warning that usually comes with it: 95% *confidence* doesn't mean 95% *probability* that the true value is actually in the confidence interval
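
One common way to read the 95% is as a long-run property of the procedure: across many repeated samples, about 95% of the intervals constructed this way cover the true value, while any single interval either contains it or does not. The sketch below simulates that reading. It is an illustration only, not taken from the thread; the true mean, standard deviation, sample size and number of repetitions are assumed values, and the t critical value is hard-coded for 29 degrees of freedom.

```python
import math
import random
import statistics

TRUE_MEAN = 10.0  # assumed "true" parameter, known only because we simulate
TRUE_SD = 2.0
N = 30
T_CRIT = 2.045    # 97.5th percentile of the t distribution with N - 1 = 29 df
REPS = 4000

rng = random.Random(95)
covered = 0
for _ in range(REPS):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)
    lower, upper = mean - T_CRIT * se, mean + T_CRIT * se
    # Each single interval either covers the true mean or it does not.
    if lower <= TRUE_MEAN <= upper:
        covered += 1

coverage = covered / REPS
print(f"share of intervals covering the true mean: {coverage:.3f}")
```

The 95% describes how often the interval-building recipe succeeds over repetitions, which is why it cannot simply be attached to one computed interval as a probability statement.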
Read 16 tweets
14 Oct 18
Let's try something new: *7 days, 7 statistical misconceptions*.
Over the next 7 days I'll post about 1 statistical misconception a day. Curious to hear your favorite #statsmisconceptions, so feel free to add your own to this thread
Day 1: "Sample size has nothing to do with bias"
👆 is something I still hear regularly. This misconception seems to be based on the oversimplified idea that sample size only affects precision - something I was told repeatedly in the first years of my studies
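
A minimal sketch of one textbook case where bias does depend on sample size. The rest of the thread is not shown here, so this example is an assumption and not necessarily the one the thread goes on to use: the maximum-likelihood variance estimator, which divides by n instead of n - 1, has expected value (n - 1)/n times the true variance. The function name, sample sizes and number of repetitions are illustrative choices.

```python
import random
import statistics


def mean_mle_variance(n, reps, rng):
    # Average of the divide-by-n variance estimator over many simulated
    # samples of size n drawn from a standard normal (true variance = 1).
    estimates = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m = statistics.fmean(sample)
        estimates.append(sum((v - m) ** 2 for v in sample) / n)
    return statistics.fmean(estimates)


rng = random.Random(7)
bias_small = mean_mle_variance(5, 10000, rng) - 1.0
bias_large = mean_mle_variance(100, 10000, rng) - 1.0

print(f"estimated bias at n=5:   {bias_small:+.3f}")
print(f"estimated bias at n=100: {bias_large:+.3f}")
```

Theory gives a bias of -1/n times the true variance, so about -0.2 at n = 5 and -0.01 at n = 100: the systematic error, not just the precision, changes with sample size.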
Read 93 tweets
5 Sep 18
What does one get when mindlessly applying logistic regression to a dataset that is too small? Well…

Just to be clear, I have zero interest in shaming these authors. Small sample logistic regression analyses are very common! But for those of you interested, here is the link:…
“The aim of this study was to evaluate the impact of medical student placement of Foley catheters on rates of postoperative catheter-associated urinary tract infection (CAUTI)”.
Read 24 tweets