I had the opportunity to work with @AndrewLBeam and @bnallamo on an editorial sharing views on the reporting of ML models in this month’s @CircOutcomes. First, read the guidelines by Stevens et al. Our editorial addresses what they can and can’t fix.
First, everyone (including Stevens et al.) acknowledges that larger efforts to address this problem are underway, including the @TRIPODStatement’s TRIPOD-ML and CONSORT-AI for trials.
The elephant in the room is: What is clinical ML and what is not? (an age-old debate)
What’s in a name clearly matters. Calling something “statistics” vs “ML” affects how the work is viewed (esp by clinical readers) and (unfairly) influences editorial decisions.
How can readers focus on the methods when they can’t assign a clear taxonomy to the methods?
Even the use of frequentist statistics doesn’t make a method “stats.” Conditional inference trees are an ML method where the p-value is a tuning parameter!
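For the curious, here’s a minimal #rstats sketch (my own toy example, assuming the partykit package): the p-value threshold for splitting is literally a knob you tune.

# In conditional inference trees, splits happen only when the permutation-test
# p-value clears a threshold; mincriterion = 1 - alpha is a tuning parameter.
library(partykit)
fit <- ctree(Species ~ ., data = iris,
             control = ctree_control(mincriterion = 0.99))  # split only if p < 0.01
plot(fit)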
Our proposal? Don’t call things “ML” in title. Just call them what they are, eg lasso regression, or don’t call them anything.
Wait, what? Don’t call them anything in the title?!
Think about it.
How many association studies do you read without the word “logistic regression” in the title? Tons!
So you can say “prediction model” in the title and name the method in the Methods.
But if the method is part of the innovation, name it.
Calling a penalized regression model an “ML model” will draw the ire of statistical readers. Calling a single “decision tree” anything but a decision tree will draw the ire of everyone!
Call it what it is.
A natural question that Stevens et al recommend addressing is why you chose the approach you did. It’s always good to provide a reason...
BUT:
Your rationale will basically need to boil down to your expectation of signal-to-noise ratio and data quality/volume. Tread carefully.
Adhering to reporting guidelines is extremely important.
Why?
It’s the only way to catch bad science. The problem with bad science in prediction modeling is that many of the common errors inflate apparent performance.
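Toy #rstats sketch of one such error (my own illustration, not from the editorial): screen predictors on the full dataset before cross-validating, and even pure noise looks “predictive.”

# Leaky feature screening: selecting the 10 noise predictors most correlated
# with the outcome on the FULL data, then cross-validating, inflates the AUC.
set.seed(1)
n <- 100; p <- 2000
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, 0.5)                                   # outcome unrelated to x
keep <- order(abs(cor(x, y)), decreasing = TRUE)[1:10]   # screening done outside CV

fold_auc <- sapply(1:10, function(k) {
  test <- ((k - 1) * 10 + 1):(k * 10)                    # 10-fold CV
  fit  <- glm(y[-test] ~ x[-test, keep], family = binomial)
  pred <- cbind(1, x[test, keep]) %*% coef(fit)
  mean(outer(pred[y[test] == 1], pred[y[test] == 0], ">"))  # fold C-statistic
})
mean(fold_auc, na.rm = TRUE)  # typically well above 0.5 despite zero true signal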
Recently, @DrLukeOR put together a flamethrower of a thread arguing against the sharing of models. I disagree. While sharing models shouldn’t be a prerequisite to publication, the current state is pretty pathetic. Very few ML models are shared.
There are LOTS of ways to share! 👇
Recently a collaborator at another institution sent us a set of pretrained models in #rstats .rds format. It was a beautiful moment. It almost brought a tear to my eye.
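If you’ve never done it, sharing a fitted model this way takes about three lines of #rstats (toy example with a built-in dataset, not our collaborator’s models):

fit <- glm(am ~ mpg + wt, data = mtcars, family = binomial)     # any fitted model object
saveRDS(fit, "model.rds")                                       # developer shares this file
shared_fit <- readRDS("model.rds")                              # collaborator loads it
predict(shared_fit, newdata = mtcars[1:3, ], type = "response")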
Those types of moments should be common, not rare!
So, in sum, we advocate for:
- Clear reporting
- Calling methods by name
- Sharing models and not just code
- Making sure ppl w/ appropriate methods expertise participate in editorial decisions
In other words: doing ML in a way that can maximally help patients.
Since my TL is filled with love letters to regression, let's talk about the beauty of random forests. Now maybe you don't like random forests, or don't use them, or are afraid of using them due to copyright infringement.
Let's play a game of: It's Just a Random Forest (IJARF).
Decision trees: random forest or not?
Definitely a single-tree random forest with mtry set to the number of predictors and pruning enabled.
IJARF.
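Tongue-in-cheek #rstats sketch (assuming the randomForest package): one tree, mtry set to all the predictors.

# A "forest" of exactly one tree that sees every predictor at every split.
library(randomForest)
set.seed(1)
one_tree_forest <- randomForest(Species ~ ., data = iris,
                                ntree = 1, mtry = 4)   # mtry = all 4 predictors
getTree(one_tree_forest, k = 1, labelVar = TRUE)       # inspect the single tree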
Boosted decision trees: random forest or not?
Well, if you weight the random forest trees differently, keep their depth shallow, maximize mtry, and grow them sequentially to minimize residual error, then a GBDT is just a type of random forest.
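Hand-rolled toy sketch of that "shallow trees grown sequentially on residuals" idea (rpart, my own illustration, not a real GBDT library):

# Boosting by hand: each shallow tree is fit to the residuals of the running prediction.
library(rpart)
d <- mtcars
pred <- rep(mean(d$mpg), nrow(d))   # start from the mean
lr <- 0.1                           # learning rate
for (m in 1:50) {
  d$res <- d$mpg - pred                                          # current residuals
  tree  <- rpart(res ~ wt + hp + disp, data = d,
                 control = rpart.control(maxdepth = 2, cp = 0))  # shallow tree
  pred  <- pred + lr * predict(tree, d)                          # sequential update
}
cor(pred, d$mpg)^2   # in-sample fit after 50 boosted trees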
Why did we do this? What does it mean? Is the Epic deterioration index useful in COVID-19? (Thread)
Shortly after the first @umichmedicine COVID-19 patient was admitted in March, we saw rapid growth in the # of admitted patients with COVID-19. A COVID-19-specific unit was opened (uofmhealth.org/news/archive/2…) and projections of hospital capacity looked dire:
.@umichmedicine was considering opening a field hospital (michigandaily.com/section/news-b…) and a very real question arose of how we would make decisions about which COVID-19 patients would be appropriate for transfer to a field hospital. Ideally, such patients would not require ICU care.
I’ll be giving a talk on implementing predictive models at @HDAA_Official on Oct 23 in Ann Arbor. Here’s the Twitter version.
Model developers have been taught to carefully think thru development/validation/calibration. This talk is not about that. It’s about what comes after...
But before we move on to implementation, let’s think thru what model discrimination and calibration are (quick sketch in code after the definitions):
- discrimination: how well can you distinguish higher from lower risk people?
- calibration: how close are the predicted probabilities to reality?
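Toy #rstats sketch (my own simulated example) of how you might compute each from a vector of predicted probabilities p and outcomes y:

set.seed(1)
lp <- rnorm(500)                     # true linear predictor
y  <- rbinom(500, 1, plogis(lp))     # observed binary outcome
p  <- plogis(2 * lp + 1)             # a model's predicted probabilities

mean(outer(p[y == 1], p[y == 0], ">"))        # discrimination: C-statistic
coef(glm(y ~ qlogis(p), family = binomial))   # calibration: intercept near 0 and slope near 1 would be ideal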
... with that in mind ...
Which of the following statements is true?
A. It’s possible to have good discrimination but poor calibration.
B. It’s possible to have good calibration but poor discrimination.
The DeepMind team (now “Google Health”) developed a model to “continuously predict” AKI within a 48-hr window with an AUC of 92% in a VA population, published in @nature.
Did DeepMind do the impossible? What can we learn from this? A step-by-step guide.
The 2016 @CJASN paper used logistic regression and the 2018 paper used GBMs.
The 2016 CJASN paper is particularly relevant because it was also modeled on a national VA population. Altho the two papers used different modeling approaches, one key similarity is in how the data are prepared: using a discrete time survival method.
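Rough idea of that data prep in #rstats (a made-up toy, not the papers’ actual code): expand each patient into one row per time window at risk, with an indicator for whether the event occurred in that window, then model it with ordinary regression.

# Discrete-time survival setup: person-period (long) format.
patients <- data.frame(id = 1:3,
                       last_day = c(4, 3, 2),   # day of AKI or end of follow-up
                       had_aki  = c(TRUE, FALSE, TRUE))

person_period <- do.call(rbind, lapply(seq_len(nrow(patients)), function(i) {
  p <- patients[i, ]
  data.frame(id    = p$id,
             day   = seq_len(p$last_day),
             event = as.integer(p$had_aki & seq_len(p$last_day) == p$last_day))
}))
person_period
# A logistic regression on this long format, e.g. glm(event ~ day + covariates),
# is a discrete-time survival model.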
I recently had a manuscript accepted that was outlined, written, and revised entirely in Google Docs. Am writing this thread to address Qs I had during the process in case they are of use to others.
The elephant in the room: how to handle references?
This one is easy: use the PaperPile add-on for Google Docs (not the Chrome extension). It’s free and handles citation styles for pretty much every journal.
1. Search for the paper
2. Click “Cite”
3. Click “Update bib” to format the bibliography
Do you have a collaborator who likes to make edits in Word?
1. Save the GDoc as a .docx file and share
2. Reimport the revised Word doc with tracked changes into GDocs (as a new file)
3. Accept/reject suggestions in GDocs
Citations still work! (bc PaperPile refs are roundtrip-safe)