I had the opportunity to work with @AndrewLBeam and @bnallamo on an editorial sharing views on the reporting of ML models in this month’s @CircOutcomes. First, read the guidelines by Stevens et al. Our editorial addresses what they can and can’t fix.

ahajournals.org/doi/10.1161/CI…
Everyone (including Stevens et al.) acknowledges that larger efforts to address this problem are underway, including @TRIPODStatement’s TRIPOD-ML and CONSORT-AI for trials.

The elephant in the room is: What is clinical ML and what is not? (an age-old debate)
What’s in a name clearly matters. Calling something “statistics” vs “ML” affects how the work is viewed (especially by clinical readers) and (unfairly) influences editorial decisions.

How can readers focus on the methods when they can’t even place those methods in a clear taxonomy?
Even the use of frequentist statistics doesn’t make a method “stats.” Conditional inference trees are an ML method in which the p-value threshold is a tuning parameter!
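For instance, a minimal sketch (assuming the partykit package, which the thread doesn’t name):

library(partykit)

# In ctree, a split must pass a significance test; mincriterion = 1 - alpha.
# Tightening alpha grows a smaller tree, so the p-value threshold acts as a
# complexity knob, not an inferential claim.
fit <- ctree(Species ~ ., data = iris,
             control = ctree_control(mincriterion = 0.99))  # alpha = 0.01
plot(fit)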

Our proposal? Don’t call things “ML” in the title. Just call them what they are, e.g., lasso regression, or don’t call them anything.
Wait, what? Don’t call them anything in the title?!

Think about it.

How many association studies do you read without the word “logistic regression” in the title? Tons!

So you can say “prediction model” in the title and name the method in the Methods.
But if the method is part of the innovation, name it in the title.

Calling a penalized regression model an “ML model” will draw the ire of statistical readers. Calling a single decision tree anything but a “decision tree” will draw the ire of everyone!

Call it what it is.
A natural question that Stevens et al. recommend addressing is why you chose the approach you did. It’s always good to provide a reason...

BUT:

Your rationale will basically need to boil down to your expectation of signal-to-noise ratio and data quality/volume. Tread carefully.
Adhering to reporting guidelines is extremely important.

Why?

It’s the only way to catch bad science. The problem with bad science in prediction modeling is that many of the common errors inflate apparent performance.

Mo’ performance = mo’ higher impact journal = mo’ problems.
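One hypothetical illustration (mine, not the editorial’s) of how a common error inflates apparent performance: screening features against the outcome before cross-validation leaks information, so even pure-noise predictors look discriminative.

set.seed(1)
n <- 100; p <- 1000
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, 0.5)  # outcome independent of every predictor

# Leaky step: pick the 10 predictors most correlated with y using ALL rows
top <- order(abs(cor(x, y)), decreasing = TRUE)[1:10]

# "Honest-looking" 5-fold cross-validation on the pre-screened predictors
folds <- sample(rep(1:5, length.out = n))
pred <- numeric(n)
for (k in 1:5) {
  fit <- glm(y[folds != k] ~ x[folds != k, top], family = binomial)
  pred[folds == k] <- cbind(1, x[folds == k, top]) %*% coef(fit)
}

# Concordance (AUC): the truth is 0.5, but leakage pushes this well above it
mean(outer(pred[y == 1], pred[y == 0], ">"))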
Recently, @DrLukeOR put together a flamethrower of a thread arguing against the sharing of models. I disagree. While sharing models shouldn’t be a prerequisite to publication, the current state is pretty pathetic: very few ML models are shared.

There are LOTS of ways to share! 👇
Recently a collaborator at another institution sent us a set of pretrained models in #rstats .rds format. It was a beautiful moment. It almost brought a tear to my eye.

Those types of moments should be common, not rare!
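For anyone who wants to make such moments common, a minimal sketch (the toy model and file name are mine):

# Developer's side: serialize the fitted model object to a single file
fit <- glm(am ~ mpg + wt, data = mtcars, family = binomial)
saveRDS(fit, "am_model.rds")

# Collaborator's side: load it and generate predictions on new data
fit2 <- readRDS("am_model.rds")
predict(fit2, newdata = mtcars[1:3, ], type = "response")

One caveat: a saved glm object carries its model frame (and thus training data) along with it; if the data are sensitive, strip those components or share only the coefficients and preprocessing recipe.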
So, in sum, we advocate for:
- Clear reporting
- Calling methods by name
- Sharing models and not just code
- Making sure people with appropriate methods expertise participate in editorial decisions

In other words: doing ML in a way that can maximally help patients.


More from @kdpsinghlab

28 Aug
The presenter puts up a slide showing “random forest variable importance.” You know the one...

The sideways bar plot.

Says “only showing the top 20 variables here...” to highlight the high-dimensional power of random forests.

The slide is awkwardly wide-screen. Everyone squints.
A clinician in the front row exclaims, “Wow, that makes so much sense!”

Silence.

Then someone asks, “What do the lengths of the bars mean?”

The presenter starts to answer when someone else butts in, “Does the fact that they are pointing the same direction mean anything?”
The audience stares at the presenter expectantly. There will be a deep explanation.

“The bar length is relative... the amplitude doesn’t have any interpretable meaning. But look at the top 3 variables. Ain’t that something?”

The clinicians exhale and whisper in reverence.
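For reference, the plot in question takes a few lines to make (a sketch assuming the randomForest package):

library(randomForest)
set.seed(42)
rf <- randomForest(Species ~ ., data = iris, importance = TRUE)
varImpPlot(rf, n.var = 4)  # the sideways bar plot; bar lengths are relative
importance(rf)             # mean decrease in accuracy / Gini, no clinical units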
24 Jul
Since my TL is filled with love letters to regression, let's talk about the beauty of random forests. Now maybe you don't like random forests, or don't use them, or are afraid of using them due to copyright infringement.

Let's play a game of: It's Just a Random Forest (IJARF).
Decision trees: random forest or not?

Definitely a single-tree random forest with mtry set to the number of predictors and pruning enabled.

IJARF.
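In code, the joke mostly holds up (a playful sketch using the randomForest package; note that randomForest has no pruning option, so that part stays hypothetical):

library(randomForest)
p <- ncol(iris) - 1  # number of predictors
single_tree <- randomForest(Species ~ ., data = iris,
                            ntree = 1,              # one tree
                            mtry = p,               # all predictors per split
                            replace = FALSE,        # no bootstrap...
                            sampsize = nrow(iris))  # ...use every row once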
Boosted decision trees: random forest or not?

Well, if you weight the random forest trees differently, keep their depth shallow, maximize mtry, and grow them sequentially to minimize residual error, then a GBDT is just a type of random forest.

IJARF.
29 Apr
PREPRINT: "Validating a Widely Implemented Deterioration Index Model Among Hospitalized COVID-19 Patients"

Manuscript: medrxiv.org/content/10.110…

Code: github.com/ml4lhs/edi_val…

Why did we do this? What does it mean? Is the Epic deterioration index useful in COVID-19? (Thread)
Shortly after the first @umichmedicine COVID-19 patient was admitted in March, we saw rapid growth in the # of admitted patients with COVID-19. A COVID-19-specific unit was opened (uofmhealth.org/news/archive/2…) and projections of hospital capacity looked dire:
.@umichmedicine was considering opening a field hospital (michigandaily.com/section/news-b…) and a very real question arose of how we would make decisions about which COVID-19 patients would be appropriate for transfer to a field hospital. Ideally, such patients would not require ICU care.
Read 40 tweets
21 Oct 19
I’ll be giving a talk on implementing predictive models at @HDAA_Official on Oct 23 in Ann Arbor. Here’s the Twitter version.

Model developers have been taught to carefully think thru development/validation/calibration. This talk is not about that. It’s about what comes after...
But before we move onto implementation, let’s think thru what model discrimination and calibration are:

- discrimination: how well can you distinguish higher from lower risk people?

- calibration: how close are the predicted probabilities to reality?

... with that in mind ...
Which of the following statements is true?

A. It’s possible to have good discrimination but poor calibration.

B. It’s possible to have good calibration but poor discrimination.
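Both, as it turns out. A quick sketch of statement A with toy data (mine, not from the talk): shrinking predicted probabilities preserves ranking, so discrimination survives while calibration breaks.

set.seed(7)
lp <- rnorm(500)                  # true linear predictor
y  <- rbinom(500, 1, plogis(lp))  # outcomes generated from it
p_true   <- plogis(lp)            # well-calibrated probabilities
p_shrunk <- plogis(lp / 3)        # same ranking, badly shrunken

# Discrimination (concordance/AUC): identical, because ranking is unchanged
mean(outer(p_true[y == 1],   p_true[y == 0],   ">"))
mean(outer(p_shrunk[y == 1], p_shrunk[y == 0], ">"))

# Calibration slope: regress the outcome on the logit of the predictions.
# A slope of 1 means well calibrated; for p_shrunk it comes out near 3.
glm(y ~ qlogis(p_shrunk), family = binomial)

And for statement B: predicting the overall event rate for every patient is perfectly calibrated in the large, yet its AUC is 0.5.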
24 Sep 19
The DeepMind team (now “Google Health”) developed a model to “continuously predict” AKI within a 48-hr window with an AUC of 92% in a VA population, published in @nature.

Did DeepMind do the impossible? What can we learn from this? A step-by-step guide.

nature.com/articles/s4158…
To understand this work’s contribution, it’s first useful to know what was previously the state of the art. I would point to two of @jaykoyner’s papers.

cjasn.asnjournals.org/content/11/11/… and
insights.ovid.com/crossref?an=00…

The 2016 @CJASN paper used logistic regression and the 2018 paper used GBMs.
The 2016 CJASN paper is particularly relevant because it was also developed in a national VA population. Although the two papers used different modeling approaches, one key similarity is in how the data are prepared: using a discrete-time survival method.

What the heck is that?
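In brief: expand each patient into one row per time window, with the outcome marking whether the event occurred in that window. A toy sketch (hypothetical columns, not the papers' actual pipeline):

pts <- data.frame(id = c(1, 2), last_window = c(3, 2), event = c(1, 0))

# One row per (patient, window); y = 1 only in the window where the event hits
person_period <- do.call(rbind, lapply(seq_len(nrow(pts)), function(i) {
  w <- seq_len(pts$last_window[i])
  data.frame(id = pts$id[i], time = w,
             y = as.integer(w == pts$last_window[i] & pts$event[i] == 1))
}))
person_period

# A logistic regression (or GBM) fit to rows like these estimates the
# discrete hazard: P(event in window t | still at risk at start of t).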
3 Jan 19
I recently had a manuscript accepted that was outlined, written, and revised entirely in Google Docs. Am writing this thread to address Qs I had during the process in case they are of use to others.

#AcademicTwitter
The elephant in the room: how to handle references?

This one is easy: use the PaperPile add-on for Google Docs (not the Chrome extension). It’s free and handles citation styles for pretty much every journal.

1. Search paper
2. Click “Cite”
3. Click “Update bib” to format
Do you have a collaborator who likes to make edits in Word?

1. Save the GDoc as a .docx file and share
2. Reimport revised Word doc with tracked changes into GDocs (as new file)
3. Accept/reject suggestions in GDocs

Citations still work! (because PaperPile references are roundtrip-safe)
