Interesting paper: a guide to reading machine learning articles in @JAMA_current: jamanetwork.com/journals/jama/…

ICYI, I have a couple of thoughts to share about this paper

1/n
TL;DR: overall, I think this article is quite a useful beginner's guide

To start, I like the explanation of the terminology and concepts. Nice use of text boxes, imo

2/n
I particularly like the attention to calibration in addition to discrimination performance, and to the importance of continued testing and updating of algorithms. Algorithms are indeed high maintenance; a single “validation study” generally won’t do (quick sketch of discrimination vs calibration below)

3/n
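(Not from the paper, just my own minimal sketch with simulated data and made-up variable names: discrimination checked with the c-statistic/AUC, calibration with the intercept and slope from regressing the observed outcomes on the logit of the predicted risks.)

```python
# My own minimal sketch (not from the JAMA paper): discrimination vs calibration
# for a binary risk model. y = 0/1 outcomes, p = predicted probabilities
# (both simulated here; in practice they come from your validation data).
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
p = rng.uniform(0.05, 0.95, size=1000)   # hypothetical predicted risks
y = rng.binomial(1, 0.7 * p)             # outcomes drawn from miscalibrated risks

# Discrimination: c-statistic (AUC) -- can the model rank cases above non-cases?
auc = roc_auc_score(y, p)

# Calibration: logistic regression of the outcome on the logit of the predictions.
# Intercept near 0 and slope near 1 indicate good calibration.
logit_p = np.log(p / (1 - p))
fit = sm.GLM(y, sm.add_constant(logit_p), family=sm.families.Binomial()).fit()
intercept, slope = fit.params

print(f"AUC = {auc:.2f}, calibration intercept = {intercept:.2f}, slope = {slope:.2f}")
```
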
Also like the reference to the @TRIPODStatement reporting guideline. Reading (and peer review) of machine learning articles would be so much more pleasant if people would just stick to the reporting guidelines

equator-network.org/reporting-guid…

4/n
ICYI: the TRIPOD guideline will be updated soon for more specific guidance on reporting of machine learning algorithms

thelancet.com/journals/lance…

5/n
I’ll choose to ignore the fact that logistic regression is called a “simpler machine learning system” :)

6/n
There is an editorial accompanying the article that points out a few interesting things

jamanetwork.com/journals/jama/…

7/n
As the editorial also points out, the article focusses almost completely on deep learning of medical images. Surprising? It is arguably the most promising area of machine learning applications in medicine, but certainly not the only one

8/n
The attention to medical imaging and deep learning may have something to do with the authors' interests (all are affiliated with Google Health), but a somewhat broader perspective would have been useful for a beginner's guide to machine learning, imho

9/n
The section “How Much Data Are Required for Recent Machine Learning Methods?” is the part of the article that I don’t particularly like

10/n
Again a reference to the 5 to 10 events per variable rule of thumb for logistic regression (illustrated below). Ugh. More info and references here: discourse.datamethods.org/t/reference-co…

11/n
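(For readers who haven’t met it: the rule is just arithmetic, e.g. 20 candidate parameters × 10 events per variable = 200 events. The numbers below are made up, purely to illustrate the rule being criticized.)

```python
# Purely illustrative arithmetic for the events-per-variable (EPV) rule of thumb;
# the numbers are made up, and the thread argues the rule itself is too crude.
candidate_parameters = 20        # number of candidate predictor parameters
events_per_variable = 10         # the "10 EPV" rule of thumb
events_needed = candidate_parameters * events_per_variable
print(events_needed)             # 200 events of the rarer outcome class
```
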
While I agree with the authors that deep learning algorithms with millions of parameters probably do not need tens of millions of events to become useful, the suggestion that regularization will save the day seems a bit optimistic (see the sketch below)

12/n
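(Again my own sketch, not the paper’s code: what “regularization” typically looks like here, ridge-penalized logistic regression with the penalty strength tuned by cross-validation on a small, imbalanced synthetic dataset. The dataset sizes and settings are made up.)

```python
# My own sketch (not the paper's code): L2-penalized ("ridge") logistic regression
# with the penalty strength tuned by cross-validation. With few events, the tuned
# penalty itself is estimated noisily, so regularization alone is no guarantee
# of a reliable model.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV

# Small, imbalanced synthetic dataset: ~20 events among 200 patients, 30 predictors.
X, y = make_classification(n_samples=200, n_features=30, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)

model = LogisticRegressionCV(Cs=10, cv=5, penalty="l2",
                             scoring="neg_log_loss", max_iter=5000)
model.fit(X, y)
print("chosen C (inverse penalty strength):", model.C_[0])
```
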
Regularization may in fact not save the day when you need it most: arxiv.org/abs/1907.11493

13/n
To the authors' credit, the article does state that tens of thousands of images may be required for a deep learning algorithm to do well!

But do we really know how much data we need for developing and validating reliable complex algorithms? Probably not

14/n
Thanks to the people who pointed me to this article today, including @boback and @ADAlthousePhD

/end