This discussion was getting long, so I thought I'd lay out my thoughts on a common argument: should models produce probabilities or decisions? I.e. "32% chance of cancer" vs "do a biopsy".

I favour the latter, because IMO it is both more useful and... more honest. IMO:

1/13
The argument against using a threshold to determine an action, at a basic level, seems to be:

1) you shouldn't discard information by turning a range of probabilities into a binary
2) probabilities are more useful at the clinical coalface

2/13
Re: 1.

No model discards information. The continuous output score always exists. It is how you make use of that information at point of care that "changes".

I use airquotes around "changes", because this is a ... false dichotomy 😆

3/13
Model outputs for users are *always* discrete. With a probability from 1-100, you have chosen 100 discrete levels. With one decimal point, you have 1000 discrete levels.

The question isn't discrete vs not, but "how many different decisions could a clinician make?"

4/13
Imagine a model used to decide if a patient is a high risk covid case. The model will guide treatment plans.

So how many treatment plans are there?

Do we think there are 1000? 100?

In almost all clinical tasks, there are 2. Aggressive vs conservative therapy.
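A rough sketch of the counting argument, in Python. The 20% cut-off and the two action names are placeholders made up for illustration, not values from any real model: however many levels the display shows, the rule maps them onto the small set of actions actually available.

```python
def displayed_levels(decimals: int) -> int:
    """Number of distinct percentages in (0, 100] at a given display precision."""
    return 100 * 10 ** decimals


def action(percent: float, cutoff: float = 20.0) -> str:
    """Hypothetical two-action rule; the cutoff is an illustrative assumption."""
    return "aggressive" if percent >= cutoff else "conservative"


levels_0dp = displayed_levels(0)  # whole percentages: 1, 2, ..., 100
levels_1dp = displayed_levels(1)  # one decimal place: 0.1, 0.2, ..., 100.0

# Every one of the 1000 one-decimal levels still lands on one of two actions.
distinct_actions = {action(p / 10) for p in range(1, 1001)}

print(levels_0dp, levels_1dp, sorted(distinct_actions))
```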

5/13
So the question changes. If medical decision making inherently involves "discarding information" to make binary choices, the question isn't "dichotomise or not", but rather *who* should dichotomise.

Should it be the clinician user or the model developer?

6/13
I get why many statisticians think it should be the user. They are domain experts, with access to other information about the patient.

But this is wrong. The end user is very poorly placed to make this choice.

7/13
Users/humans are *terrible* at working with probabilities.

They can't estimate their own decision probabilities (ie humans are not calibrated decision makers in the slightest), and are also bad at valuing new information.

Lots of papers on this. Eg bmjopen.bmj.com/content/5/7/e0…

8/13
Are developers better?

Of course they are! They have a big dataset, so instead of relying on the variable experience and quantification abilities of end users, developers get to use evidence!

A developer chosen threshold is based on data. A user threshold is based on 🤪
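One way a data-based threshold could be chosen, as a minimal Python sketch. Everything here is an assumption for illustration: the toy scores and labels are made up, the misclassification costs (a missed case counted as 5x worse than a false alarm) are arbitrary, and minimising average cost is just one of several reasonable criteria.

```python
from typing import Sequence


def expected_cost(scores: Sequence[float], labels: Sequence[int],
                  threshold: float, cost_fp: float, cost_fn: float) -> float:
    """Average misclassification cost when score >= threshold means 'act'."""
    total = 0.0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 0:
            total += cost_fp      # acted on a patient without the condition
        elif not flagged and label == 1:
            total += cost_fn      # missed a patient with the condition
    return total / len(scores)


def choose_threshold(scores: Sequence[float], labels: Sequence[int],
                     cost_fp: float = 1.0, cost_fn: float = 5.0) -> float:
    """Pick the candidate threshold with the lowest average cost on the data."""
    # Candidates: every observed score, plus "never act" (infinity).
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates,
               key=lambda t: expected_cost(scores, labels, t, cost_fp, cost_fn))


# Toy validation set, made up for illustration only.
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

threshold = choose_threshold(scores, labels)
print(threshold)
```

The chosen cut-off moves when the costs change, which is the point: the threshold encodes an explicit, inspectable trade-off rather than a gut feeling.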

9/13
PS please don't argue developers make bad choices too. Of course many model developers don't know what they are doing. But *all* clinicians are bad at this.

Clinicians are good at synthesising binary data! Chest pain + troponin > X = heart attack. Each element is binary!

10/13
This leads on to argument 2: that probabilities are more useful.

Not at all! No human can balance a 30% chance of cancer vs a 32% chance of cancer. This is #TMI.

Even in shared decision making, most patients prefer terms like "rare" and "almost certainly" vs 3% or 95%.

11/13
If we look at clinically useful algorithms in widespread use, they are all dichotomous. The Wells criteria and Ottawa ankle rules say "image vs not".

Framingham has 3 risk categories. QRISK >10% is treat with statins. ASPECTS = "treatment contraindicated or not".

12/13
It is undeniable that common models dichotomise *before the clinician*. This doesn't change with more complicated machine learning models.

And like these simple risk models, we can always provide probabilities post-hoc, like below.

It is just that clinicians won't care.

13/13
PS in the above model, a score >=3 gets you a CT scan.
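A points-based rule of this kind can be sketched in a few lines of Python. Only the ">=3" cut-off comes from the thread; the item names and point values below are placeholders, since the scoring items of the model referred to are not reproduced here.

```python
from typing import Mapping


def total_score(points: Mapping[str, int]) -> int:
    """Sum the points assigned to each scored item."""
    return sum(points.values())


def recommend_ct(score: int, cutoff: int = 3) -> bool:
    """Dichotomise the score: at or above the cutoff means 'scan'."""
    return score >= cutoff


# Placeholder items and points, hypothetical and for illustration only.
patient_points = {"item_a": 2, "item_b": 1, "item_c": 0}

score = total_score(patient_points)
print(score, recommend_ct(score))
```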

I suppose I should tag the relevant folks #epitwitter 😁
Keep Current with Luke Oakden-Rayner