We've put out a preprint reporting concerning findings. AI can do something humans can't: recognise the self-reported race of patients from x-rays. This gives AI a direct path to producing health disparities.
This is a big deal, so we wanted to do it right. We ran dozens of experiments, with replication at multiple labs, across numerous datasets and tasks.
We are releasing all the code, as well as new racial identity labels for multiple public datasets.
2/8
Humans can't identify a patient's race from these images any better than chance, but AI performs absurdly well on the task. As you can see here, AUC scores are in the high 90s, and they hold up under external validation on completely distinct datasets and across multiple different imaging tasks.
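For anyone who wants to poke at what "external validation" means here, a minimal self-contained sketch (toy tabular data standing in for x-rays; every name and number is made up, this is not our pipeline):

```python
# A hedged sketch of external validation: a model trained at one site is
# scored, unchanged, on a completely distinct dataset. Toy data throughout.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

def make_dataset(n, shift=0.0):
    """Binary-labelled toy data; `shift` mimics the distribution change
    between the development site and an external site."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(loc=y[:, None] * 1.5 + shift, scale=1.0, size=(n, 8))
    return X, y

X_train, y_train = make_dataset(2000)
X_int, y_int = make_dataset(500)             # held-out internal test set
X_ext, y_ext = make_dataset(500, shift=0.3)  # completely distinct "dataset"

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
for name, X, y in [("internal", X_int, y_int), ("external", X_ext, y_ext)]:
    print(f"{name} AUC: {roc_auc_score(y, model.predict_proba(X)[:, 1]):.3f}")
```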
3/8
We performed many experiments to work out how it does this, but couldn't pin it down.
This is the most ridiculous figure I have ever seen. AI can detect race from images filtered so heavily they are just blank grey squares! (see the bottom right of the figure)
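For the curious, a hedged sketch of what "filtered so heavily" means — Gaussian low-pass filtering at increasing strength (the sigma values are illustrative assumptions, not our exact settings):

```python
# A minimal sketch of the low-pass filtering experiment: blur the image
# until it is effectively a blank grey square, then classify it anyway.
import numpy as np
from scipy.ndimage import gaussian_filter

xray = np.random.rand(224, 224).astype(np.float32)  # stand-in for a real x-ray

for sigma in (0, 2, 10, 50, 100):  # increasing low-pass filter strength
    blurred = gaussian_filter(xray, sigma=sigma)
    # By sigma=100 almost all spatial detail is gone, yet models in the paper
    # still predicted race well above chance on images degraded like this.
    print(f"sigma={sigma:3d}  remaining pixel variation: {blurred.std():.4f}")
```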
4/8
Here is the direct link to the paper, and the first tweet of this thread links to my blog post, which acts as a companion piece: it explains why we think this is bad and what we think needs to happen.
5/8
We are ringing alarm bells here. AI systems trained on standard medical datasets appear to learn to recognise race *by default* and we don't have the capabilities to detect the resulting bias in practice, or to mitigate it technologically.
6/8
With dozens of AI systems already on the market trained on data exactly like this, we are calling for urgent testing of performance across racial groups. We already know that medical practice is biased, but AI can embed health disparities at scale and cause serious harm.
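To make the ask concrete, a minimal sketch (toy arrays, hypothetical groups, scores rigged so the gap is visible — not real data) of the kind of subgroup reporting we mean, per-group metrics instead of one pooled number:

```python
# A hedged sketch of the subgroup testing we're calling for: report AUC per
# self-reported race rather than a single pooled figure.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 300
groups = rng.choice(["Asian", "Black", "White"], size=n)
y_true = rng.integers(0, 2, size=n)
noise = np.where(groups == "Black", 1.5, 0.6)  # one group gets noisier scores
y_score = y_true + rng.normal(0.0, 1.0, size=n) * noise

for g in np.unique(groups):
    m = groups == g
    print(f"{g:>5}: AUC = {roc_auc_score(y_true[m], y_score[m]):.3f}")
```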
7/8
Finally, this is a preprint; it hasn't been peer reviewed. We are looking for feedback and suggestions for next steps. This work is important, but we still don't fully understand what is happening here, and that isn't good enough. Anyone interested in helping out, let us know!
8/8
I want to acknowledge everyone but don't have Twitter handles for most (do y'all seriously not have them 😂😂).
PS This was a "flat" research collaboration without hierarchy, so authors are listed alphabetically. There is no first author or last author! Take that #Academia!
Hi everyone. This thread has been swamped by racists. I'm probably gonna miss your replies, but I'll still be here in a few days when they move on or you can reach out through other channels.
We appreciate all the wonderful support we've received from the community 🤩
If anyone didn't get what I meant when I said @ykilcher chose to "kick the hornets' nest", or if anyone was wondering about the cost of speaking out against unethical behaviour in #AI, here's a little summary of my recent Twitter feed.
Here's some more. If anyone doesn't understand why all these statements are explicitly transphobic ... well, it is because you don't face it. These are all extremely hurtful.
That's enough, but I've skipped all the misogyny, racism, anti-semitism etc.
This isn't isolated. Another commentator on this stunt has needed to take some time off Twitter due to the reaction.
I honestly spent several days deciding whether to post about this, because I knew what would happen. It was important, but there is always a cost.
This week an #AI model was released on @huggingface that produces harmful + discriminatory text and has already posted over 30k vile comments online (says its author).
This experiment would never pass a human research #ethics board. Here are my recommendations.
@huggingface as the model custodian (an interesting new concept) should implement an #ethics review process to determine the harm hosted models may cause, and gate harmful models behind approval/usage agreements.
Medical research has functional models for this, e.g. for data sharing.
2/7
Open science and software are wonderful principles but must be balanced against potential harm. Medical research has a strong ethics culture bc we have an awful history of causing harm to people, usually from disempowered groups.
Very excited to have 2 new papers in press today in Lancet Digital Health, alongside an editorial from the journal highlighting our work.
I am immensely proud of the work we have done here and honestly think this is the most important work I have been involved in to date 🥳
1/7
#Medical #AI has a problem. Preclinical testing, including regulatory testing, does not accurately predict the risks that AI models pose once they are deployed in clinics.
I've written about this before in my blog, for example in:
Docs are ROCs: A simple fix for a methodologically indefensible practice in medical AI studies.
Widely used methods to compare doctors to #AI models systematically underestimate doctors, making the AI look better than it is! We propose a solution.
The most common method to estimate average human performance in #medical AI is to average sensitivity and specificity as if they are independent. They aren't though - they are inversely correlated along a curve.
The averaged operating point will *always* fall inside the curve.
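It's easier to see in numbers. A minimal sketch (synthetic readers on a shared binormal ROC curve; the d' and thresholds are made up, not study data): because the curve is concave, Jensen's inequality pushes the naive average inside it whenever readers use different thresholds.

```python
# A hedged sketch of why averaging sensitivity and specificity underestimates
# doctors. All readers sit on ONE binormal ROC curve; only thresholds differ.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
d_prime = 1.5                                 # shared diagnostic skill
thresholds = rng.uniform(-1.0, 2.0, size=10)  # each doctor's own cut-off

sens = 1 - norm.cdf(thresholds - d_prime)     # per-doctor sensitivity
spec = norm.cdf(thresholds)                   # per-doctor specificity

avg_sens, avg_spec = sens.mean(), spec.mean()
# Sensitivity the *curve itself* achieves at the averaged specificity:
curve_sens = 1 - norm.cdf(norm.ppf(avg_spec) - d_prime)
print(f"naive average:      sens={avg_sens:.3f} at spec={avg_spec:.3f}")
print(f"curve at same spec: sens={curve_sens:.3f}  # always >= naive average")
```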
2/7
The only solution currently is to force doctors to rate images using confidence scores. While this works well in the few tasks where these scales are used in clinical practice, what does it mean to say you are 6/10 confident that there is a lung nodule?
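For illustration, a hedged sketch (toy labels and ratings, not study data) of how ordinal confidence scores let you trace a single reader's ROC curve instead of a single operating point:

```python
# Treating 1-10 confidence ratings as an ordinal score yields one ROC
# operating point per rating cut-off, and hence a per-reader curve.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1, 0, 1])    # 1 = nodule present
ratings = np.array([2, 5, 6, 1, 9, 7, 4, 8, 3, 10])  # reader's 1-10 scores

fpr, tpr, cuts = roc_curve(y_true, ratings)  # one point per rating cut-off
print(f"reader AUC from confidence scores: {roc_auc_score(y_true, ratings):.2f}")
```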
Alright, let's do this one last time. Predictions vs probabilities. What should we give doctors when we use #AI / #ML models for decision making or decision support?
This discussion was getting long, so I thought I'd lay out my thoughts on a common argument: should models produce probabilities or decisions? I.e. 32% chance of cancer vs "do a biopsy".
I favour the latter, because it is both more useful and... more honest. IMO: