@emilymbender.bsky.social

@emilymbender

Oct 16, 2022 • 26 tweets • 7 min read

Hi folks -- time for another #AIhype takedown + analysis of how the journalistic coverage relates to the underlying paper. The headline for today's lesson:

fiercebiotech.com/medtech/ai-spo…

/1

At first glance, this headline seems to be claiming that from text messages (whose? accessed how?) an "AI" can detect mental health issues as well as human psychiatrists do (how? based on what data?).

/2

Let's pause to once again note that using "AI" in this way suggests that "artificial intelligence" is a thing that exists. Always useful to replace that term with "mathy math" or SALAMI for a reality check.

/3

Mathy math:

https://twitter.com/emilymbender/status/1576941980239953920?s=20&t=KTgL_Xom2AQ8TU2K9dxQGQ

SALAMI:

https://twitter.com/emilymbender/status/1485067239023775744?s=20&t=KTgL_Xom2AQ8TU2K9dxQGQ

Okay, back to the article. Odd choice here to start off with a callout to the Terminator, which is carefully denied but still alluded to. Also odd to refer to what is (as we'll see) a text classification system as a "robot", again strengthening the allusion to the Terminator.

/6

Alright, so what does "potential signs of worsening mental illness" mean, and what were these text messages, and how did they get them? Time to go look at the underlying article.

/7

The study that the article actually reports on involves text messages (collected under informed consent) between patients and their therapists which were annotated by two "clinically trained annotators" for whether or not they reflect cognitive distortion (& of what type).

/8

Source: ps.psychiatryonline.org/doi/10.1176/ap…

Then they trained a few different text classification algorithms on that annotated dataset and measured how well they did at replicating the labels on a portion held out as a test set.

/10
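To make "held out as a test set" concrete, here is a minimal sketch of that evaluation setup. Everything in it is an assumption for illustration: the messages and labels are invented toy data (not from the study), the word-count scorer is a stand-in for whatever classifiers the authors actually trained, and the function names are hypothetical.

```python
from collections import Counter

# Invented toy data: (message, label) pairs. NOT from the study.
LABELED = [
    ("i always ruin everything", "distortion"),
    ("nobody will ever like me", "distortion"),
    ("i never do anything right", "distortion"),
    ("everything is always my fault", "distortion"),
    ("see you at three on tuesday", "none"),
    ("thanks for the session today", "none"),
    ("can we move the appointment", "none"),
    ("thanks for the walk today", "none"),
]


def split(data, test_every=4):
    """Hold out every `test_every`-th example as the test set."""
    train = [ex for i, ex in enumerate(data) if i % test_every != test_every - 1]
    test = [ex for i, ex in enumerate(data) if i % test_every == test_every - 1]
    return train, test


def train_word_counts(train):
    """'Training' here is just counting which words occur with which label."""
    counts = {}
    for text, label in train:
        counts.setdefault(label, Counter()).update(text.split())
    return counts


def predict(counts, text):
    """Pick the label whose training messages shared the most words with `text`."""
    scores = {label: sum(c[w] for w in text.split()) for label, c in counts.items()}
    return max(sorted(scores), key=scores.get)


def accuracy(counts, test):
    """Fraction of held-out annotations the classifier reproduces."""
    correct = sum(predict(counts, text) == label for text, label in test)
    return correct / len(test)


train, test = split(LABELED)
model = train_word_counts(train)
print(f"replicated {accuracy(model, test):.0%} of held-out annotator labels")
```

Note what the score measures: how often the system reproduces the annotators' labels on messages it was not trained on. It says nothing about whether those labels are themselves right.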

You could well be forgiven for reading "identify potential signs of worsening mental illness" as "detect worsening mental illness" and the "on par with human psychiatrists" in the headline & this paragraph as meaning on par with what human psychiatrists do when diagnosing patients.

/11

But no: What this study did was have two annotators annotate text messages from patients they were not treating, measure their agreement with each other (a not very impressive κ=0.51) and then measure how well the text classifiers could replicate those annotations.

/12
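For readers who haven't met κ before: it is agreement between two annotators corrected for the agreement they would reach by chance. The sketch below assumes the standard two-rater Cohen's kappa, κ = (p_o − p_e) / (1 − p_e); the thread only says "κ", so treat the choice of statistic as an assumption. The ratings are invented for illustration and the function name is hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability both raters pick the same label if each
    # labelled at random according to their own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Invented ratings (1 = distortion present, 0 = absent). NOT the study's data.
rater_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
rater_b = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
print(round(cohens_kappa(rater_a, rater_b), 2))
```

In this toy case the raters agree on 8 of 10 items, but with a 50/50 label split half of that agreement is expected by chance, so κ works out to 0.6 rather than 0.8. That is why a κ of 0.51 reads as much weaker than the raw agreement rate behind it might suggest.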

As an aside: I don't doubt that the expertise of these annotators (one with a master's degree in psychology and one who is a licensed clinical mental health counselor) is relevant. However, it is still misleading to refer to them as "psychiatrists".

/13

It's a really important detail that the annotators weren't working with text messages from patients they treat.

/14

This means that they were working with very little context: saying a machine does as well as them in this task would seem to have very little bearing on whether it would be appropriate to have a machine do this task.

/15

In other words, we could ask: Under what circumstances would we want to have (even clinically trained) humans screening text messages from people they have no relationship to (therapeutic or otherwise) to try to find such signs?

/16

I'm guessing not many. But in that case, what's the value of using an automated system that has as an upper bound the accuracy of the humans in that case?

/17

Another misleading statement in the article: These were not "everyday text messages" (which suggests, say, friends texting each other) but rather texts between patients and providers (with consent) in a study.

/18

Next, let's compare what the peer-reviewed article has to say about the purpose of this tech with what's in the popular press coverage. The peer-reviewed article says only that this could be something to help clinicians take action.

/19

In the popular press article, on the other hand, we get instead a suggestion of developing surveillance technology that would presumably spy not just on the text messages meant for the clinician, but on everything a patient writes.

/20

Note that in this case, the source of the hype lies not with the journalist but (alas) with one of the study authors.

/21

Another one of the authors comes in with some weird magical thinking about how communication works. Why in the world would text messages (lacking all those extra context clues) be a *more* reliable signal?

/22

In sum: It seems like the researchers here are way overselling what their study did (to the press, but not in the peer-reviewed article) and the press is happily picking it up.

/fin

Coda: @holdspacefree illustrates the importance of reading the funding disclosures. The researchers giving the hype-laden quotes to the media weren't just being naive. They're selling something.

https://twitter.com/holdspacefree/status/1582008401097216000

@holdspacefree This news story started off life as a press release from @UWMedicine's @uwmnewsroom, who I think should also have disclosed the financial COI that was in the underlying study.

newsroom.uw.edu/news/ai-equal-…
