@emilymbender.bsky.social
Prof, Linguistics, UW // Faculty Director, CLMS // she/her // @emilymbender@dair-community.social & bsky // rep by @ianbonaparte

Sep 14, 2020, 10 tweets

This is a really important paper for #NLProc, #ethNLP and #ethicalAI folks.

1/n

The authors look deep into a use case for text that is ungrounded in either the world or any commitment to what's being communicated, but nonetheless fluent, apparently coherent, and of a specified style. You know, exactly #GPT3's specialty.

2/n

What's that use case? The kind of text needed, and apparently needed in quantity, for discussion boards whose purpose is recruitment and entrenchment in extremist ideologies.

3/n

And guess what? They find that #GPT3's trick of "few shot" training is definitely up to this challenge.

4/n

I don’t think GPT-3 could produce text written from the point of view of a conspiracy theorist if it didn’t have such texts among its training data. But, in the spirit of healthy skepticism, if someone wants to explain how it could, I’m curious about your theories. #NLProc

5/n

The next question then, is: how much such data does it need? Are we seeing a reflection of lots of this garbage getting sucked into the maw of the data-hungry algorithm? Or does it only take a little?

6/n

And if it only takes a little, that’s actually much worse, because it’s much harder to design processes that can filter out tiny amounts of this. E.g. would examples quoted in serious articles discussing the threat of online fora like this be enough?

7/n

My takeaway 1: ML systems that rely on datasets too large to actually examine are inherently unsafe. (Quote previous tweet on this.)

8/n

My takeaway 2: This paper shows the immense value of interdisciplinary perspectives in evaluating the potential risks of technology.

9/9

p.s. The paper does talk about #GPT3 having "knowledge" of various conspiracy theories. I think this is a category error, but it does not detract from the point the paper is making. For more on why, though, see aclweb.org/anthology/2020…

cc: @KrisMcguffie @AlexBNewhouse
