Jonathan Stray
Working on better personalized information at @CHAI_Berkeley. Previously @columbiajourn. Made some software, wrote some news. Editor of @better_conflict.
Feb 14 • 13 tweets • 5 min read
Everyone has heard that there are serious problems with optimizing for engagement. But what are the alternatives?
New paper: we gathered folks from eight platforms for an off-the-record chat to share best practices -- and document them from public sources.
arxiv.org/abs/2402.06831
Ranking by predicted engagement causes significantly higher time-spent and retention

That's why everyone does it. And this isn't necessarily bad -- there is real user value here -- but sometimes it goes wrong.
Jan 5, 2022 • 8 tweets • 3 min read
“When journalism begins to accept the death of objectivity, the industry will begin to thrive off relying on organic humanity rather than stiff, rigid and outdated mechanisms.”

I don't think "the death of objectivity" is the right direction. I see better alternatives.

1/x
Why are mainstream journalists gunning for "the death of objectivity"? The short answers:

- it was always a little bit of an incoherent concept
- it was used to exclude marginalized people and perspectives

There are real problems here. But. Reporting is more than opinion.

2/x
Dec 1, 2021 • 9 tweets • 2 min read
Instead of "amplification," here are three alternative ways of talking about what recommender systems do that correspond more closely to how this all works -- and would make better law.

1/x
When I push people on what "amplification" means, we always end up at one of three ideas:

1) Reach
2) Comparison to a chron baseline
3) User intent

2/x
Sep 19, 2021 • 4 tweets • 2 min read
I think we expect far too much of mis/disinfo interventions. Too much of the work in this field rests on the insane implication that we could fix our politics if we just use mass surveillance tools to stop people from saying bad things to each other.
I want to qualify this by saying there really are organized information operations, content farms that churn out "news" that isn't, deceptive fraud and scams, etc. I'm not an absolutist by any means. But removing content isn't going to change e.g. vaccine hesitancy.
Jul 13, 2021 • 18 tweets • 8 min read
I wrote a paper containing everything I know about recommenders, polarization, and conflict, and how these systems could be designed to depolarize.
arxiv.org/abs/2107.04953

Here's my argument in a nutshell -- a THREAD
First, what even is polarization? We all kinda know already because of the experience of living in a divided society, but why do we care? A few big reasons:
- it destroys social bonds
- it escalates through feedback cycles
- it contributes to the destruction of democracy
2/n
Feb 13, 2021 • 7 tweets • 3 min read
I have been studying disinformation since 2011. I am horrified by how the category has expanded from “factually wrong” to “expresses a frame I disagree with.” This is reflected in ML work, where no one looks too closely at where the labels in their training data come from.
In complex categories like “disinformation” it’s quite important to understand the nature of the actual examples being used. Read the papers closely. Go look at the underlying data sets. For most disinfo classifiers, the labels are not even at the article level -- just source level.
Oct 28, 2020 • 14 tweets • 8 min read
If you want to use AI/ML for good, have you considered... form extraction? This is a major difficulty for climate change, human rights, medicine, and journalism, and a state-of-the-art challenge.
A THREAD.
jonathanstray.com/to-apply-ai-fo…
Today I am pleased to announce a new public form extraction benchmark, hosted by @weights_biases. This is a nasty campaign finance data set, the FCC Public Files. We've collected 20,000 labelled PDFs, in hundreds of different formats.
wandb.ai/deepform/polit…
2/
Oct 9, 2020 • 14 tweets • 8 min read
Folks who want to use AI/ML for good generally think of things like predictive models, but actually... smart methods for extracting data from forms would do more for journalism, climate science, medicine, democracy etc. than almost any other application. A THREAD.
1/x
Here's how form extraction could help climate science

2/x
Apr 7, 2020 • 24 tweets • 9 min read
Currently watching the @TowCenter discussion on conservative news. They interviewed many conservative journalists to understand how they see their work, and this is the report.
cjr.org/tow_center/con…

thread.
Here are the questions they wanted to answer from talking to journalists at conservative news outlets.

2/
Dec 12, 2019 • 6 tweets • 4 min read
The AI text adventure: a THREAD about an emerging art form.

ICYMI, I recommend playing AI Dungeon 2 by @nickwalton00, released last week. It's a GPT-2 based text adventure. Rudimentary, a prototype, but I believe it's a harbinger of great things.
theverge.com/tldr/2019/12/6…

1/
The crazy, amazing thing about this experiment is that the AI will respond in (more or less) sensible natural language to anything you can throw at it. This was the promise of "interactive fiction" of old, what we hoped for in the 70s and 80s when it emerged. But..

2/
Nov 19, 2019 • 27 tweets • 32 min read
A THREAD on the divide between the "critical" and the "technological" communities, as a response to the paper by @davidjmoats and @npseaver which explores this topic. I'm posting the bits I highlighted and explaining why each of them struck me.
journals.sagepub.com/doi/full/10.11…
Ready?
It starts with @mathbabedotorg's 2017 editorial. Apparently the critical community (not the best name, but let's go with it for now) was super surprised to hear that they were not critiquing technology. Me, I recognize Cathy's critique very naturally. The problem is...
Oct 18, 2019 • 9 tweets • 4 min read
Ok, a THREAD about doing journalism for a polarized, suspicious public.

Suppose you want to be read and trusted across the political spectrum. What does that mean, and how do you do it?

1/
This is the background context:

Media trust has been falling for decades
knightfoundation.org/reports/indica…

Polarization has been increasing for decades
people-press.org/2014/06/12/pol…

(I'm going to focus on the US today but there are similar trends internationally)

2/
Aug 29, 2019 • 8 tweets • 2 min read
THREAD on a wacky problem that AI ethics theorists should be aware of.

Consider a coin flip bet: Heads you get 1.5 times your current wealth. Tails you get 0.6. Should you take it?

Expected value is 1/2*0.6 + 1/2*1.5 = 1.05 times your wealth. So take it, right?

Not quite... Averaging over many people, it's true that the expected value of wealth increases. But if you watch one person's wealth over time, it decreases!

If you work it out, you get sqrt(0.6*1.5) ~= 0.95 times as much wealth at each time step.

(from ergodicityeconomics.com/lecture-notes/)
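
The arithmetic above is easy to check with a quick simulation. This is a minimal sketch and is not taken from the lecture notes: the function names, the 1,001 simulated people, the 100 rounds, and the random seed are all assumptions picked for illustration. It computes the two per-round factors from the thread (the average over people and the geometric mean over time) and then plays the bet repeatedly for each simulated person.

```python
import math
import random
import statistics

UP, DOWN = 1.5, 0.6  # wealth multipliers for heads and tails, from the thread


def ensemble_growth_factor():
    """Expected one-round multiplier, averaged over many people (about 1.05)."""
    return 0.5 * UP + 0.5 * DOWN


def time_average_growth_factor():
    """Per-round multiplier one person sees over time: sqrt(0.6 * 1.5)."""
    return math.sqrt(UP * DOWN)


def simulate_person(rounds, rng):
    """Play the bet `rounds` times in a row, starting from wealth 1.0."""
    wealth = 1.0
    for _ in range(rounds):
        wealth *= UP if rng.random() < 0.5 else DOWN
    return wealth


def simulate_population(people=1001, rounds=100, seed=0):
    """Final wealth of each simulated person (sizes and seed are arbitrary)."""
    rng = random.Random(seed)
    return [simulate_person(rounds, rng) for _ in range(people)]


final_wealth = simulate_population()
print("ensemble factor per round:", ensemble_growth_factor())
print("time-average factor per round:", round(time_average_growth_factor(), 4))
print("median final wealth:", statistics.median(final_wealth))
print("share who ended below their starting wealth:",
      sum(w < 1.0 for w in final_wealth) / len(final_wealth))
```

With these settings, most simulated people should finish below their starting wealth even though the expected multiplier per round is above 1, which is the gap between the two averages that the thread is pointing at.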