🚨New WP: Can LLMs predict results of social science experiments?🚨
Prior work uses LLMs to simulate survey responses, but can they predict results of social science experiments?
Across 70 studies, we find striking alignment (r = .85) between simulated and observed effects 🧵👇
To evaluate predictive accuracy of LLMs for social science experiments, we used #GPT4 to predict 476 effects from 70 well-powered experiments, including:
➡️50 survey experiments conducted through the NSF-funded TESS program
➡️20 additional replication studies (Coppock et al. 2018)
We prompted the model with (a) demographic profiles drawn from a representative dataset of Americans, and (b) experimental stimuli. The effects estimated by pooling these responses were strongly correlated with the actual experimental effects (r = .85; adj. r = .91)!
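For intuition, here is a minimal Python sketch of that pipeline (not the paper's code: `profiles`, the 1-7 response scale, and the prompt wording are all stand-in assumptions):

```python
# Sketch: simulate one experiment's effect with LLM "participants".
# Assumes `profiles` holds demographic-profile strings and each condition
# has a stimulus text; all names here are hypothetical.
from openai import OpenAI
import numpy as np
from scipy.stats import pearsonr

client = OpenAI()

def simulate_response(profile: str, stimulus: str) -> float:
    """Ask the model to answer as the described respondent (1-7 scale)."""
    prompt = (
        f"You are a survey respondent with this profile:\n{profile}\n\n"
        f"{stimulus}\n\nAnswer with a single number from 1 to 7."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return float(resp.choices[0].message.content.strip())  # no parsing guardrails in this sketch

def simulated_effect(profiles, control_text, treatment_text) -> float:
    """Pool simulated responses per condition and difference the means."""
    control = [simulate_response(p, control_text) for p in profiles]
    treated = [simulate_response(p, treatment_text) for p in profiles]
    return float(np.mean(treated) - np.mean(control))

# With one simulated and one observed value per condition contrast:
# r, _ = pearsonr(simulated_effects, observed_effects)
```

Pooling over many profiles is what turns individual simulated answers into a sample-level effect estimate.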
We also find that predictive accuracy improved across generations of LLMs, with GPT4 surpassing predictions elicited from an online sample (N = 2,659) of Americans.
But what if LLMs are simply retrieving & reproducing known experimental results from training data?
We find evidence against this: analyzing only studies *unpublished* at the time of GPT4’s training data cut-off, we find high predictive accuracy (r = .90, adj. r = .94).
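As a sketch, this cut-off check is just a date filter before recomputing the correlation (hypothetical file and column names; GPT-4's widely reported September 2021 training cut-off is assumed):

```python
# Sketch: restrict to studies GPT-4 could not have seen during training.
import pandas as pd
from scipy.stats import pearsonr

CUTOFF = pd.Timestamp("2021-09-01")  # assumed GPT-4 training cut-off

df = pd.read_csv("effects.csv", parse_dates=["published_date"])  # hypothetical file/columns
holdout = df[df["published_date"] > CUTOFF]  # unpublished at training time

r, p = pearsonr(holdout["simulated_effect"], holdout["observed_effect"])
print(f"holdout r = {r:.2f} (n = {len(holdout)}, p = {p:.3g})")
```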
Important work finds biases in LLM responses resulting from training data inequalities. Do these biases impact accurate prediction of experimental results?
To assess, we compare predictive accuracy for:
➡️women & men
➡️Black & white participants
➡️Democrats & Republicans
Despite known training data inequalities, LLM-derived predictive accuracy was comparable across subgroups.
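A sketch of how such a subgroup check can run (hypothetical file and column names, one row per effect per subgroup):

```python
# Sketch: compare simulated-vs-observed correlations within each subgroup.
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("subgroup_effects.csv")  # hypothetical file

# e.g. subgroups: women/men, Black/white, Democrats/Republicans
for group, sub in df.groupby("subgroup"):
    r, _ = pearsonr(sub["simulated_effect"], sub["observed_effect"])
    print(f"{group}: r = {r:.2f} (n = {len(sub)})")
```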
However, there was little heterogeneity in the experimental effects we studied, so more research is needed to assess if/how LLM predictions of experimental results are biased.
We also evaluated predictive accuracy for “megastudies,” studies comparing the impact of a large number of interventions. Across nine survey and field megastudies, LLM-derived predictions were modestly accurate.
(Notably, accuracy matched or surpassed expert forecasters)
Finally, we find LLMs can accurately predict effects on socially harmful outcomes, such as the impact of antivax FB posts on vax intentions (@_JenAllen et al., 2024). This capacity may have positive uses, such as content moderation, though it also highlights risks of misuse.
Overall, our results show that LLM-derived predictions of experiments with human participants are highly accurate, generally more accurate than predictions from samples of lay and expert humans.
This capacity has several applications for science and practice – e.g., running low-cost pilots to identify promising interventions, or simulating experiments that may be harmful to participants – but also limitations and risks, including concerns about bias, overuse, & misuse.
To explore further, use this demo to generate predicted experimental effects with LLM-simulated participants!
👇 treatmenteffect.app
*Major* kudos to @lukebeehewitt and @AshuAshok (who co-led the research), and to @ghezae_isaias.
And thanks to @pascl_stanford and @StanfordPACS for generously supporting this project.
H/t also to some of the many scholars whose work we drew on:
@JEichstaedt @danicajdillion @kurtjgray @chris_bail @lpargyle @johnjhorton @mcxfrank @joon_s_pk @msbernst @percyliang @kerstingAIML @davidwingate @lltjuatja @gneubig @nikbpetrov @SchoeneggerPhil @molly_crockett
Many Republican voters believe the 2020 election was stolen. For them, voting for election-denying Republican candidates helps their favored party AND helps to defend democratic principles.
This is a misinformation problem.
See, e.g.: papers.ssrn.com/sol3/papers.cf…
But many Republican voters do *not* believe the 2020 election was stolen. For these folks, deciding whether to vote for election-denying Republican candidates involves a tension between partisan interests and democratic principles.
⚡️For a quick summary of our results, check out this excellent video produced by the brilliant folks @StanfordHAI (& the thread below!)⚡️
We use the validated @perspective API to estimate levels of “toxicity” in 1.3 million tweets by Congresspeople from '09-'19 (findings robust with alt measures of toxicity)
Overall, toxicity ⬆️ 23% over the time period
Over the same period, toxicity of Congress speeches actually ⬇️
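For reference, a minimal sketch of a Perspective API toxicity query (endpoint and response fields per Google's public docs; YOUR_KEY is a placeholder, and the paper's batching and robustness checks are omitted):

```python
# Sketch: score one tweet's toxicity with the Perspective API.
import requests

URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       "comments:analyze?key=YOUR_KEY")  # YOUR_KEY is a placeholder

def toxicity(text: str) -> float:
    """Return Perspective's TOXICITY probability (0-1) for one text."""
    payload = {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    resp = requests.post(URL, json=payload, timeout=30)
    resp.raise_for_status()
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
```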
In line with claims that American democracy is in crisis, we found concerning baseline levels of potentially problematic attitudes, e.g.:
➤Partisan animosity
➤Support for undemocratic practices
➤Support for undemocratic candidates
➤Biased evaluation of politicized facts
We test 25 interventions to reduce such attitudes, submitted by social scientists & practitioners. Most targeted partisan animosity, but many also aimed to reduce support for undemocratic practices or partisan violence. Walk-through of interventions here👇
🚨Results are in for the Strengthening Democracy Challenge. Winners will be announced this week!🚨
In this thread, we announce the 25 submissions we selected to test. We think these submissions are awesome & hope you do too.
But first, how we got here…👇🧵
BACKSTORY: last summer we invited people to submit ideas for how to reduce Americans’ anti-democratic attitudes, support for partisan violence, and/or partisan animosity.
Our research team worked w/ a stellar advisory board to select the 25 interventions we found most promising & then tested them in a massive (N>31,000) online survey experiment
What was eligible?
Short interventions (< 8 minutes) that were deployable online.
🚨Call for Submissions🚨 “The Strengthening Democracy Challenge,” a large-scale project testing interventions to reduce (a) anti-democratic attitudes, (b) support for partisan violence, and/or (c) partisan animosity, is open for submissions NOW 1/
American democracy faces major problems. Americans are willing to compromise on democratic principles for partisan goals. Some people are willing to resort to violence to help their side win. Extreme dislike for rival partisans has grown significantly in recent decades. 2/
To deepen understanding of how to address these problems, we will conduct a large (up to 30k participants) experiment testing up to 25 submitted interventions designed to reduce anti-democratic attitudes, support for partisan violence, and/or partisan animosity among Americans 3/
We identify three broad approaches (general public health messages, promotion by trusted politicians, and promotion by trusted nonpolitical influencers) and review the behavioral science relevant to each.
We highlight @deaneckles and colleagues’ research on how vaccine intentions can be increased (in the US and beyond) by informing people about the actually very high levels of vaccination and vaccine intentions in the general public.