
Natasha Jaques

@natashajaques

Mar 20 • 6 tweets • 4 min read • Read on X

The paper I’ve been most obsessed with lately is finally out: nbcnews.com/tech/tech-news…! Check out this beautiful plot: it shows how much LLMs distort human writing when making edits, compared to how humans would revise the same content.

We take a dataset of human-written essays from 2021, before the release of ChatGPT. We compare how people revise draft v1 -> v2 given expert feedback, with how an LLM revises the same v1 given the same feedback. This enables a counterfactual comparison: how much does the LLM alter the essay compared to what the human was originally intending to write? We find LLMs consistently induce massive distortions, even changing the actual meaning and conclusions argued for.
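The counterfactual comparison described above can be sketched roughly as follows. This is only an illustration: the thread does not say which metrics the paper uses, so the word-level similarity from Python's `difflib` is a stand-in, and the function names and the tiny example texts are hypothetical.

```python
from difflib import SequenceMatcher


def change_score(before: str, after: str) -> float:
    """Word-level change between two drafts.

    0.0 means the drafts are identical; 1.0 means no matching words were found.
    This lexical measure is an assumption, not the paper's metric.
    """
    a, b = before.split(), after.split()
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def compare_revisions(v1: str, human_v2: str, llm_v2: str) -> dict:
    """Compare how far the human and the LLM each moved from the same draft v1."""
    return {
        "human_change": change_score(v1, human_v2),
        "llm_change": change_score(v1, llm_v2),
    }


# Hypothetical toy drafts, not data from the study.
v1 = "cats are good pets"
human_v2 = "cats are very good pets"
llm_v2 = "felines offer companionship"

scores = compare_revisions(v1, human_v2, llm_v2)
print(scores)
```

Because both revisions start from the same v1 and the same feedback, any gap between the two scores reflects the reviser rather than the essay. A measure of meaning rather than wording would be needed to capture changes in conclusions.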

This is a problem, because LLM-generated text is already infiltrating a lot of our cultural and scientific institutions. For example, we look at the 21% of ICLR 2026 reviews that were found to be LLM-generated, and find that they actually focus on different scientific criteria than human reviews! E.g., LLM-generated reviews increase focus on scalability by 111% vs. human reviews.
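For readers unsure how to read a figure like +111%: it is a relative increase, meaning the value more than doubles. A minimal sketch of the arithmetic, where the function name and the counts 9 and 19 are hypothetical and not taken from the paper:

```python
def relative_change_percent(baseline: float, new: float) -> float:
    """Percent change of `new` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Hypothetical numbers: a criterion mentioned in 9 of 100 human reviews
# and 19 of 100 LLM-generated reviews.
increase = relative_change_percent(9, 19)
print(f"{increase:+.0f}%")
```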

You might say "well, I’m so good at prompting LLMs, I’m not going to be subject to these issues". So, we conduct a human user study to see how people naturally interact with LLMs to produce a piece of writing, and find that even when allowed to repeatedly prompt the LLM to refine an essay as they see fit, people who rely heavily on LLMs produce writing that argues for significantly different conclusions.

So how are LLMs actually changing human writing? Aside from producing a 70% increase in the proportion of essays that take a neutral stance rather than actually expressing an opinion, we find that LLMs generate text that is both more emotional and more analytical, logical, and statistical. What does this mean? My theory is that LLMs trained with RLHF at a large scale end up learning to write in ways that many, many people will give a thumbs up to, and this ends up being both emotional and argumentative language. The ‘clickbait’ of language.

Why am I obsessed with this? LLMs do not preserve our intentions or diversity of thought in writing, and they’re already being adopted en masse. More than 1 billion people worldwide use them on a weekly basis. Existing work has shown that for individual scientists, using LLMs to generate papers increases your productivity and impact, even though it constricts science’s overall focus. In our study we show that even though participants who rely on LLMs say their writing is significantly less creative and not in their voice, they are paradoxically equally satisfied with the output. So, the adoption of LLMs is not going to slow any time soon. But it’s already affecting our cultural institutions and the way we conduct science. We urgently need more research into how massive, widespread LLM adoption will affect our science, politics, and culture.

This is joint work with @marwaabdulhai @isadorcw @yanming_wan @jzl86 and @maxhkw.
Project page: sites.google.com/view/llmwritin…
Paper: arxiv.org/abs/2603.18161

More from @natashajaques

Natasha Jaques

@natashajaques

Dec 6, 2020

How can we move deep RL beyond games, without having to hand-build a simulator that covers real-world complexity? We adversarially generate a curriculum of challenging yet feasible environments by maximizing regret between a pair of agents, with PAIRED...

Which is joint work with @MichaelD1729, @EugeneVinitsky, @alexandrebayen, Stuart Russell, Andrew Critch, and @svlevine, and will be presented as an oral at NeurIPS on Monday, December 7th at 6:30pm PT, with poster from 9-11pm PT neurips.cc/virtual/2020/p…

We show that PAIRED agents learn more complex behaviors and generalize better to challenging, unseen test environments zero-shot when compared to minimax adversarial environment generation and domain randomization.

Natasha Jaques

@natashajaques

Jul 2, 2019

Excited to release our latest paper, which uses KL-control for effective off-policy RL, even when you can't explore online in the environment! We use this + neural.chat to learn from human conversation...

Paper arxiv.org/abs/1907.00456
Code github.com/natashamjaques…

2) ...by learning from cues like sentiment and conversation length that are implicit in the text itself. We show this is more effective than relying on explicit labeling of human preferences.

3) Effective off-policy learning is important for learning from human interaction, since experience is expensive to collect. So is testing the policy before you deploy it, which is why we can't explore online as in normal RL. We show several techniques that allow this to work.
