Natasha Jaques
Mar 20 · 6 tweets · 4 min read
The paper I’ve been most obsessed with lately is finally out: nbcnews.com/tech/tech-news…! Check out this beautiful plot: it shows how much LLMs distort human writing when making edits, compared to how humans would revise the same content.

We take a dataset of human-written essays from 2021, before the release of ChatGPT. We compare how people revise draft v1 -> v2 given expert feedback with how an LLM revises the same v1 given the same feedback. This enables a counterfactual comparison: how much does the LLM alter the essay compared to what the human was originally intending to write? We find LLMs consistently induce massive distortions, even changing the actual meaning and conclusions argued for.
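The counterfactual setup above can be sketched in a few lines. This is a hypothetical illustration, not the paper's actual metric: it uses `difflib`'s character-level similarity as a crude stand-in for whatever distortion measure the paper defines, and the three example strings are invented.

```python
# Sketch of the counterfactual comparison: given the human's own next
# draft (v2), how much farther does the LLM's revision drift from it
# than the human's revision drifted from v1?
# difflib's ratio is a crude textual-similarity proxy, used here only
# for illustration.
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means identical text."""
    return SequenceMatcher(None, a, b).ratio()


def distortion(human_v1: str, human_v2: str, llm_v2: str) -> float:
    """Extra drift the LLM introduces relative to the human's own edit.

    Positive values mean the LLM revision strays farther from the
    author's intended v2 than the author's own v1 -> v2 edit did.
    """
    human_drift = 1.0 - similarity(human_v1, human_v2)
    llm_drift = 1.0 - similarity(human_v2, llm_v2)
    return llm_drift - human_drift


# Invented toy example: the human tightens their claim; the LLM
# neutralizes it entirely.
v1 = "Social media harms teens and should be regulated."
v2 = "Social media harms teens, so it should be regulated."
llm = "Social media has both benefits and drawbacks for teenagers."
print(distortion(v1, v2, llm) > 0)  # the LLM drifted more than the human did
```

A real study would use a semantic measure (e.g. embedding distance or stance classification) rather than surface similarity, since an LLM can preserve wording while flipping the conclusion, or vice versa.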
This is a problem, because LLM-generated text is already infiltrating many of our cultural and scientific institutions. For example, we look at the 21% of ICLR 2026 reviews that were found to be LLM-generated, and find that they actually focus on different scientific criteria than human reviews: for instance, LLM reviews focus on scalability 111% more than human reviews do.
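A relative-focus figure like the +111% scalability number can be computed as the percent change in how often a criterion shows up in LLM-written vs. human-written reviews. A minimal sketch, assuming simple keyword matching (the paper's actual criterion-labeling method is surely more sophisticated; all review text below is invented):

```python
# Sketch: compare how often a scientific criterion appears in
# LLM-generated vs. human-written reviews, reported as a percent
# change relative to the human baseline.
import re


def criterion_rate(reviews: list[str], keyword: str) -> float:
    """Fraction of reviews that mention the keyword (case-insensitive)."""
    hits = sum(1 for r in reviews if re.search(keyword, r, re.IGNORECASE))
    return hits / len(reviews)


def relative_change(human_reviews: list[str],
                    llm_reviews: list[str],
                    keyword: str) -> float:
    """Percent change in criterion focus, LLM vs. human baseline."""
    human = criterion_rate(human_reviews, keyword)
    llm = criterion_rate(llm_reviews, keyword)
    return (llm - human) / human * 100.0


# Invented toy reviews for illustration only.
human = ["Novel idea, but the baselines are weak.",
         "Scalability of the method is unclear.",
         "Strong empirical results."]
llm = ["I have concerns about the scalability of this approach.",
       "Does the method offer scalability to larger models?",
       "The writing is clear and well organized."]

print(round(relative_change(human, llm, "scalab")))  # prints 100
```

Here the keyword appears in 1/3 of human reviews and 2/3 of LLM reviews, a +100% relative increase — the same kind of statistic as the thread's +111% scalability figure.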
You might say, "Well, I'm so good at prompting LLMs that I won't be subject to these issues." So we conduct a human user study to see how people naturally interact with LLMs to produce a piece of writing, and find that even when allowed to repeatedly prompt the LLM to refine an essay as they see fit, people who rely heavily on LLMs produce writing that argues for significantly different conclusions.
So how are LLMs actually changing human writing? Aside from producing a 70% increase in the proportion of essays that take a neutral stance rather than actually expressing an opinion, we find that LLMs generate text that is both more emotional and more analytical, logical, and statistical. What does this mean? My theory is that LLMs trained with RLHF at a large scale end up learning to write in ways that many, many people will give a thumbs up to, and this ends up being both emotional and argumentative language. The 'clickbait' of language.
Why am I obsessed with this? LLMs do not preserve our intentions or diversity of thought in writing, and they're already being adopted en masse: more than 1 billion people worldwide use them on a weekly basis. Existing work has shown that for individual scientists, using LLMs to generate papers increases your productivity and impact, even though it constricts science's overall focus. In our study, we show that even though participants who rely on LLMs say their writing is significantly less creative and not in their voice, they are paradoxically equally satisfied with the output. So the adoption of LLMs is not going to slow any time soon, but it's already affecting our cultural institutions and the way we conduct science. We urgently need more research into how massive, widespread LLM adoption will affect our science, politics, and culture.
This is joint work with @marwaabdulhai @isadorcw @yanming_wan @jzl86 and @maxhkw.
Project page: sites.google.com/view/llmwritin…
Paper: arxiv.org/abs/2603.18161

