Tuhin Chakrabarty · Apr 21
Unlike math/code, writing lacks verifiable rewards, so all we get is slop. To address this, we train reward models on expert edits that beat SOTA #LLMs by a large margin on a new Writing Quality benchmark. We also reduce #AI slop by using our RMs at test time, boosting alignment with experts.
Self-evaluation using LLMs has proven useful in reward modeling and constitutional AI. But relying on uncalibrated humans or self-aggrandizing LLMs for feedback on subjective tasks like writing can lead to reward hacking and alignment issues.
Our work builds on LAMP (Language model Authored, Manually Polished), a corpus of 1,282 <AI-generated, Expert-edited> pairs with implicit quality preferences. We train Writing Quality Reward Models (WQRM) across multiple model families using pairwise and scalar rewards from LAMP.
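For intuition, here is a minimal sketch of how a pairwise writing-quality reward model could be trained on LAMP-style edit pairs. The base model, data format, and hyperparameters are illustrative assumptions, not the released implementation:

```python
# Minimal sketch (assumed setup, not the authors' released code): a scalar
# scoring head trained with a pairwise loss on <AI-generated, Expert-edited> pairs.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "roberta-large"  # assumption: any encoder with a scalar regression head works
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=1)
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

def score(texts):
    """Scalar writing-quality scores for a list of texts."""
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    return model(**batch).logits.squeeze(-1)

# Hypothetical LAMP-style pairs: the expert-edited version is implicitly
# preferred over the AI-generated draft it was produced from.
pairs = [("expert-edited passage ...", "AI-generated draft ...")]

for edited, generated in pairs:
    s_edited, s_generated = score([edited, generated])
    # Pairwise (Bradley-Terry style) loss: push the edited text's score
    # above the AI draft's score.
    loss = -F.logsigmoid(s_edited - s_generated)
    loss.backward()
    opt.step()
    opt.zero_grad()
```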
To evaluate WQRM, we introduce the Writing Quality Benchmark (WQ), consolidating five datasets that contrast Human-Human, Human-AI, and AI-AI writing pairs reflecting real-world applications. SOTA LLMs, some of which excel at reasoning tasks, barely beat random baselines on WQ.
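As an illustration, pairwise accuracy on WQ-style preference pairs can be computed as below; `wq_pairs` and `score_fn` are hypothetical names, with `score_fn` standing in for a scorer like the one in the sketch above. A random baseline sits at 50%, which is roughly where the thread says SOTA LLMs land.

```python
# Sketch: pairwise accuracy of a writing-quality scorer on preference pairs.
import torch

def pairwise_accuracy(wq_pairs, score_fn):
    """Fraction of pairs where the scorer ranks the preferred text higher."""
    correct = 0
    with torch.no_grad():
        for preferred, other in wq_pairs:
            s_pref, s_other = score_fn([preferred, other]).tolist()
            correct += int(s_pref > s_other)
    return correct / len(wq_pairs)
```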
We train an editing model on LAMP interaction traces to improve writing quality. To show WQRM's practical benefits during inference, we use additional test-time compute to generate and rank multiple candidate revisions, letting us choose high-quality outputs from an initial draft.
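The test-time selection step amounts to best-of-N reranking with the reward model. A minimal sketch, assuming a hypothetical `generate_revision` wrapper around the editing model and the `score_fn` scorer from above:

```python
# Sketch of test-time selection: sample several candidate revisions of a
# draft, score each with the WQRM, and keep the highest-scoring one.
def best_of_n(draft, generate_revision, score_fn, n=8):
    candidates = [generate_revision(draft) for _ in range(n)]
    scores = score_fn(candidates).tolist()
    best = max(range(n), key=lambda i: scores[i])
    return candidates[best], scores[best]
```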
Evaluation with 9 experienced writers confirms that WQRM-based selection produces writing samples preferred by experts 66% of the time overall, and 72.2% of the time when the reward gap is larger than 1 point.
In short, we find evidence that WQRM is well-calibrated: a wider gap in scores between two responses is evidence that an expert (or group of experts) would be more likely to prefer the higher-scoring response over the lower-scoring one.
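One way to read this calibration claim concretely: bucket expert-judged pairs by the reward gap and check that expert agreement with the higher-scoring response rises with the gap. A sketch, with `expert_pairs` and `score_fn` as hypothetical placeholders:

```python
# Sketch of the calibration check: bin pairs by reward gap and measure how
# often experts agree with the higher-scoring response in each bin.
from collections import defaultdict

def agreement_by_gap(expert_pairs, score_fn, bin_width=0.5):
    """Expert agreement with the higher-scoring response, bucketed by reward gap."""
    bins = defaultdict(list)
    for chosen, rejected in expert_pairs:
        s_c, s_r = score_fn([chosen, rejected]).tolist()
        gap = abs(s_c - s_r)
        bins[int(gap / bin_width)].append(int(s_c > s_r))
    # For a well-calibrated WQRM, agreement should rise with the gap
    # (e.g. ~72% when the gap exceeds 1 point, per the expert evaluation above).
    return {round(b * bin_width, 2): sum(v) / len(v) for b, v in sorted(bins.items())}
```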
To better understand how much content detail affects LLM writing quality, we analyzed how several LLMs write with and without detailed content in the writing prompt, and compared their output to expert writers and MFA students given the same prompts.
Our results show that in the absence of good-quality original content, all LLMs are poor writers and exhibit very high variance compared to experts. Even when provided with very detailed original content, LLMs including GPT-4.5 still suck (contrary to @sama).
We hope our work fuels interest in the community in well-calibrated reward models for subjective tasks like writing, instead of focusing on vibes. In the true spirit of science, our code, data, experiments, and models are all open-sourced.
Paper: arxiv.org/pdf/2504.07532
