Incoming Assistant Prof @sbucompsc @stonybrooku. Researcher → @SFResearch. Ph.D. → @ColumbiaCompSci. Human-Centered AI / Future of Work / AI & Creativity
Apr 21 • 10 tweets • 4 min read
Unlike math/code, writing lacks verifiable rewards. So all we get is slop. To address this, we train reward models on expert edits that beat SOTA #LLMs by a large margin on a new Writing Quality benchmark. We also reduce #AI slop by using our RMs at test time, boosting alignment with expert writers.
Self-evaluation using LLMs has proven useful in reward modeling and constitutional AI. But relying on uncalibrated humans or self-aggrandizing LLMs for feedback on subjective tasks like writing can lead to reward hacking and alignment issues.
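For a rough idea of what "using our RMs at test time" can look like, here is a minimal best-of-N reranking sketch: sample several candidate rewrites from a generator, score each with a reward model, and keep the top-scoring one. The model names and prompt are placeholders, not the thread's actual checkpoints or setup.

```python
# Minimal best-of-N sketch: sample candidates, rerank with a reward model.
# Checkpoint names below are hypothetical placeholders.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

GEN_NAME = "your-generator-model"          # placeholder generator checkpoint
RM_NAME = "your-writing-reward-model"      # placeholder RM trained on expert edits

gen_tok = AutoTokenizer.from_pretrained(GEN_NAME)
generator = AutoModelForCausalLM.from_pretrained(GEN_NAME)
rm_tok = AutoTokenizer.from_pretrained(RM_NAME)
reward_model = AutoModelForSequenceClassification.from_pretrained(RM_NAME, num_labels=1)

prompt = "Rewrite this paragraph to be clearer and less generic: ..."
inputs = gen_tok(prompt, return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

# Sample N candidate rewrites from the generator.
with torch.no_grad():
    outputs = generator.generate(
        **inputs,
        do_sample=True,
        top_p=0.9,
        max_new_tokens=256,
        num_return_sequences=8,
    )
candidates = [
    gen_tok.decode(seq[prompt_len:], skip_special_tokens=True) for seq in outputs
]

def rm_score(text: str) -> float:
    """Score one candidate with the reward model (higher = better writing)."""
    batch = rm_tok(prompt, text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return reward_model(**batch).logits[0, 0].item()

# Keep the candidate the reward model likes best.
best = max(candidates, key=rm_score)
print(best)
```

Because the RM is trained on expert edits rather than LLM self-judgments, this kind of test-time reranking sidesteps the self-evaluation pitfalls above: the selection signal comes from human expert preferences, not the generator grading itself.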
Jun 6, 2019 • 36 tweets • 11 min read
Now live-tweeting "The Enigma of Neural Text Degeneration as the First Defense Against Neural Fake News" by Yejin Choi #neuralgen2019 #naacl2019
Motivating with super funny generated fake news, which claims she is a co-founder of a self-driving ice cream truck lmao