Incoming Assistant Prof @sbucompsc @stonybrooku. Researcher → @SFResearch. Ph.D. → @ColumbiaCompSci. Human-Centered AI / Future of Work / AI & Creativity
Apr 21 • 10 tweets • 4 min read
Unlike math/code, writing lacks verifiable rewards. So all we get is slop. To address this, we train reward models on expert edits that beat SOTA #LLMs by a large margin on a new Writing Quality benchmark. We also reduce #AI slop by using our RMs at test time, boosting alignment with expert writers.
Self-evaluation using LLMs has proven useful in reward modeling and constitutional AI. But relying on uncalibrated humans or self-aggrandizing LLMs for feedback on subjective tasks like writing can lead to reward hacking and alignment issues.
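For a rough idea of what "using our RMs at test time" can look like, here is a minimal best-of-N reranking sketch: sample several candidate rewrites from a generator, score each with a reward model, and keep the top-scoring one. The model names and prompt are placeholders, not the thread's actual checkpoints or setup.

```python
# Minimal best-of-N sketch: sample candidates, rerank with a reward model.
# Checkpoint names below are hypothetical placeholders.
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

GEN_NAME = "your-generator-model"          # placeholder generator checkpoint
RM_NAME = "your-writing-reward-model"      # placeholder RM trained on expert edits

gen_tok = AutoTokenizer.from_pretrained(GEN_NAME)
generator = AutoModelForCausalLM.from_pretrained(GEN_NAME)
rm_tok = AutoTokenizer.from_pretrained(RM_NAME)
reward_model = AutoModelForSequenceClassification.from_pretrained(RM_NAME, num_labels=1)

prompt = "Rewrite this paragraph to be clearer and less generic: ..."
inputs = gen_tok(prompt, return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

# Sample N candidate rewrites from the generator.
with torch.no_grad():
    outputs = generator.generate(
        **inputs,
        do_sample=True,
        top_p=0.9,
        max_new_tokens=256,
        num_return_sequences=8,
    )
candidates = [
    gen_tok.decode(seq[prompt_len:], skip_special_tokens=True) for seq in outputs
]

def rm_score(text: str) -> float:
    """Score one candidate with the reward model (higher = better writing)."""
    batch = rm_tok(prompt, text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return reward_model(**batch).logits[0, 0].item()

# Keep the candidate the reward model likes best.
best = max(candidates, key=rm_score)
print(best)
```

Because the RM is trained on expert edits rather than LLM self-judgments, this kind of test-time reranking sidesteps the self-evaluation pitfalls above: the selection signal comes from human expert preferences, not the generator grading itself.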
Jun 6, 2019 • 36 tweets • 11 min read
Now live-tweeting "The Enigma of Neural Text Degeneration as the First Defense Against Neural Fake News" by Yejin Choi #neuralgen2019 #naacl2019
Motivating with super funny generated fake news, which claims she is a co-founder of a self-driving ice cream truck lmao