Assistant Prof @sbucompsc @stonybrooku
Researcher → @SFResearch
Interests: Human Centered AI / Future of Work / AI & Creativity
Formerly @ColumbiaCompSci
Mar 25 • 10 tweets • 4 min read
🚨New paper on AI & Copyright
👨‍⚖️ Courts have credited LLM companies' claims that safety alignment prevents reproduction of copyrighted expression.
But what if fine-tuning on a simple writing task ruins it all?
Worse: fine-tuning on a single author's books (e.g., Murakami) unlocks verbatim recall of copyrighted books from 30+ unrelated authors, sometimes as high as 90%.
Joint work with @niloofar_mire (@LTIatCMU), Jane Ginsburg (@ColumbiaLaw), and my amazing PhD student @irisiris_l (@sbucompsc)
(1/n)🧵
Prior work has focused on prefix-based extraction, showing LLMs can continue text they've seen before. This is expected from autoregressive models.
Our work is fundamentally different.
We fine-tune models to expand plot summaries into full text. At inference time, given only a semantic description, they produce hundreds of verbatim words of copyrighted books entirely from parametric memory. (2/n)
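The paper's exact extraction pipeline isn't shown in the thread, but "hundreds of verbatim words" implies a concrete metric. A minimal sketch of one way to quantify verbatim recall: the longest contiguous run of words shared between a model generation and the original book text. The function name and metric choice are my own illustration, not the paper's.

```python
def longest_verbatim_run(generated: str, reference: str) -> int:
    """Length (in words) of the longest contiguous word sequence
    shared between a generation and a reference text."""
    g, r = generated.lower().split(), reference.lower().split()
    best = 0
    # classic longest-common-substring DP, over word tokens
    prev = [0] * (len(r) + 1)
    for i in range(1, len(g) + 1):
        cur = [0] * (len(r) + 1)
        for j in range(1, len(r) + 1):
            if g[i - 1] == r[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best
```

A run of hundreds under a metric like this would indicate reproduction well beyond paraphrase.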
Apr 21, 2025 • 10 tweets • 4 min read
Unlike math/code, writing lacks verifiable rewards. So all we get is slop. To solve this, we train reward models on expert edits that beat SOTA #LLMs by a large margin on a new Writing Quality benchmark. We also reduce #AI slop by using our RMs at test time, boosting alignment with experts.
Self-evaluation using LLMs has proven useful in reward modeling and constitutional AI. But relying on uncalibrated humans or self-aggrandizing LLMs for feedback on subjective tasks like writing can lead to reward hacking and alignment issues.
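The thread doesn't specify how the RMs are applied at test time; one standard approach is best-of-n reranking: sample several candidate drafts, score each with the reward model, keep the highest-scoring one. The sketch below is generic, with a toy length-based stand-in where a trained RM would go.

```python
from typing import Callable, List

def best_of_n(candidates: List[str], reward: Callable[[str], float]) -> str:
    """Test-time alignment via best-of-n reranking: score every
    candidate with the reward model and return the best one."""
    return max(candidates, key=reward)

# Toy stand-in reward: prefer shorter, less padded drafts (a crude
# proxy for "less slop"); a trained reward model would replace this.
def toy_reward(text: str) -> float:
    return -len(text.split())
```

This needs no gradient updates to the generator, which is why it's a common way to exploit a reward model at inference time.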
Jun 6, 2019 • 36 tweets • 11 min read
Now live-tweeting "The Enigma of Neural Text Degeneration as the First Defense Against Neural Fake News" by Yejin Choi #neuralgen2019 #naacl2019
Motivating with super funny generated fake news which claims she is a co-founder of a self-driving ice cream truck lmao