Latest Twitter Threads by @JustenMichel on Thread Reader App

May 22, 2025 • 7 tweets • 2 min read

so many AI headlines over the past weeks bear out a simple point: AI companies can't reliably steer their models.

1. Anthropic can't guarantee that Claude 4 won't blackmail users if they make borderline requests

https://x.com/shashj/status/1925654215537389663

2. OpenAI accidentally made their model way too flattering to users and had to roll the update back.

https://x.com/Kyle_L_Wiggers/status/1917297682118222298

Apr 28, 2025 • 9 tweets • 2 min read

GUYS THE STORY HERE IS NOT ABOUT THE RESEARCHERS IT’S ABOUT THE RESEARCH RESULTS

https://twitter.com/jason_koebler/status/1916866751750607039

The lack of informed consent is bad.

But these results are crazy.
