Latest Twitter Threads by @SherylHsu02 on Thread Reader App

Aug 11, 2025 • 8 tweets • 3 min read

1/n I’m thrilled to share that our @OpenAI reasoning system scored high enough to achieve gold 🥇🥇 in one of the world’s top programming competitions - the 2025 International Olympiad in Informatics (IOI) - placing first among AI participants! 👨‍💻👨‍💻

2/n We officially competed in the online AI track of the IOI, where we scored higher than all but 5 (of 330) human participants and placed first among AI participants. We had the same 5-hour time limit and 50-submission limit as human participants. Like the human contestants, our system competed *without* internet or RAG, with access only to a basic terminal tool.

Jul 19, 2025 • 5 tweets • 2 min read

Watching the model solve these IMO problems and achieve gold-level performance was magical. A few thoughts 🧵

https://twitter.com/alexwei_/status/1946477742855532918

The model solves these problems without tools like Lean or coding; it uses only natural language and has just 4.5 hours. We see the model reason at a very high level - trying out different strategies, making observations from examples, and testing hypotheses.

Oct 31, 2024 • 5 tweets • 3 min read

Feeling spooked👻🎃? Get grounded...introducing "Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval."

Meet LeReT (Learning to Retrieve by Trying), an RL-based framework that improves LLMs' ability to use retrieval tools by up to 29%.

sherylhsu.com/LeReT/

[2/5] Why is this important?
Like seeing a ghost 👻👻, LLMs often hallucinate (e.g., glue in pizza), and grounding LLM answers in retrieved facts improves factuality and transparency. Improving LLMs' ability to retrieve correct information thus improves overall performance.
