Sheryl Hsu
Aug 11
1/n I’m thrilled to share that our @OpenAI reasoning system scored high enough to achieve gold 🥇 in one of the world’s top programming competitions - the 2025 International Olympiad in Informatics (IOI) - placing first among AI participants! 👨‍💻
2/n We officially competed in the online AI track of the IOI, where we scored higher than all but 5 of the 330 human participants and placed first among AI participants. We had the same 5-hour time limit and 50-submission limit as the human participants. Like the human contestants, our system competed *without* internet or RAG, with access only to a basic terminal tool.
3/n We competed with an ensemble of general-purpose reasoning models---we did not train any model specifically for the IOI. Our only scaffolding was in selecting which solutions to submit and connecting to the IOI API.
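As an illustration of what "scaffolding limited to selecting which solutions to submit" could look like in practice, here is a minimal, hypothetical sketch. The names (CandidateSolution, ContestClient, select_and_submit) and the ranking heuristic are placeholders for illustration, not OpenAI's actual system.

```python
# Hypothetical sketch: rank candidate solutions from an ensemble and spend a
# limited submission budget on the best-looking ones via a contest API client.
from dataclasses import dataclass


@dataclass
class CandidateSolution:
    problem_id: str
    source_code: str
    score_estimate: float  # e.g. pass rate on self-generated tests (assumption)


class ContestClient:
    """Placeholder for a client that talks to the contest's submission API."""

    def submit(self, problem_id: str, source_code: str) -> float:
        # A real client would POST the code and return the judged score.
        raise NotImplementedError


def select_and_submit(candidates: list[CandidateSolution],
                      client: ContestClient,
                      budget: int = 50) -> dict[str, float]:
    """Submit the highest-ranked candidates within the submission budget."""
    best_scores: dict[str, float] = {}
    ranked = sorted(candidates, key=lambda c: c.score_estimate, reverse=True)
    for cand in ranked[:budget]:
        judged = client.submit(cand.problem_id, cand.source_code)
        best_scores[cand.problem_id] = max(best_scores.get(cand.problem_id, 0.0), judged)
    return best_scores
```

The point of the sketch is only how thin such a selection layer can be; everything else in the result comes from the general-purpose reasoning models themselves.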
4/n This result marks a huge improvement over @OpenAI’s attempt at the IOI last year, where we finished just shy of a bronze medal with a significantly more handcrafted test-time strategy. We’ve gone from the 49th percentile to the 98th percentile at the IOI in just one year!
5/n It’s been really exciting to see the progress of our newest research methods at OpenAI, with our successes at the AtCoder World Finals, the IMO, and the IOI over the last couple of weeks. We’ve been working hard on building smarter, more capable models, and we’re working hard to get them into our mainstream products.
6/n I’ve been lucky to work with many fantastic teammates here at @OpenAI, especially @alexwei_ @bminaiev @oleg_murk on prepping for the IOI, building on the long-term work on competitive programming by @_lorenzkuhn @MostafaRohani @clavera_i @andresnds @ahelkky
Some teammates and I were able to travel to Bolivia to attend the IOI in person. It was wonderful to meet all the participants and coaches there, and we wanted to say congrats once again!!
We officially entered the 2025 International Olympiad in Informatics (IOI) online competition track and adhered to the same restrictions as the human contestants, including the submission and time limits, but without direct supervision from the contest organizers.

More from @SherylHsu02

Jul 19
Watching the model solve these IMO problems and achieve gold-level performance was magical. A few thoughts 🧵
The model solves these problems without tools like Lean or coding; it uses only natural language, and it has just 4.5 hours. We see the model reason at a very high level, trying out different strategies, making observations from examples, and testing hypotheses.
It’s crazy how we’ve gone from 12% on AIME (GPT-4o) → IMO gold in ~15 months. We have come very far very quickly. I wouldn’t be surprised if by next year models are deriving new theorems and contributing to original math research!
Oct 31, 2024
Feeling spooked👻🎃? Get grounded... introducing "Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval."

Meet LeReT (Learning to Retrieve by Trying), an RL-based framework that improves LLMs’ ability to use retrieval tools by up to 29%.

sherylhsu.com/LeReT/
[2/5] Why is this important?
Like seeing a ghost 👻👻, LLMs often hallucinate (glue on pizza), and grounding LLM answers in retrieved facts improves factuality and transparency. Improving LLMs’ ability to retrieve correct information thus improves overall performance.
[3/5] How?
LeReT samples a set of queries, computes a reward based on the retrieved documents, and fine-tunes the LLM using SFT + IPO. Moving beyond high-temperature sampling, LeReT uses DSPy to optimize few-shot prompts, resulting in more diverse and higher-reward samples.
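To make that loop concrete, here is a minimal, hypothetical sketch of a LeReT-style data-collection step, assuming a placeholder query generator, retriever, and a simple recall-based reward. The actual method uses DSPy-optimized few-shot prompts for sampling and SFT + IPO for fine-tuning; those pieces are not shown, and none of the function names below come from the paper.

```python
# Hypothetical sketch of a LeReT-style data-collection loop: sample several
# queries per question (one per few-shot prompt), reward each query by its
# retrieved documents, and keep preference pairs for preference fine-tuning.
from itertools import combinations


def retrieval_reward(retrieved_ids: set[str], gold_ids: set[str]) -> float:
    """One simple reward choice: recall of the gold supporting documents."""
    return len(retrieved_ids & gold_ids) / max(len(gold_ids), 1)


def collect_preference_pairs(question: str,
                             gold_ids: set[str],
                             prompts: list[str],
                             generate_query,   # (prompt, question) -> query string
                             retrieve):        # (query) -> set of retrieved doc ids
    """For one question, sample a query per prompt and pair them by reward."""
    scored = []
    for prompt in prompts:
        query = generate_query(prompt, question)
        reward = retrieval_reward(retrieve(query), gold_ids)
        scored.append((query, reward))

    # (chosen, rejected) pairs are the kind of data an IPO-style objective
    # consumes; high-reward queries can also be used directly for SFT.
    pairs = []
    for (q_a, r_a), (q_b, r_b) in combinations(scored, 2):
        if r_a != r_b:
            chosen, rejected = (q_a, q_b) if r_a > r_b else (q_b, q_a)
            pairs.append({"question": question, "chosen": chosen, "rejected": rejected})
    return pairs
```

The specific reward and pairing scheme here are illustrative only; the key idea is that trying multiple queries and comparing their retrieval outcomes yields the training signal.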