MatthewBerman
Feb 16 · 11 tweets · 4 min read
OpenAI just dropped a paper that reveals the blueprint for creating the best AI coder in the world.

But here’s the kicker: this strategy isn’t just for coding—it’s the clearest path to AGI and beyond.

Let’s break it down 🧵👇
1/ OpenAI’s latest research shows that reinforcement learning + test-time compute is the key to building superintelligent AI.

Sam Altman himself said OpenAI’s model went from ranking 175th to 50th in competitive coding—and expects #1 by year-end.
2/ The paper, “Competitive Programming with Large Reasoning Models,” compares different AI coding strategies.

At first, models relied on human-engineered inference strategies—but the biggest leap came when humans were removed from the loop entirely.
3/ Enter DeepSeek-R1, a model reportedly trained for only ~$5M.

Its breakthrough? Reinforcement learning with verifiable rewards.

This method, also used in AlphaGo, lets the model learn from trial and error and scale its intelligence indefinitely.
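
To make "verifiable rewards" concrete, here's a minimal sketch of what a coding reward can look like (my illustration, not code from the paper): run the model's program against hidden test cases and score 1 if they all pass, 0 otherwise.

```python
import os
import subprocess
import tempfile


def verifiable_reward(program_source: str, test_cases: list[tuple[str, str]]) -> float:
    """Reward is 1.0 only if the generated program passes every test case.

    Each test case is (stdin_input, expected_stdout). The key property: the
    signal is checked automatically, so no human grader sits in the loop.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(program_source)
        path = f.name
    try:
        for stdin_input, expected in test_cases:
            result = subprocess.run(
                ["python", path],
                input=stdin_input,
                capture_output=True,
                text=True,
                timeout=5,
            )
            if result.returncode != 0 or result.stdout.strip() != expected.strip():
                return 0.0
        return 1.0
    except subprocess.TimeoutExpired:
        return 0.0
    finally:
        os.remove(path)


# Example: a program that doubles its input earns full reward.
# verifiable_reward("print(int(input()) * 2)", [("21", "42")])  -> 1.0
```

The test harness is the teacher; the model just needs enough attempts.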
4/ Think about it this way:

AlphaGo became the best Go player in the world, and its successor AlphaGo Zero did it without any human guidance at all.

It just kept playing itself until it mastered the game.

Now, OpenAI is applying the same principle to coding—and soon, to all STEM fields.
5/ What does this mean?

Every domain with verifiable rewards (math, coding, science) can be mastered by AI just by letting it learn through trial and error against an automatic verifier.

AI is removing human limitations—and that’s how we get to AGI.
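
The same idea ports to math: anywhere an answer can be checked automatically, you have a reward signal. A hypothetical verifier, purely for illustration (using the SymPy library):

```python
from sympy import simplify, sympify


def math_reward(model_answer: str, reference_answer: str) -> float:
    """Hypothetical math-domain verifier: reward 1.0 if the model's expression
    is symbolically equivalent to the reference answer, else 0.0."""
    try:
        diff = simplify(sympify(model_answer) - sympify(reference_answer))
        return 1.0 if diff == 0 else 0.0
    except Exception:
        return 0.0  # unparseable answers earn nothing


# math_reward("2*x + 2*x", "4*x")  -> 1.0
# math_reward("3*x", "4*x")        -> 0.0
```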
6/ Here’s the data from the coding competition:

• GPT-4o: 808 Elo (decent)
• o1: 1,673 Elo (better)
• o3: 2,724 Elo (SUPERHUMAN) 🏆

That puts it in the 99.8th percentile of competitive coders, with no human-crafted strategies.
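
To put those rating gaps in perspective, the standard Elo expected-score formula (general rating math, not something from the paper) converts them into head-to-head win probabilities:

```python
def elo_win_probability(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score: probability that player A beats player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


print(f"o1 vs GPT-4o: {elo_win_probability(1673, 808):.3f}")  # ~0.993
print(f"o3 vs o1:     {elo_win_probability(2724, 1673):.3f}")  # ~0.998
```

Each jump isn't a marginal improvement; it's a model that beats its predecessor roughly 99% of the time.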
7/ Tesla did this with Full Self-Driving.

They used to rely on a hybrid model (human rules + AI).

But when they switched to end-to-end AI, performance skyrocketed.

AI just needs more compute—not more human intervention.

8/ The takeaway?

Sam Altman was right when he said AGI is just a matter of scaling up.

Reinforcement learning + test-time compute is the formula for intelligence—and OpenAI is already proving it.
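
One concrete (and deliberately simplified) form of test-time compute is best-of-n sampling: generate many candidate solutions and keep one that a verifier accepts. A sketch of the general idea, not OpenAI's exact pipeline:

```python
from typing import Callable, Optional


def best_of_n(generate: Callable[[], str],
              verify: Callable[[str], bool],
              n: int = 64) -> Optional[str]:
    """Spend more inference-time compute by sampling n candidate programs and
    returning the first one the verifier accepts.

    `generate` stands in for a call to the model; `verify` could be a wrapper
    around public test cases (like the verifiable_reward sketch above).
    """
    fallback = None
    for _ in range(n):
        candidate = generate()
        if verify(candidate):
            return candidate
        if fallback is None:
            fallback = candidate  # keep something to submit even if nothing passes
    return fallback
```

Raising n is the scaling knob: more samples, more compute, better odds of a correct answer.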
9/ We’re witnessing the birth of AI superintelligence in real time.

It won’t stop at coding. The same techniques will make AI the best mathematician, scientist, and engineer in history.

The race to AGI is on.
Here's the paper: arxiv.org/pdf/2502.06807

And my full video breakdown: youtube.com/watch?v=VnaKWi…
