Nathan Benaich

@nathanbenaich

Oct 9, 2025 • 34 tweets • 12 min read

🪩The one and only @stateofaireport 2025 is live! 🪩

It’s been a monumental 12 months for AI. Our 8th annual report is the most comprehensive it's ever been, covering what you *need* to know about research, industry, politics, safety and our new usage data.

My highlight reel:

First, let’s dive into Research: 12 months on, @OpenAI still leads, but the pack has closed in fast. China’s @deepseek_ai, @Alibaba_Qwen, and @Kimi_Moonshot sit within a few points on reasoning and coding. While the US holds the frontier, China is now a credible #2.

Once a “Llama rip-off,” @Alibaba_Qwen now powers 40% of all new fine-tunes on @huggingface. China’s open-weights ecosystem has overtaken Meta’s, with Llama riding off into the sunset…for now.

Reinforcement learning has grown up. After fuzzy human feedback came rubric-based rewards and verifiable reasoning tasks. We’re rediscovering rigor, and environments for agents to undertake long-running tasks are all the rage.

What’s this approach enabling? @OpenAI and @GoogleDeepMind Gemini models both hit math Olympiad gold. Open provers like Gödel-LM are publishing formal proofs, showing that AI-assisted theorem proving is no longer science fiction.

But we’re not just creating superintelligent agents to crush humans. Indeed, strategies discovered by @GoogleDeepMind's AlphaZero improved the gameplay of four chess Grandmasters, proving that superhuman systems can teach the very best humans, not just beat them.

AI is now a lab partner too. @GoogleDeepMind's Co-Scientist and @Stanford's Virtual Lab generate, debate, and validate hypotheses, surfacing both new and established ideas. Science is becoming a closed loop with AI in it.

Biology gets its scaling laws too. @ProfluentBio's ProGen3 was trained on 1.5T tokens and established a compute frontier for protein language models. This is unlocking generalisation in novel protein space and a path to novel therapeutics such as custom gene editors.

Robots now reason too. “Chain-of-Action” planning brings structured thought to the physical world - from AI2’s Molmo-Act to Gemini Robotics. Massive amounts of effort are being thrown into the mix, so expect lots of progress here…

.@AnthropicAI's Model Context Protocol is the new USB-C of AI. A single standard to connect models to tools, already embedded in ChatGPT, Gemini, Claude, and VS Code, has taken shape. But not without emerging security risks…

Now, onto the Industry section. RIP AGI, long live Superintelligence. AI-pilled tech executives have rebranded the mission, and it’s working: provocative, undefined, and exciting.

The frontier fight is relentless. @OpenAI still tops most leaderboards, but @GoogleDeepMind's models stay there longer. Timing releases has become its own science…not least because launches now inform financing rounds like clockwork.

Capability per dollar is doubling every few months on @ArtificialAnlys. @GoogleDeepMind's rate: 3.4 months. @OpenAI's: 5.8 months. More predictable gains are driving more investment, and more intelligence for less money.
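To make the doubling times easier to compare, here is a minimal sketch that converts them into a yearly growth factor. It assumes smooth exponential growth at a constant rate, which the thread does not claim; `annual_multiplier` is a hypothetical helper name, and only the 3.4-month and 5.8-month figures come from the thread.

```python
def annual_multiplier(doubling_months: float) -> float:
    """Growth factor over 12 months for a quantity that doubles
    every `doubling_months` months (assumes steady exponential growth)."""
    return 2 ** (12 / doubling_months)


# Doubling times quoted in the thread (months per doubling).
doubling_times = {"Google DeepMind": 3.4, "OpenAI": 5.8}

for lab, months in doubling_times.items():
    factor = annual_multiplier(months)
    print(f"{lab}: ~{factor:.1f}x capability per dollar per year")
```

Under that assumption, a 3.4-month doubling time compounds to roughly 11.5x per year and a 5.8-month doubling time to roughly 4.2x per year.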

AI software adoption has gone mainstream. @tryramp @arakharazian data shows 44% of US businesses now pay for AI, up from 5% in 2023. Average contract value for AI products hit $530k in 2025 and is expected to pass $1M in 2026. 12-month retention is now 80%+.

AI-first companies still outrun everyone else too, growing 1.5x faster than peers on @metrics_co.

.@deepseek_ai's “$5M training run” deep freak was overblown. Once the market read the fine print in the R1 paper, the result was Jevons paradox on steroids: lower cost per run → more runs → more compute needed → buy more NVIDIA.
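The Jevons chain above can be sketched with a toy demand model. Everything here is an illustrative assumption rather than data from the report: demand for training runs follows a constant-elasticity curve, and `elasticity`, `scale` and `total_compute_spend` are hypothetical names. The point is only that when demand is elastic (elasticity above 1), cheaper runs raise total compute spend instead of lowering it.

```python
def total_compute_spend(cost_per_run: float, elasticity: float,
                        scale: float = 1.0) -> float:
    """Total spend = cost per run * number of runs, where the number of
    runs follows an assumed constant-elasticity demand curve."""
    runs = scale * cost_per_run ** (-elasticity)
    return cost_per_run * runs


before = total_compute_spend(cost_per_run=1.0, elasticity=1.5)
after = total_compute_spend(cost_per_run=0.1, elasticity=1.5)  # 10x cheaper

# With elastic demand, total spend rises even though each run is cheaper.
print(f"spend before: {before:.2f}, after a 10x cost drop: {after:.2f}")
```

With an elasticity below 1 the same function shows total spend falling, so the paradox depends entirely on how strongly demand responds to price.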

Enter Stargate: a $500B, 10GW US mega-cluster (4M chips) backed by @sama, Masa, Ellison, and @POTUS. The industrial era of AI begins. What a time to be alive.
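For a sense of scale, the headline numbers can be divided out per chip. This is back-of-the-envelope arithmetic on the thread's figures only; the resulting per-chip values are all-in averages (presumably covering cooling, networking and buildings), not a chip's price or rated power draw.

```python
# Headline Stargate figures quoted in the thread.
TOTAL_COST_USD = 500e9   # $500B
TOTAL_POWER_W = 10e9     # 10 GW
CHIP_COUNT = 4e6         # 4M chips

# All-in averages per chip (an illustrative division, not a spec).
power_per_chip_kw = TOTAL_POWER_W / CHIP_COUNT / 1e3
cost_per_chip_usd = TOTAL_COST_USD / CHIP_COUNT

print(f"~{power_per_chip_kw:.1f} kW and ~${cost_per_chip_usd:,.0f} per chip, all-in")
```

That works out to about 2.5 kW and about $125,000 of total project budget per chip.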

Sovereigns join the race: from China’s $5B Big Fund to the UAE’s MGX, nations are writing cheques to stay in the game. We expect some nations to just tap out and declare neutrality.

China leads in power infrastructure too, adding >400GW in 2024 vs 41GW for the US. Compute now clearly runs on geopolitics.

.@NVIDIA still rules research and crushes its competitors: Hopper chips surge, Jetsons rise, legacy GPUs fade. If you’d just bought @NVIDIAAI stock instead of its challengers, you’d be up 12x vs. 2x.

Now, let’s switch gears into Politics. The US Government is turning capitalist. Golden shares in US Steel, stakes in Intel and MP Materials, and a cut of the revenue from NVIDIA’s China sales. New-age industrial policy?

America’s new “AI Stack” exports compute, models, and compliance to allies. Open source is now national security.

The AI Safety Institute network has collapsed. Washington ditched attending meetings altogether, while the US and UK rebranded “safety” into “security.”

Europe’s AI Act is wobbling: only 3 states are compliant, leaders are calling it “confusing,” and pressure is mounting for a pause as it becomes clear the continent is being left behind.

China is spending through its debt: Xi told ministers to “redouble efforts” on AI, boosting science funding 10% despite record debt.

Moving into Safety: budgets are anemic. All 11 major US safety orgs will spend $133M in 2025…less than frontier labs burn in a day.

Cyber and alignment risks accelerate. Models can now fake alignment under supervision, and exploit code faster than humans fix it.

But users love AI anyway. Our new State of AI Survey of 1.2k AI practitioners shows that 95% use AI at work or home, 76% pay out of pocket, average spend keeps climbing, productivity gains are real and use cases abound.

Contribute your experience to the survey at stateof(dot)ai/survey

Reviewing last year’s Predictions, we scored 5/10. Here is a sample of our 10 predictions for next year:
- A Chinese lab tops a global leaderboard.
- AI agents make a real scientific discovery.
- Datacenter NIMBYism hits US elections.
- Trump bans state AI laws (illegally).

You can check out the full report over on the State of AI website: stateof.ai

If you enjoy reading the State of AI Report, we invite you to read and subscribe to @airstreetpress, the home of our analytical writing, news, and opinions.

Join our global community at our best practices events → airstreet(dot)com/events

Big thanks to @Zeke_AG @nellie_norm and Ryan Tovcimak for helping me on this year's monster edition.
Thanks to our reviewers @DynamicWebPaige @alxndrdavies @spacemanidol @gordic_aleksa @idohakimi @ryancjulian @NeelNanda5 @omarsar0 @pschwllr @joespeez @davidstutz92 @rosstaylor90 @divy93t @gagnechris @JoyceBenaich @JacobianNeuro

We write this report to share the most interesting things we’ve seen and celebrate the achievements of the AI community while informing conversation about the state of AI. I’d love to hear your thoughts on the findings, your take on our predictions, or any suggestions for next year’s edition.
