Nathan Benaich
Oct 9, 34 tweets, 12 min read
🪩The one and only @stateofaireport 2025 is live! 🪩

It’s been a monumental 12 months for AI. Our 8th annual report is the most comprehensive it's ever been, covering what you *need* to know about research, industry, politics, safety and our new usage data.

My highlight reel:
First, let’s dive into Research: 12 months on, @OpenAI still leads, but the pack has closed in fast. China’s @deepseek_ai, @Alibaba_Qwen, and @Kimi_Moonshot sit within a few points on reasoning and coding. While the US holds the frontier, China is now a credible #2.
Once a “Llama rip-off,” @Alibaba_Qwen now powers 40% of all new fine-tunes on @huggingface. China’s open-weights ecosystem has overtaken Meta’s, with Llama riding off into the sunset…for now.
Reinforcement learning has grown up. After fuzzy human feedback came rubric-based rewards and verifiable reasoning tasks. We’re rediscovering rigor, and environments for agents to undertake long-running tasks are all the rage.
What’s this approach enabling? Models from @OpenAI and @GoogleDeepMind (Gemini) both hit math Olympiad gold. Open provers like Gödel-LM are publishing formal proofs, showing that AI-assisted theorem proving is no longer science fiction.
But we’re not just creating superintelligent agents to crush humans. Indeed, @GoogleDeepMind AlphaZero-discovered strategies improved the gameplay of four chess Grandmasters, proving that superhuman systems can teach the very best humans, not just beat them.
AI is now a lab partner too. @GoogleDeepMind's Co-Scientist and @Stanford's Virtual Lab generate, debate, and validate hypotheses, surfacing both new and established ideas. Science is becoming a closed loop with AI in it.
Biology gets its scaling laws too. @ProfluentBio's ProGen3 trained on 1.5T tokens and created a compute frontier for protein language models. This is unlocking generalisation in novel protein space and a path to novel therapeutics such as custom gene editors.
Robots now reason too. “Chain-of-Action” planning brings structured thought to the physical world - from AI2’s Molmo-Act to Gemini Robotics. Massive amounts of effort are being thrown into the mix, so expect lots of progress here…
.@AnthropicAI's Model Context Protocol is the new USB-C of AI. A single standard to connect models to tools, already embedded in ChatGPT, Gemini, Claude, and VS Code, has taken shape. But not without emerging security risks…
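To make the “single standard” concrete: MCP messages are JSON-RPC 2.0, and a client asks a server to run a tool with a `tools/call` request. Here is a minimal sketch of building such a request in Python. The helper `build_tool_call`, the tool name `get_weather` and its arguments are hypothetical, and a real client would first set up a session and discover the available tools.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP-style JSON-RPC 2.0 `tools/call` request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical tool and arguments, for illustration only.
request = build_tool_call(1, "get_weather", {"location": "London"})
print(json.dumps(request, indent=2))
```

Because every tool is reached through the same message shape, any compliant host can talk to any compliant server, which is also why a malicious or sloppy server becomes a security concern.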
Now, onto the Industry section. RIP AGI, long live Superintelligence. AI-pilled tech executives have rebranded the mission, and it’s working: provocative, undefined, and exciting.
The frontier fight is relentless. @OpenAI still tops most leaderboards, but @GoogleDeepMind's models stay there longer. Timing releases has become its own science…not least informing financing rounds like clockwork.
Capability per dollar is doubling every few months on @ArtificialAnlys. @GoogleDeepMind's rate: 3.4 months. @OpenAI's: 5.8 months. More predictable gains are driving more investment, and more intelligence for less money.
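A doubling time converts to an annual multiplier as 2^(12/d), where d is the doubling time in months. A quick sketch of that arithmetic, using only the two doubling times quoted above (the function name is mine, and this assumes the trend holds for a full year):

```python
def annual_multiplier(doubling_months: float) -> float:
    """Growth factor over 12 months for a quantity that doubles every `doubling_months`."""
    return 2 ** (12 / doubling_months)

for lab, months in [("Google DeepMind", 3.4), ("OpenAI", 5.8)]:
    print(f"{lab}: ~{annual_multiplier(months):.1f}x capability per dollar per year")
```

That works out to roughly 11.5x a year for a 3.4-month doubling time and roughly 4.2x a year for 5.8 months.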
AI software adoption has gone mainstream. @tryramp @arakharazian data shows 44% of US businesses now pay for AI, up from 5% in 2023. Average contract value for AI products hit $530k in 2025 and is expected to pass $1M in 2026. 12-month retention is now 80%+.
AI-first companies still outrun everyone else too, growing 1.5x faster than peers on @metrics_co.
The “deep freak” over @deepseek_ai's “$5M training run” was overblown. Once the market read the fine print in the R1 paper, the result was Jevons paradox on steroids: lower cost per run → more runs → more compute needed, so buy more NVIDIA.
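The Jevons logic can be sketched with a constant-elasticity demand curve: if the number of training runs rises more than proportionally as cost per run falls (elasticity above 1), total compute spend goes up, not down. The elasticity and cost figures below are hypothetical placeholders, not numbers from the report.

```python
def total_spend(cost_per_run: float, elasticity: float, scale: float = 1.0) -> float:
    """Total spend = runs * cost, with runs = scale * cost ** (-elasticity)."""
    runs = scale * cost_per_run ** (-elasticity)
    return runs * cost_per_run

# Hypothetical: cost per run halves, demand elasticity is 1.5.
before = total_spend(cost_per_run=1.0, elasticity=1.5)
after = total_spend(cost_per_run=0.5, elasticity=1.5)
print(f"Total compute spend changes by {after / before:.2f}x")
```

With an elasticity below 1 the same halving would shrink total spend, so the whole argument rests on demand for runs being highly elastic.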
Enter Stargate: a $500B, 10GW US mega-cluster (4M chips) backed by @sama, Masa, Ellison, and @POTUS. The industrial era of AI begins. What a time to be alive.
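As a back-of-envelope check on those figures, dividing the headline power by the chip count gives the all-in power budget per chip (covering cooling, networking and other facility overhead, not just the accelerator itself). Only the 10GW and 4M numbers come from the thread; the rest is arithmetic.

```python
CLUSTER_POWER_W = 10e9   # 10 GW, as quoted
CHIP_COUNT = 4e6         # 4M chips, as quoted

watts_per_chip = CLUSTER_POWER_W / CHIP_COUNT
print(f"~{watts_per_chip / 1e3:.1f} kW of facility power per chip")
```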
Sovereigns join the race: from China’s $5B Big Fund to the UAE’s MGX, nations are writing cheques to stay in the game. We expect some nations to just tap out and declare neutrality.
China leads in power infrastructure too, adding >400GW in 2024 vs 41GW for the US. Compute now clearly runs on geopolitics.
.@NVIDIA still rules research and crushes its competitors: Hopper chips surge, Jetsons rise, legacy GPUs fade. If you’d just bought @NVIDIAAI stock instead of its challengers, you’d be up 12x vs. 2x.
Now, let’s switch gears into Politics. The US Government is turning capitalist. Golden shares in US Steel, stakes in Intel and MP Materials, and revenue cuts from NVIDIA’s China sales. New-age industrial policy?
America’s new “AI Stack” exports compute, models, and compliance to allies. Open source is now national security.
The AI Safety Institute network has collapsed. Washington ditched attending meetings altogether, while the US and UK rebranded “safety” into “security.”
Europe’s AI Act is wobbling: only 3 states are compliant, leaders are calling it “confusing,” and pressure is mounting for a pause as it becomes clear the continent is being left behind.
China is spending through its debt: Xi told ministers to “redouble efforts” on AI, boosting science funding 10% despite record debt.
Moving into Safety: budgets are anemic. All 11 major US safety orgs will spend $133M in 2025…less than frontier labs burn in a day.
Cyber and alignment risks accelerate. Models can now fake alignment under supervision, and exploit code faster than humans fix it.
But users love AI anyway. Our new State of AI Survey of 1.2k AI practitioners shows that 95% use AI at work or home, 76% pay out of pocket, average spend keeps climbing, productivity gains are real and use cases abound.

Contribute your experience to the survey at stateof(dot)ai/survey
Reviewing last year’s Predictions, we scored 5/10. Here is a sample of our 10 predictions for next year:
- A Chinese lab tops a global leaderboard.
- AI agents make a real scientific discovery.
- Datacenter NIMBYism hits US elections.
- Trump bans state AI laws (illegally).
You can check out the full report over on the State of AI website: stateof.ai
If you enjoy reading the State of AI Report, we invite you to read and subscribe to @airstreetpress, the home of our analytical writing, news, and opinions.
Join our global community of best practices events --> airstreet(dot)com/events
Big thanks to @Zeke_AG @nellie_norm and Ryan Tovcimak for helping me on this year's monster edition.
Thanks to our reviewers @DynamicWebPaige @alxndrdavies @spacemanidol @gordic_aleksa @idohakimi @ryancjulian @NeelNanda5 @omarsar0 @pschwllr @joespeez @davidstutz92 @rosstaylor90 @divy93t @gagnechris @JoyceBenaich @JacobianNeuro
We write this report to share the most interesting things we’ve seen and celebrate the achievements of the AI community while informing conversation about the state of AI. I’d love to hear your thoughts on the findings, your take on our predictions, or any suggestions for next year’s edition.
