Researching AI research agents at @AIatMeta. Spent time at @EPFL, @MSFTResearch, @ETH
Apr 15 • 7 tweets • 3 min read
Excited to share AIRA₂ — our next-generation AI Research Agents for ML that address key bottlenecks to scaling.
AIRA₂ achieves SoTA on real-world ML tasks from MLE-bench-30 (81.5% vs 72.7%), exceeds human SoTA on 6/20 diverse AI research tasks from AIRS-Bench (and hacks another 5), while exhibiting strong, predictable scaling properties.
To push the frontier of AI Research, we need systems that scale well. Developing AIRA₂, we learned a lot about the bottlenecks and what it takes to resolve them — insights already driving our next iteration:
1/
First, sample throughput heavily constrains the agent. We built infra that moves AIRA₂ from sequential execution to asynchronous parallel exploration, so throughput scales linearly with GPU resources.
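To make the sequential-vs-asynchronous contrast concrete, here is a minimal sketch of the general pattern, not AIRA₂'s actual infrastructure. Everything in it is an assumption for illustration: `evaluate`, `worker`, and `explore` are hypothetical names, the `asyncio.sleep` call stands in for a GPU training/evaluation run, and each worker stands in for one GPU. Candidates are pulled from a shared queue, so a worker starts its next candidate as soon as it finishes the current one, without waiting on the others.

```python
import asyncio


async def evaluate(candidate: int) -> tuple[int, float]:
    # Hypothetical stand-in for running one candidate solution on a GPU.
    await asyncio.sleep(0.01)
    return candidate, candidate / 10.0  # placeholder score, not a real metric


async def worker(queue: asyncio.Queue, results: list) -> None:
    # Each worker (assumed: one per GPU) pulls the next candidate
    # as soon as it is free, independent of the other workers.
    while True:
        candidate = await queue.get()
        try:
            results.append(await evaluate(candidate))
        finally:
            queue.task_done()


async def explore(num_candidates: int, num_workers: int) -> list:
    queue: asyncio.Queue = asyncio.Queue()
    for candidate in range(num_candidates):
        queue.put_nowait(candidate)

    results: list = []
    workers = [
        asyncio.create_task(worker(queue, results))
        for _ in range(num_workers)
    ]

    await queue.join()  # wait until every candidate has been evaluated
    for task in workers:
        task.cancel()  # workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    scored = asyncio.run(explore(num_candidates=8, num_workers=4))
    print(len(scored), "candidates evaluated")
```

In this toy setup, doubling `num_workers` roughly halves wall-clock time for a fixed number of candidates, which is the sense in which throughput scales with the number of workers; real scaling depends on how evenly work is distributed and on shared bottlenecks.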