Over the past months, Cohort I of our RL Residency has been shipping.
Highlights:
- continual learning
- automating AI research (from GPU programming to RL itself)
- embodied environments
- multi-agent systems
- materials science discovery
CARLA-Env – @myainotez
An open-source embodied RL environment based on the CARLA simulator. It provides high-fidelity physics, sensors, and configurable urban scenarios for training and evaluating decision-making agents.
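To make the training loop concrete, here is a minimal sketch of the reset/step interface such embodied environments typically expose. The toy env below is purely illustrative (its class name, observations, and reward are invented for this sketch); the real CARLA-Env provides high-fidelity sensors and physics rather than a one-dimensional route.

```python
import random

class ToyDrivingEnv:
    """Toy stand-in for an embodied driving environment.

    Illustrates the reset/step loop agents train against; a real
    simulator-backed env returns camera/lidar observations instead
    of a scalar position.
    """

    def __init__(self, route_length=10):
        self.route_length = route_length
        self.position = 0

    def reset(self, seed=None):
        random.seed(seed)
        self.position = 0
        return {"position": self.position, "speed": 0.0}

    def step(self, action):
        # action: 0 = brake, 1 = accelerate
        self.position += action
        observation = {"position": self.position, "speed": float(action)}
        reward = 1.0 if action == 1 else 0.0  # reward forward progress
        terminated = self.position >= self.route_length
        return observation, reward, terminated

env = ToyDrivingEnv()
obs = env.reset(seed=0)
total, done = 0.0, False
while not done:
    obs, r, done = env.step(1)  # trivial policy: accelerate every step
    total += r
```

The same reset/step contract is what makes an environment pluggable into different RL trainers.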
A dataset and RL environment based on the book “Programming Massively Parallel Processors,” focused on CUDA and GPU programming skills. It includes verifiable coding exercises and a frontier eval built from the same material.
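"Verifiable" here means a submission can be scored automatically against a trusted reference. A minimal sketch of that grading pattern, using a SAXPY-style exercise (the classic GPU warm-up, shown as a CPU reference; the function names and tolerance are assumptions for this sketch, not the environment's actual API):

```python
import random

def reference_saxpy(a, x, y):
    """Trusted reference: y <- a*x + y, elementwise (CPU version)."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def verify(candidate, trials=100, n=64, seed=0):
    """Score a submitted implementation against the reference.

    Sketch of a verifiable exercise: the candidate must agree with the
    reference on many randomized inputs to earn reward.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.uniform(-2, 2)
        x = [rng.uniform(-1, 1) for _ in range(n)]
        y = [rng.uniform(-1, 1) for _ in range(n)]
        got = candidate(a, x, y)
        want = reference_saxpy(a, x, y)
        if any(abs(g - w) > 1e-9 for g, w in zip(got, want)):
            return 0.0  # a failed case means no reward
    return 1.0  # all randomized cases pass
```

Randomized reference checks like this give an unambiguous reward signal for RL without any human in the loop.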
First introduced by @a1zhang in Oct 2025, the RLM (Recursive Language Model) has access to its inputs through a variable in a persistent Python REPL.
The model can inspect & transform that variable with code, and pipe parts of it into sub-LLMs via tool calls, without ever loading the potentially huge input into its own context.
RLMs are a simple, flexible form of context folding that doesn't depend on lossy summarization.
Instead, the model proactively delegates context to:
- Python scripts (search, filter, transform)
- Sub-LLMs (fresh instances) for parallel work
- Iterative answer edits until the answer is actually correct
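The delegation pattern above can be sketched in a few lines. This is an illustrative toy, not the RLM implementation: `call_sub_llm` is a hypothetical stand-in for a real model call, and the "document" is synthetic, but the shape of the loop matches the description: inspect the variable with code, filter to the relevant slice, and hand only that slice to a fresh sub-LLM.

```python
# Hypothetical stand-in for a real sub-LLM call; here it just counts
# the lines that mention "error" in whatever slice it is given.
def call_sub_llm(prompt: str) -> str:
    return str(sum("error" in line for line in prompt.splitlines()))

# The potentially huge input lives in a REPL variable; the model never
# sees it in full inside its own context window.
document = "\n".join(
    f"line {i}: {'error' if i % 7 == 0 else 'ok'}" for i in range(1000)
)

# Step 1: inspect the variable cheaply with code.
n_lines = document.count("\n") + 1

# Step 2: filter/transform down to the relevant slice.
relevant = [line for line in document.splitlines() if "error" in line]

# Step 3: pipe only that small slice into a fresh sub-LLM instance.
answer = call_sub_llm("\n".join(relevant))
```

Because only the filtered slice crosses into a model context, the pattern scales to inputs far larger than any context window, with no lossy summarization step.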
Introducing INTELLECT-3: Scaling RL to a 100B+ MoE model on our end-to-end stack
Achieving state-of-the-art performance for its size across math, code and reasoning
Built using the same tools we put in your hands, from environments & evals to RL frameworks, sandboxes & more
INTELLECT-3 is a 106B parameter Mixture-of-Experts model trained with both SFT and RL on top of the GLM 4.5 Air Base model.
Both stages, including multiple ablations, were carried out on a 512-GPU H200 cluster over the course of two months.
Our Training Stack
+ PRIME-RL: Our scalable, asynchronous RL trainer
+ Verifiers: Our unified library used for hundreds of envs and evals on the Environments Hub
+ Sandboxes: Custom container infra optimized for agentic RL
+ Compute: Orchestration & observability for 512 H200s
We're scaling our Open-Source Environments Program
As part of this, we're committing hundreds of thousands of dollars in bounties, and we're looking for partners who want to join our mission to accelerate open superintelligence
Join us in building the global hub for environments and evals
Over the past 2 months, we've crowdsourced 400+ environments and 80+ verified implementations through our bounties and RL residency across:
+ Autonomous AI Research
+ Browser Automation
+ Theorem Proving
+ Subject-Specific QA
+ Legal/Finance Tasks
+ Many more...
Thank you to everyone who's claimed a bounty or joined the residency!