Prime Intellect
May 12, 2025 · 12 tweets
Releasing INTELLECT-2: We’re open-sourcing the first 32B parameter model trained via globally distributed reinforcement learning:

• Detailed Technical Report
• INTELLECT-2 model checkpoint

primeintellect.ai/blog/intellect…
To train a model with reinforcement learning in a fully decentralized setting using community-contributed GPUs, we open-source several novel infrastructure components.
PRIME-RL: A fully asynchronous reinforcement learning framework designed for decentralized training. Decoupling of rollout generation, model training, and weight broadcasting enables training across heterogeneous, unreliable networks.
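
A minimal sketch of what that decoupling could look like (illustrative only; the queues, helpers, and version counters below are stand-ins, not the PRIME-RL API):

```python
# Illustrative sketch only (not the PRIME-RL API): rollout generation,
# training, and weight broadcasting run as independent tasks that talk
# through queues, so a slow or dropped inference worker never blocks the
# training step. All helpers below are toy stand-ins.
import asyncio

async def rollout_worker(weights_box, rollout_q, n_steps):
    for step in range(n_steps):
        # Read whatever policy version has been broadcast so far (may lag).
        rollout = {"policy_version": weights_box["version"], "data": f"rollout-{step}"}
        await rollout_q.put(rollout)
        await asyncio.sleep(0.01)              # stands in for generation time

async def trainer(rollout_q, ckpt_q, n_steps):
    for step in range(n_steps):
        batch = await rollout_q.get()          # possibly generated by an older policy
        new_version = step + 1                 # stands in for a GRPO update on `batch`
        await ckpt_q.put(new_version)

async def broadcaster(ckpt_q, weights_box, n_steps):
    for _ in range(n_steps):
        weights_box["version"] = await ckpt_q.get()   # stands in for a weight push
        print(f"broadcast policy v{weights_box['version']}")

async def main(n_steps=5):
    rollout_q, ckpt_q = asyncio.Queue(), asyncio.Queue()
    weights_box = {"version": 0}               # shared handle to the latest policy
    await asyncio.gather(
        rollout_worker(weights_box, rollout_q, n_steps),
        trainer(rollout_q, ckpt_q, n_steps),
        broadcaster(ckpt_q, weights_box, n_steps),
    )

asyncio.run(main())
```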
SHARDCAST: A library for distributing large files via an HTTP-based tree-topology network that efficiently propagates updated model weights from training nodes to the decentralized inference workers.
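
A toy sketch of the tree-topology fan-out idea (the fanout, node names, and URL scheme are assumptions, not SHARDCAST's actual protocol):

```python
# Toy sketch of an HTTP tree-topology broadcast: each relay downloads
# checkpoint shards from its parent and re-serves them, so the origin's
# bandwidth does not need to scale with the number of inference workers.
def build_tree(nodes, fanout=3):
    """Assign every node a parent; index 0 is the origin training node."""
    return {nodes[i]: nodes[(i - 1) // fanout] for i in range(1, len(nodes))}

nodes = ["origin"] + [f"worker-{i}" for i in range(1, 10)]
for node, parent in build_tree(nodes).items():
    # Each worker would fetch shards over plain HTTP from its parent, e.g.
    # GET http://<parent>/checkpoints/<step>/shard-<k>  (hypothetical path)
    print(f"{node} pulls shards from {parent}")
```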
TOPLOC Validators: A validator service that uses TOPLOC proofs to verify rollouts from untrusted inference workers before they are used for model training.
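
A placeholder sketch of the gating step only; the dummy verifier below is not how TOPLOC proofs actually work (see the report for the real scheme):

```python
# Only rollouts from untrusted workers whose proof verifies are admitted
# into the training pool. The verifier here is a toy stand-in.
def filter_trusted(rollouts, verify_proof):
    accepted, rejected = [], []
    for rollout in rollouts:
        bucket = accepted if verify_proof(rollout["proof"], rollout["tokens"]) else rejected
        bucket.append(rollout)
    return accepted, rejected

# Toy usage with a dummy verifier that just checks a token count.
rollouts = [{"tokens": [1, 2, 3], "proof": 3}, {"tokens": [4, 5], "proof": 99}]
ok, bad = filter_trusted(rollouts, lambda proof, toks: proof == len(toks))
print(f"{len(ok)} accepted, {len(bad)} rejected")
```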
INTELLECT-2 is trained with rule-based rewards on math and coding problems, plus length rewards that guide the model to follow its thinking budget. We introduce modifications to the standard GRPO recipe to improve training stability and encourage faster learning.

Two-step asynchronous RL: The broadcast of new policy weights is fully overlapped with ongoing inference and training, eliminating communication bottlenecks.
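
A toy timeline of that schedule (a simplification for illustration, not the exact scheduling from the report):

```python
# Training, broadcasting, and generation all overlap each step, so the
# trainer consumes rollouts produced by a policy that is two updates old.
for n in range(2, 6):
    print(f"step {n}: train on rollouts from policy v{n - 2} | "
          f"broadcast policy v{n - 1} | "
          f"generate step-{n + 1} rollouts with policy v{n - 1}")
```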
Two-Sided GRPO Clipping: Stabilizes training by mitigating gradient spikes with two-sided token probability ratio clipping.
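
One possible form of two-sided ratio clipping, in the spirit of dual-clip PPO (the bounds eps and delta below are illustrative, not the INTELLECT-2 values; see the technical report for the exact formulation):

```python
# With a negative advantage and a very large probability ratio, the standard
# one-sided surrogate keeps the unclipped term and the loss is unbounded.
# Capping the ratio from above as well keeps the gradient bounded.
import torch

def grpo_token_loss(logp_new, logp_old, advantage, eps=0.2, delta=4.0):
    ratio = torch.exp(logp_new - logp_old)                  # pi_theta / pi_old
    standard = torch.minimum(ratio * advantage,
                             torch.clamp(ratio, 1 - eps, 1 + eps) * advantage)
    # Two-sided variant: additionally cap the ratio at delta for negative
    # advantages so such tokens cannot blow up the update.
    capped = torch.minimum(torch.clamp(ratio, max=delta) * advantage,
                           torch.clamp(ratio, 1 - eps, 1 + eps) * advantage)
    two_sided = torch.where(advantage < 0, capped, standard)
    return -standard, -two_sided                            # losses to minimize

# A token whose probability grew ~50x under the new policy but has A < 0:
one_sided, bounded = grpo_token_loss(torch.tensor(0.0), torch.tensor(-3.9),
                                     advantage=torch.tensor(-1.0))
print(one_sided.item(), bounded.item())   # ~49.4 vs 4.0
```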
Advanced Data Filtering: Combines offline and online filtering to select challenging tasks, significantly enhancing model learning efficiency.
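
A minimal sketch of the two stages (thresholds and field names are assumptions, not the training config):

```python
# Offline: keep only tasks whose measured pass rate is neither trivial nor
# hopeless. Online: drop rollout groups whose rewards are all identical,
# since they carry zero advantage and no learning signal.
def offline_filter(tasks, lo=0.1, hi=0.9):
    return [t for t in tasks if lo <= t["pass_rate"] <= hi]

def online_filter(groups):
    return [g for g in groups if len(set(g["rewards"])) > 1]

tasks = [{"id": 1, "pass_rate": 0.05}, {"id": 2, "pass_rate": 0.5}, {"id": 3, "pass_rate": 1.0}]
groups = [{"task": 2, "rewards": [1, 1, 1, 1]}, {"task": 2, "rewards": [0, 1, 0, 1]}]
print([t["id"] for t in offline_filter(tasks)])   # -> [2]
print(len(online_filter(groups)))                 # -> 1
```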
Experiments:
We report results from two main experiments: TARGET-SHORT, an experimental run with short target lengths to train an efficient reasoning model, and TARGET-LONG, our main run with longer target lengths.

Reward Trajectories:
Benchmark Performance:
We were able to improve QwQ-32B's performance on math and coding benchmarks. Since QwQ-32B is already a strong, heavily RL-trained model, large further gains will likely require better base models or higher-quality data.
INTELLECT-2 demonstrates that globally decentralized RL works.

Now, we’re focusing on tool-assisted reasoning, crowdsourcing higher-quality data, and optimizing our infrastructure and training recipe to build frontier open models.

Join us to build open source and decentralized AGI.
Links
• Detailed Technical Report: primeintellect.ai/intellect-2
• INTELLECT-2 on Hugging Face: huggingface.co/collections/Pr…
• Chat Interface to try it out: chat.primeintellect.ai
• Blog: primeintellect.ai/blog/intellect…

More from @PrimeIntellect

Jan 27
We're excited to introduce @arcee_ai's Trinity Large model.

An open 400B parameter Mixture of Experts model, delivering frontier-level performance with only 13B active parameters.

Trained in collaboration between Arcee, Datology and Prime Intellect.
Trinity Architecture

Key design choices:
- Interleaved local + global attention (3:1 pattern)
- Grouped-query + gated attention
- New load-balancing method (SMEBU)
- Depth scaled sandwich norm and QK norm

With extreme sparsity, built for long context and fast inference. (A toy sketch of the attention interleaving follows below.)
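
A toy illustration of a 3:1 local/global interleaving (the layer count and indexing are made up, not Trinity's actual configuration):

```python
# Three sliding-window "local" attention layers for every full-context
# "global" layer.
def attention_pattern(n_layers, local_per_global=3):
    period = local_per_global + 1
    return ["global" if (i + 1) % period == 0 else "local" for i in range(n_layers)]

print(attention_pattern(8))
# -> ['local', 'local', 'local', 'global', 'local', 'local', 'local', 'global']
```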
Infrastructure

- Large-scale synthetic data generation on ~2k H100s
- Training Trinity Large on 2k B300 GPUs

Training stack:
- Modified torchtitan
- Muon optimizer
- HSDP with FSDP group size 128
- Expert parallelism
- Context parallelism for context extension
- Improvements to recover quickly from hardware failures
Jan 1
We believe the next breakthrough in long-horizon agents is training models to manage their own context.

Introducing our new research direction on Recursive Language Models.

We are sharing our initial experiments showing the promise of RLMs.

primeintellect.ai/blog/rlm
First introduced by @a1zhang in Oct 2025, the RLM has access to its inputs through a variable in a persistent Python REPL.

The model can inspect & transform that variable with code, and pipe parts of it into sub-LLMs with tools, without ever loading the potentially huge input data into its context. (A minimal toy sketch follows after the list below.)
RLMs are a simple, flexible form of context folding that doesn't depend on lossy summarization.

Instead, the model proactively delegates context to:

- Python scripts (search, filter, transform)
- Sub-LLMs (fresh instances) for parallel work
- Iterative answer edits until it's actually correct
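
A minimal toy sketch of the idea (the variable name, helpers, and sub-LLM call are stand-ins, not the actual RLM implementation):

```python
# The huge input lives in a REPL variable; the model narrows it down with
# ordinary Python and only small excerpts are ever handed to sub-LLM calls.
context = "\n".join(                      # stands in for a huge input document
    f"line {i}: " + ("ERROR disk full" if i % 97 == 0 else "ok")
    for i in range(10_000)
)

# 1. Inspect the variable instead of pasting it into the prompt.
print(len(context), context[:24])

# 2. Transform: narrow the input down with code (search, filter, slice).
error_lines = [line for line in context.splitlines() if "ERROR" in line]

# 3. Delegate: pipe only the small excerpt into a fresh sub-LLM instance.
def call_sub_llm(prompt):                 # hypothetical stand-in for an API call
    return f"[sub-LLM answer over a {len(prompt)}-char prompt]"

print(call_sub_llm("Summarize these errors:\n" + "\n".join(error_lines[:50])))
```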
Nov 27, 2025
Introducing INTELLECT-3: Scaling RL to a 100B+ MoE model on our end-to-end stack

Achieving state-of-the-art performance for its size across math, code and reasoning

Built using the same tools we put in your hands, from environments & evals, RL frameworks, sandboxes & more
INTELLECT-3 is a 106B parameter Mixture-of-Experts model trained with both SFT and RL on top of the GLM 4.5 Air Base model.

Both stages, including multiple ablations, were carried out on a 512-GPU H200 cluster over the course of two months.
Our Training Stack

+ PRIME-RL: Our scalable, asynchronous RL trainer
+ Verifiers: Our unified library used for hundreds of envs and evals on the Environments Hub
+ Sandboxes: Custom container infra optimized for agentic RL
+ Compute: Orchestration & observability for 512 H200s
Oct 27, 2025
We're scaling our Open-Source Environments Program

As part of this, we're committing hundreds of thousands of $ in bounties and looking for partners who want to join our mission to accelerate open superintelligence

Join us in building the global hub for environments and evals
Over the past 2 months, we've crowdsourced 400+ environments and 80+ verified implementations through our bounties and RL residency across:

+ Autonomous AI Research
+ Browser Automation
+ Theorem Proving
+ Subject-Specific QA
+ Legal/Finance Tasks
+ Many more...
Thank you to everyone who's claimed a bounty or joined the residency!

@alexinexxx @xlr8harder @LatentLich @myainotez @ChaseBrowe32432 @varunneal @vyomdundigalla @amit05prakash @minjunesh @sidbing @unrelated333 @ljt019 @lakshyaag @sid_899 @srthkdev @semiozz @ibnAmjid and more!
Sep 25, 2025
Another week, another hundred environments.

From autonomous AI research, MCP integrations, and browser automation to domain-specific environments for economically valuable tasks across law, finance, and tax.
NanoGPT Speedrun

Evaluate the code-generation and pretraining capabilities of LLMs via the NanoGPT Speedrun benchmark.

By @leloykun
app.primeintellect.ai/dashboard/envi…
MLE-Bench

Environment for solving Kaggle ML competitions from MLE-bench.

By @creet_z
app.primeintellect.ai/dashboard/envi…
Sep 15, 2025
Today we're launching Reserved Instances

- Request 8–1,000+ GPU clusters
- Get quotes from 50+ providers in 24h
- Re-sell idle GPUs back to our spot market
- Support from our research team
Expanding our Compute Exchange

- Find the best and most cost-effective reserved instance offers across 50+ providers
- Re-sell idle GPUs from your reserved cluster on our liquid compute market
- H100s, H200s, B200s, and NVL72 clusters available today
Additional Features

- Orchestration with SLURM, Ray or Kubernetes
- Monitoring with Grafana dashboards
- Native integrations into our full-stack infra offering: Environment Hub, Sandboxes, Reinforcement Fine-Tuning, Multi-Node Training
- Dedicated support from our research team
