Sakana AI Profile picture
Jun 22 3 tweets 3 min read Read on X
Introducing Sakana Fugu: A full multi-agent orchestration system accessible via a single model API.

Our ‘Fugu Ultra’ model matches the performance of Fable and Mythos, delivering frontier capability without the risk of export controls.

Try it: 🐡sakana.ai/fugu
Fugu stands shoulder-to-shoulder with leading models like Fable and Mythos across the industry's most rigorous engineering, scientific, and reasoning benchmarks.

Read the full blog: sakana.ai/fugu-release

Beyond Bigger Models: Why are Orchestration Models the Next Frontier

Progress in AI has been driven largely by giant, monolithic models. But the most powerful systems of the future will be collaborative ecosystems.

Today, this orchestration is no longer just a technical optimization. It has become a geopolitical and operational imperative.

For an organization or a nation, relying on a single company's model for critical infrastructure, finance, or governance is a material vulnerability. This risk is no longer a hypothetical possibility, but a reality.

As we have seen with recent export controls imposed on models like Fable and Mythos, access can disappear overnight.

Collective intelligence is the practical hedge against this concentration of power. Because Fugu orchestrates an underlying pool of swappable agents, it simply routes around vendor restrictions.

By orchestrating the world’s models, we are delivering the resilient blueprint required for true AI sovereignty.Image
How does it work?

Sakana Fugu is itself an LLM, trained to call various LLMs in an agent pool, including instances of itself recursively. Fugu dynamically orchestrates the world's best models to tackle complex, multi-step tasks.

As shown in this figure, Fugu is a multi-agent system that behaves like a single model. You send a request to one endpoint, and Fugu decides how to handle it internally.

Fugu manages model selection, delegation, verification, and synthesis automatically. It solves tasks directly when that is enough, or coordinates a team of expert models when a problem calls for more. The complexity of a multi-agent system never reaches your code.

At launch, Sakana Fugu comes in two models accessed via a single OpenAI-compatible API:

• Fugu balances strong performance with low latency for everyday work. It fits naturally into tools like Codex for coding, as well as chatbots and interactive services. You can also opt specific agents out of its pool for data compliance.

• Fugu Ultra is our flagship model tuned for maximum answer quality on hard, multi-step problems. It coordinates a deeper pool of expert agents for demanding work like AI research, cybersecurity analysis, and patent investigations.Image

• • •

Missing some Tweet in this thread? You can try to force a refresh
 

Keep Current with Sakana AI

Sakana AI Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

More from @SakanaAILabs

Jul 1, 2025
We’re excited to introduce AB-MCTS!

Our new inference-time scaling algorithm enables collective intelligence for AI by allowing multiple frontier models (like Gemini 2.5 Pro, o4-mini, DeepSeek-R1-0528) to cooperate.

Blog: sakana.ai/ab-mcts
Paper: arxiv.org/abs/2503.04412

Inspired by the power of human collective intelligence, where the greatest achievements arise from the collaboration of diverse minds, we believe the same principle applies to AI. Individual frontier models like ChatGPT, Gemini, and DeepSeek are remarkably advanced, each possessing unique strengths and biases stemming from their training, which we view as valuable resources for collective problem-solving.

AB-MCTS (Adaptive Branching Monte Carlo Tree Search) harnesses these individualities, allowing multiple models to cooperate and engage in effective trial-and-error, solving challenging problems for any single AI. Our initial results on the ARC-AGI-2 benchmark are promising, with AB-MCTS combining o4-mini + Gemini-2.5-Pro + R1-0528, current frontier AI models, significantly outperforming individual models by a substantial margin.

This research builds on our 2024 work on evolutionary model merging, shifting focus from “mixing to create” to “mixing to use” existing, powerful AIs. At Sakana AI, we remain committed to pioneering novel AI systems by applying nature-inspired principles such as evolution and collective intelligence. We believe this work represents a step toward a future where AI systems collaboratively tackle complex challenges, much like a team of human experts, unlocking new problem-solving capabilities and moving beyond single-model limitations.

Algorithm (TreeQuest): github.com/SakanaAI/treeq…
ARC-AGI Experiments: github.com/SakanaAI/ab-mc…Image
The AB-MCTS combination of o4-mini + Gemini-2.5-Pro + R1-0528, current frontier AI models, achieves strong performance on the ARC-AGI-2 benchmark, outperforming individual models by a large margin.

We open-sourced our implementation of AB-MCTS:
github.com/SakanaAI/treeq…Results of AB-MCTS and Multi-LLM AB-MCTS on ARC-AGI-2, showing Pass@k as a function of the number of LLM calls.
Many ARC-AGI-2 examples that were unsolvable by any single LLM were solved by combining multiple LLMs. In some cases, an initially incorrect attempt by o4-mini is used by R1-0528 and Gemini-2.5-Pro as a hint to get to the correct solution.

ARC-AGI-2 code:
github.com/SakanaAI/ab-mc…An example problem from ARC-AGI-2. The task is to infer the common transformation rule from the three demonstration cases on the left and apply it to the test case on the right. This is one of the problems that became solvable using Multi-LLM AB-MCTS.
Read 4 tweets

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Don't want to be a Premium member but still want to support us?

Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Or Donate anonymously using crypto!

Ethereum

0xfe58350B80634f60Fa6Dc149a72b4DFbc17D341E copy

Bitcoin

3ATGMxNzCUFzxpMCHL5sWSt4DVtS8UqXpi copy

Thank you for your support!

Follow Us!

:(