Kimi.ai
Mar 16 • 4 tweets • 2 min read
Introducing Attention Residuals: Rethinking depth-wise aggregation.

Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.
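To make the contrast concrete, here is a minimal Python sketch. It is not taken from the report: the per-layer `query` vector, the scaled dot-product scoring, and the function names are assumptions chosen for illustration. `standard_residual` gives every earlier output a fixed weight of 1, while `attention_residual` derives softmax weights from the stored outputs themselves.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def standard_residual(outputs):
    """Fixed, uniform accumulation: every earlier output gets weight 1."""
    dim = len(outputs[0])
    return [sum(o[d] for o in outputs) for d in range(dim)]


def attention_residual(outputs, query):
    """Softmax-weighted mix of earlier layer outputs.

    `query` is a hypothetical learned vector for the current layer
    (an assumption of this sketch). The scores depend on the stored
    outputs, so the mixing weights change with the input.
    """
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, o) * scale for o in outputs])
    dim = len(outputs[0])
    mixed = [sum(w * o[d] for w, o in zip(weights, outputs)) for d in range(dim)]
    return mixed, weights


# Three toy "layer outputs" of width 2, and one toy query.
outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]

uniform = standard_residual(outputs)
mixed, weights = attention_residual(outputs, query)
print("uniform sum:", uniform)
print("attention weights:", weights)
print("attention mix:", mixed)
```

In this toy setup the layers whose outputs align with the query receive more weight, which is the selective-retrieval behaviour the thread describes.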

🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.
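The Block AttnRes idea can be sketched in the same spirit. The thread only says that layers are partitioned into compressed blocks, so everything specific below is an assumption for illustration: the compression rule (summing the outputs inside each block), the hypothetical `query` vector, and the function names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compress_blocks(outputs, block_size):
    """Collapse each run of `block_size` layer outputs into one vector.

    Summation is an assumed compression rule, not one stated in the thread.
    """
    dim = len(outputs[0])
    blocks = []
    for start in range(0, len(outputs), block_size):
        chunk = outputs[start:start + block_size]
        blocks.append([sum(o[d] for o in chunk) for d in range(dim)])
    return blocks


def block_attention_residual(outputs, query, block_size):
    """Attend over block summaries instead of over every earlier layer."""
    blocks = compress_blocks(outputs, block_size)
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * b for q, b in zip(query, blk)) for blk in blocks]
    weights = softmax(scores)
    dim = len(blocks[0])
    mixed = [sum(w * blk[d] for w, blk in zip(weights, blocks)) for d in range(dim)]
    return mixed, weights


# Eight toy layer outputs of width 2, grouped into two blocks of four.
outputs = [[float(i), 1.0] for i in range(8)]
blocks = compress_blocks(outputs, block_size=4)
mixed, weights = block_attention_residual(outputs, [0.1, 0.0], block_size=4)
print("block summaries:", blocks)
print("block weights:", weights)
```

With 8 layers and a block size of 4, the attention step scores 2 entries rather than 8, which is the kind of saving that makes cross-layer attention cheaper as depth grows.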

🔗 Full report:
github.com/MoonshotAI/Att…
Scaling law experiments reveal a consistent 1.25× compute advantage across varying model sizes.
Analysis of training dynamics demonstrates how AttnRes naturally mitigates hidden-state magnitude growth and yields a more uniform gradient distribution across depth.
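One way to build intuition for the magnitude claim is a toy comparison; it says nothing about the gradient claim. The sketch below assumes mutually orthogonal unit vectors as stand-in layer outputs and arbitrary attention scores; neither comes from the report. A plain sum of such vectors has norm equal to the square root of the depth, while any softmax-weighted mix is a convex combination and so cannot exceed the largest individual norm.

```python
import math


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(i, dim):
    return [1.0 if j == i else 0.0 for j in range(dim)]


def compare(depth):
    """Return (norm of uniform sum, norm of softmax mix) for `depth` layers."""
    # Stand-in layer outputs: orthogonal unit vectors (an assumption).
    outputs = [one_hot(i, depth) for i in range(depth)]
    summed = [sum(o[d] for o in outputs) for d in range(depth)]
    # Arbitrary illustrative scores; a real model would compute these.
    weights = softmax([0.1 * i for i in range(depth)])
    mixed = [sum(w * o[d] for w, o in zip(weights, outputs)) for d in range(depth)]
    return norm(summed), norm(mixed)


for depth in (4, 16, 64):
    sum_norm, mix_norm = compare(depth)
    print(depth, round(sum_norm, 3), round(mix_norm, 3))
```

The first column of norms grows with depth and the second stays at or below 1, which matches the direction of the effect described above without reproducing the paper's measurements.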
For more details, check out our paper here: github.com/MoonshotAI/Att…

More from @Kimi_Moonshot

Nov 28, 2025
Meet Kimi Agentic Slides!
Now with Nano Banana Pro 🍌

🎁 Thanksgiving Gift: 48H FREE & UNLIMITED ACCESS

🔸 Agentic search (Kimi K2)
🔸 Files → Slides (PDFs, images, docs+)
🔸 Fully editable + PPTX export
🔸 Designer-level visuals (infographics, illustrations)

Try now: kimi.com/slides
Here's a quick guide. 👇
Research paper -> Presentation Ready Deck
Jul 11, 2025
🚀 Hello, Kimi K2! Open-Source Agentic Model!
🔹 1T total / 32B active MoE model
🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models
🔹 Strong in coding and agentic tasks
🐀 Multimodal & thought-mode not supported for now

With Kimi K2, advanced agentic intelligence is more open and accessible than ever. We can't wait to see what you build!

🔌 API is here: platform.moonshot.ai
- $0.15 / million input tokens (cache hit)
- $0.60 / million input tokens (cache miss)
- $2.50 / million output tokens

🔗 Tech blog: moonshotai.github.io/Kimi-K2/
🔗 Weights & code: huggingface.co/moonshotai
🔗 GitHub: github.com/MoonshotAI/Kim…
Try it now at Kimi.ai or via API!
Here are some vibe tests we ran:

1. Interactive 3D Mountain Scene
2. A ball bouncing in hexagon
Jun 20, 2025
Meet Kimi-Researcher - an autonomous agent that excels at multi-turn search and reasoning. Powered by k 1.5 and trained with end-to-end agentic RL.

Achieved 26.9% pass@1 on Humanity's Last Exam, 69% pass@1 on xbench.

🔗 Tech blog: moonshotai.github.io/Kimi-Researche…
Benchmarks aside, it thinks:
→ 23 reasoning steps per task (avg.)
→ 200+ URLs explored
→ Multi-turn tool use of search, browser, and code
→ Inline citations

Beta access is rolling out at kimi.com. Get on the waitlist 👉 [docs.google.com/forms/d/e/1FAI…]
Join the discussion & share feedback in our Discord. 👉 discord.gg/uGqNmXhNhM

To facilitate more research efforts in the field, we plan to open-source the base pretrained model as well as the reinforcement-learned model underlying Kimi-Researcher in the coming months.