Latest Twitter Threads by @deepseek_ai on Thread Reader App

Dec 1 • 5 tweets • 3 min read

🚀 Launching DeepSeek-V3.2 & DeepSeek-V3.2-Speciale — Reasoning-first models built for agents!

🔹 DeepSeek-V3.2: Official successor to V3.2-Exp. Now live on App, Web & API.
🔹 DeepSeek-V3.2-Speciale: Pushing the boundaries of reasoning capabilities. API-only for now.

📄 Tech report:

1/nhuggingface.co/deepseek-ai/De…

🏆 World-Leading Reasoning

🔹 V3.2: Balanced inference vs. length. Your daily driver at GPT-5 level performance.
🔹 V3.2-Speciale: Maxed-out reasoning capabilities. Rivals Gemini-3.0-Pro.
🥇 Gold-Medal Performance: V3.2-Speciale attains gold-level results in IMO, CMO, ICPC World Finals & IOI 2025.

📝 Note: V3.2-Speciale dominates complex tasks but requires higher token usage. Currently API-only (no tool-use) to support community evaluation & research.

2/n

Sep 29 • 4 tweets • 2 min read

🚀 Introducing DeepSeek-V3.2-Exp — our latest experimental model!

✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention(DSA) for faster, more efficient training & inference on long context.
👉 Now live on App, Web, and API.
💰 API prices cut by 50%+!

1/n ⚡️ Efficiency Gains

🤖 DSA achieves fine-grained sparse attention with minimal impact on output quality — boosting long-context performance & reducing compute cost.
📊 Benchmarks show V3.2-Exp performs on par with V3.1-Terminus.

2/n

Aug 21 • 5 tweets • 3 min read

Introducing DeepSeek-V3.1: our first step toward the agent era! 🚀

🧠 Hybrid inference: Think & Non-Think — one model, two modes
⚡️ Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528
🛠️ Stronger agent skills: Post-training boosts tool use and multi-step agent tasks

Try it now — toggle Think/Non-Think via the "DeepThink" button: chat.deepseek.com

1/5 API Update ⚙️

🔹 deepseek-chat → non-thinking mode
🔹 deepseek-reasoner → thinking mode
🧵 128K context for both
🔌 Anthropic API format supported: api-docs.deepseek.com/guides/anthrop…
✅ Strict Function Calling supported in Beta API: api-docs.deepseek.com/guides/functio…
🚀 More API resources, smoother API experience

2/5

Jan 20 • 5 tweets • 3 min read

🚀 DeepSeek-R1 is here!

⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!

🌐 Website & API are live now! Try DeepThink at today!

🐋 1/n chat.deepseek.com

🔥 Bonus: Open-Source Distilled Models!

🔬 Distilled from DeepSeek-R1, 6 small models fully open-sourced
📏 32B & 70B models on par with OpenAI-o1-mini
🤝 Empowering the open-source community

🌍 Pushing the boundaries of **open AI**!

🐋 2/n

Dec 26, 2024 • 4 tweets • 2 min read

🚀 Introducing DeepSeek-V3!

Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers

🐋 1/n

🎉 What’s new in V3?

🧠 671B MoE parameters
🚀 37B activated parameters
📚 Trained on 14.8T high-quality tokens

🔗 Dive deeper here:
Model 👉 github.com/deepseek-ai/De…
Paper 👉 github.com/deepseek-ai/De…

🐋 2/n

Sep 6, 2024 • 4 tweets • 2 min read

🚀 Exciting news! We’ve officially launched DeepSeek-V2.5 – a powerful combination of DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724! Now, with enhanced writing, instruction-following, and human preference alignment, it’s available on Web and API. Enjoy seamless Function Calling, FIM, and Json Output all-in-one!

Note: Due to significant updates in this version, if performance drops in certain cases, we recommend adjusting the system prompt and temperature settings for the best results!

DeepSeek-V2.5 outperforms both DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724 on most benchmarks.

Share this page!

Enter URL or ID to Unroll