Post

More from @Alibaba_Qwen

Qwen

@Alibaba_Qwen

Sep 11

🚀 Introducing Qwen3-Next-80B-A3B — the FUTURE of efficient LLMs is here!

🔹 80B params, but only 3B activated per token → 10x cheaper training, 10x faster inference than Qwen3-32B.(esp. @ 32K+ context!)
🔹Hybrid Architecture: Gated DeltaNet + Gated Attention → best of speed & recall
🔹 Ultra-sparse MoE: 512 experts, 10 routed + 1 shared
🔹 Multi-Token Prediction → turbo-charged speculative decoding
🔹 Beats Qwen3-32B in perf, rivals Qwen3-235B in reasoning & long-context

🧠 Qwen3-Next-80B-A3B-Instruct approaches our 235B flagship.
🧠 Qwen3-Next-80B-A3B-Thinking outperforms Gemini-2.5-Flash-Thinking.

Try it now: chat.qwen.ai
Blog: qwen.ai/blog?id=4074cc…
Huggingface: huggingface.co/collections/Qw…
ModelScope: modelscope.cn/collections/Qw…
Kaggle: kaggle.com/models/qwen-lm…
Alibaba Cloud API: alibabacloud.com/help/en/model-…

Pretraining Efficiency & Inference Speed

Prefill Stage: At 4K context length, throughput is nearly 7x higher than Qwen3-32B. Beyond 32K, it’s over 10x faster.

Read 9 tweets

Qwen

@Alibaba_Qwen

Aug 18

🚀 Excited to introduce Qwen-Image-Edit!
Built on 20B Qwen-Image, it brings precise bilingual text editing (Chinese & English) while preserving style, and supports both semantic and appearance-level editing.

✨ Key Features
✅ Accurate text editing with bilingual support
✅ High-level semantic editing (e.g. object rotation, IP creation)
✅ Low-level appearance editing (e.g. addition/delete/insert)

Try it now: chat.qwen.ai/?inputFeature=…
Hugging Face: huggingface.co/Qwen/Qwen-Imag…
ModelScope: modelscope.cn/models/Qwen/Qw…
Blog: qwenlm.github.io/blog/qwen-imag…
Github: github.com/QwenLM/Qwen-Im…
API: alibabacloud.com/help/en/model-…

Read 16 tweets

Qwen

@Alibaba_Qwen

May 9

🚀 One line. A full webpage. No hassle.

Introducing Web Dev – the ultimate tool for building stunning frontend webpages & apps using simple prompts in Qwen Chat.

🎨 Just say, "create a twitter website" — and boom! Instant code, ready to go.

No coding required. Just your imagination.

👉🏻 chat.qwen.ai/?inputFeature=…

Make an interesting animate

Create a semantic "Contact Support" form with fields for the user's name, email, issue type, and message. Arrange the form elements vertically within a card.

Read 6 tweets

Qwen

@Alibaba_Qwen

Apr 28

Introducing Qwen3!

We release and open-weight Qwen3, our latest large language models, including 2 MoE models and 6 dense models, ranging from 0.6B to 235B. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general capabilities, etc., when compared to other top-tier models such as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro. Additionally, the small MoE model, Qwen3-30B-A3B, outcompetes QwQ-32B with 10 times of activated parameters, and even a tiny model like Qwen3-4B can rival the performance of Qwen2.5-72B-Instruct.

For more information, feel free to try them out in Qwen Chat Web (chat.qwen.ai) and APP and visit our GitHub, HF, ModelScope, etc.

Blog: qwenlm.github.io/blog/qwen3/
GitHub: github.com/QwenLM/Qwen3
Hugging Face: huggingface.co/collections/Qw…
ModelScope: modelscope.cn/collections/Qw…

The post-trained models, such as Qwen3-30B-A3B, along with their pre-trained counterparts (e.g., Qwen3-30B-A3B-Base), are now available on platforms like Hugging Face, ModelScope, and Kaggle. For deployment, we recommend using frameworks like SGLang and vLLM. For local usage, tools such as Ollama, LMStudio, MLX, llama.cpp, and KTransformers are highly recommended. These options ensure that users can easily integrate Qwen3 into their workflows, whether in research, development, or production environments.

Hope you enjoy our new models!

Qwen3 exhibits scalable and smooth performance improvements that are directly correlated with the computational reasoning budget allocated. This design enables users to configure task-specific budgets with greater ease, achieving a more optimal balance between cost efficiency and inference quality.

Qwen3 models are supporting 119 languages and dialects. This extensive multilingual capability opens up new possibilities for international applications, enabling users worldwide to benefit from the power of these models.

Read 5 tweets

Qwen

@Alibaba_Qwen

Jan 30

Announcing Qwen2.5-VL Cookbooks!

🧑‍🍳A collection of notebooks showcasing use cases of Qwen2.5-VL, include local model and API. Examples include Compute use, Spatial Understanding, Document Parsing, Mobile Agent, OCR, Universal Recognition, Video Understanding.

🔗Link:github.com/QwenLM/Qwen2.5…

💬 Qwen Chat: chat.qwenlm.ai (choose Qwen2.5-VL-72B-Instruct as the model)

⚙️ API: alibabacloud.com/help/en/model-…

1 Computer use

This notebook demonstrates how to use Qwen2.5-VL for computer use. It takes a screenshot of a user's desktop and a query, and then uses the model to interpret the user's query on the screenshot.

github.com/QwenLM/Qwen2.5…

2 Spatial Understanding

This notebook showcases Qwen2.5-VL's advanced spatial localization abilities, including accurate object detection and specific target grounding within images.See how it integrates visual and linguistic understanding to interpret complex scenes effectively.

github.com/QwenLM/Qwen2.5…

Read 8 tweets

Qwen

@Alibaba_Qwen

Jan 28

The burst of DeepSeek V3 has attracted attention from the whole AI community to large-scale MoE models. Concurrently, we have been building Qwen2.5-Max, a large MoE LLM pretrained on massive data and post-trained with curated SFT and RLHF recipes. It achieves competitive performance against the top-tier models, and outcompetes DeepSeek V3 in benchmarks like Arena Hard, LiveBench, LiveCodeBench, GPQA-Diamond.

📖 Blog: qwenlm.github.io/blog/qwen2.5-m…
💬 Qwen Chat: chat.qwenlm.ai (choose Qwen2.5-Max as the model)
⚙️ API: alibabacloud.com/help/en/model-… （check the code snippet in the blog）
💻 HF Demo: huggingface.co/spaces/Qwen/Qw…

In the future, we not only continue the scaling in pretraining, but also invest in the scaling in RL. We hope that Qwen is able to explore the unknown in the near future! 🔥

💗 Thank you for your support during the past year. See you next year!

Results of base language models. We are confident in the quality of our base models and we expect the next version of Qwen will be much better with our improved post-training methods.