Latest Twitter Threads by @xuanmingzhangai on Thread Reader App

Jun 23 • 9 tweets • 4 min read

1/8 🧠 Think the deepest layer of an LLM is always the best for output? Think again! Our latest paper from the Qwen Team reveals the "Alignment Tax" hiding in your final layers. Post-training can violently perturb terminal tokens away from rigorous logic! 🧵 ↓

2/8 We uncover a persistent Guess-Refine-Perturb forward-pass dynamic. Intermediate layers rigorously refine core reasoning, but the absolute final layers often drag predictions back toward safe, generic common words. This creates a massive planning-pragmatics tradeoff.
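One way to picture the layer-by-layer dynamic described above is a "logit lens" style readout: project each layer's hidden state through the unembedding matrix and see which token it would predict. The sketch below is a toy illustration only, not the paper's method. The function name `layerwise_predictions`, the three-word vocabulary, and all the numbers are hypothetical, hand-picked to mimic a guess, refine, perturb pattern. It also skips the final layer norm that a real readout would normally apply.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def layerwise_predictions(hidden_states, unembed):
    """Project every layer's hidden state into vocabulary space.

    hidden_states: array of shape (num_layers, d_model)
    unembed:       array of shape (d_model, vocab_size)
    Returns the top-1 token id per layer and the full probabilities.
    """
    logits = hidden_states @ unembed
    probs = softmax(logits)
    return probs.argmax(axis=-1), probs

# Hypothetical toy setup: 3 "layers", 3-token vocabulary.
vocab = ["the", "42", "maybe"]
unembed = np.eye(3)  # identity unembedding keeps the example readable
hidden_states = np.array([
    [2.0, 0.0, 1.0],  # early layer: guesses a common word
    [0.0, 3.0, 0.0],  # intermediate layer: refines toward the answer
    [1.5, 1.0, 0.0],  # final layer: pulled back toward the common word
])

top_ids, probs = layerwise_predictions(hidden_states, unembed)
top_tokens = [vocab[i] for i in top_ids]

for layer, token in enumerate(top_tokens):
    print(f"layer {layer}: top token = {token!r}, "
          f"p('42') = {probs[layer, 1]:.3f}")
```

In this invented data the intermediate layer gives the answer token "42" the highest probability, while the final layer shifts mass back to "the". With a real model, the hidden states would come from the model's per-layer outputs rather than from hand-written arrays.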
