Koder
AI/SRE/DevOps Engineer • Open-source LLMs • Benchmarks • Tips • Hot takes on AI 🇸🇪 Göteborg 🇪🇺 DM open for collabs/projects.

Feb 14, 17 tweets

1/
GPT-5.2 vs GLM-5: the #AI war nobody was ready for.
A Tsinghua spinoff trained a 744B-param open-source nuke on banned chips & just beat OpenAI on agentic benchmarks.
Full breakdown 🧵 bit.ly/glm-5

2/
#GPT52 launched Dec 11 after an internal "Code Red": Gemini 3 Pro's debut panicked OpenAI into shipping three weeks early.
GLM-5 launched Feb 11 in full stealth.
Two very different births. Two very different models. bit.ly/glm-5

3/
Before launch, #GLM5 ran a mystery model called "Pony Alpha" on OpenRouter.
It crushed coding benchmarks anonymously: real users, real data, zero hype distortion.
Only THEN did Z.ai reveal it was them. Brilliant playbook. bit.ly/glm-5

4/
Z.ai became the world's FIRST publicly traded foundation model company (HK IPO, Jan 8).
$558M raised. $7.1B valuation.
OpenAI & Anthropic are still private.
That's not a footnote. That's a governance story. #AIInvesting bit.ly/glm-5

5/
Architecture breakdown 👇
GPT-5.2: dense transformer, 3 modes
- Instant (speed)
- Thinking (chain-of-thought)
- Pro (max compute)
#GLM5: 744B MoE, only 40B active per token. Sparse. Efficient. Enormous. bit.ly/glm-5

6/
The #MixtureOfExperts math on GLM-5 is wild:
GLM-4.5: 355B total / 32B active
GLM-5: 744B total / 40B active
Doubled capacity. Added only 8B active params per inference step.
Frontier-scale knowledge. Near-mid-scale compute cost. bit.ly/glm-5
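The sparsity math above is easy to check yourself. A minimal sketch, using only the parameter counts quoted in this thread (the helper function and variable names are mine, purely illustrative):

```python
# Back-of-the-envelope MoE sparsity math using the figures quoted above.
# Parameter counts (in billions) are as reported in the thread.

def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of parameters touched per token in a sparse MoE model."""
    return active_b / total_b

glm_45 = active_fraction(355, 32)   # GLM-4.5: 355B total, 32B active
glm_5 = active_fraction(744, 40)    # GLM-5:   744B total, 40B active

print(f"GLM-4.5 active fraction: {glm_45:.1%}")  # ~9.0%
print(f"GLM-5 active fraction:   {glm_5:.1%}")   # ~5.4%
print(f"Capacity growth: {744/355:.2f}x, active-compute growth: {40/32:.2f}x")
```

Total capacity grows ~2.1x while per-token compute grows only 1.25x, which is exactly the "frontier knowledge at near-mid-scale cost" trade the thread describes.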

7/
The geopolitical bombshell buried in the architecture:
#GLM5 was trained ENTIRELY on Huawei Ascend chips.
Zero NVIDIA. Zero H100. Zero H200.
Zhipu is on the US Entity List. They shipped a frontier model anyway.
That's not catching up. That's arrival. bit.ly/glm-5

8/
SLIME: GLM-5's secret weapon.
In standard RL training, trajectory generation eats 90%+ of total time.
SLIME decouples trajectory generation from policy updates and runs them asynchronously.
Result: up to 3x higher #RL training throughput. This is how you out-train. bit.ly/glm-5
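The decoupling idea can be sketched in a few lines. This is NOT SLIME's real code, just a toy producer/consumer version of the pattern the tweet describes: rollout generation keeps filling a bounded buffer while the learner drains it concurrently, so neither side blocks the other.

```python
# Toy sketch of decoupled rollout generation vs. policy updates.
# All names here are illustrative; SLIME's actual implementation differs.
import queue
import threading

TRAJECTORIES = 8
buffer: queue.Queue = queue.Queue(maxsize=4)  # bounded trajectory buffer
consumed = []

def generator():
    # Stand-in for slow environment rollouts (the 90%+ cost center).
    for step in range(TRAJECTORIES):
        buffer.put({"step": step, "reward": step * 0.1})
    buffer.put(None)  # sentinel: generation finished

def learner():
    # Stand-in for policy updates; runs concurrently with generation.
    while True:
        traj = buffer.get()
        if traj is None:
            break
        consumed.append(traj["step"])

gen = threading.Thread(target=generator)
learn = threading.Thread(target=learner)
gen.start(); learn.start()
gen.join(); learn.join()

print(f"learner processed {len(consumed)} trajectories")  # 8
```

The bounded queue is the key design choice: it lets generation run ahead of training without unbounded staleness, which is where the claimed throughput win comes from.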

9/
Benchmark results, the ones that matter:
BrowseComp (web research): GLM-5 75.9 vs GPT-5.2 65.8 ✅
Terminal-Bench 2.0 (agentic CLI): GLM-5 56.2 vs GPT-5.2 54.0 ✅
Humanity's Last Exam: GLM-5 50.4 vs GPT-5.2 45.5 ✅
#AIBenchmarks

🔬 bit.ly/glm-5

10/
Where GPT-5.2 wins:
GPQA-Diamond (PhD science): 92.4% vs 86.0% ✅
SWE-bench Verified (coding): 80.0% vs 77.8% ✅
GDPval (professional knowledge): SOTA, beats human experts on 70.9% of comparisons ✅
#LLM #GPT5

bit.ly/glm-5

11/
Vending Bench 2 β€” run a simulated business for a full year:
GLM-5: $4,432 profit 🏆
GPT-5.2: $3,591
Best other open-source: $2,376
An open-source model, on domestic Chinese hardware, running a better fake business than OpenAI.
We're here. bit.ly/glm-5

12/
GLM-5 also hit a RECORD LOW hallucination rate on AA-Omniscience v4.
Score of -1, a 35-point improvement over GLM-4.5.
Better than GPT-5.2. Better than Gemini 3 Pro.
"Knowing what you don't know" is now #GLM5's biggest enterprise edge. bit.ly/glm-5

13/
The pricing chasm 👇
GPT-5.2 Pro API: $21 input / $168 output per 1M tokens
GLM-5 API: ~$1 input / ~$3.20 output
That's roughly 20x cheaper on input and 50x on output.
Or self-host under MIT license for compute cost only.
#OpenSource #AI bit.ly/glm-5
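The gap is stark on any realistic workload. A quick sketch using only the per-million-token prices quoted above (the workload size is an assumption I picked for illustration):

```python
# Cost comparison at the price points quoted above (USD per 1M tokens).
# The 10M-in / 2M-out workload is an illustrative assumption.

def job_cost(in_tokens_m: float, out_tokens_m: float,
             in_price: float, out_price: float) -> float:
    """Total API cost for a job, given token volumes in millions."""
    return in_tokens_m * in_price + out_tokens_m * out_price

gpt52 = job_cost(10, 2, 21.0, 168.0)  # $546.00
glm5 = job_cost(10, 2, 1.0, 3.20)     # $16.40

print(f"GPT-5.2 Pro: ${gpt52:.2f}  GLM-5: ${glm5:.2f}  ratio: {gpt52/glm5:.0f}x")
```

On this mixed workload the blended ratio lands around 33x, and that's before counting the self-host option.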

14/
The philosophical divide:
OpenAI: "AI as the world's best professional ASSISTANT"
Z.ai: "AI as an agentic ENGINEER that executes projects"
Not just different marketing. Different training objectives.
#AIStrategy

bit.ly/glm-5

15/
The safety flag nobody's talking about:
Lukas Petersson (Andon Labs):
"Incredibly effective, but far less situationally aware. Achieves goals via aggressive tactics. This is scary."
High capability + low situational awareness = deployment risk. bit.ly/glm-5

16/
So which should YOU use?
GPT-5.2 → deep reasoning, expert knowledge, safety maturity
#GLM5 → agentic workflows, long-horizon tasks, cost efficiency, open weights
The smart answer: build a ROUTER. Right model for the right task.
bit.ly/glm-5
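A router can start as nothing more than a lookup table over task categories. A minimal sketch of the idea from this tweet; the model names and task categories are illustrative assumptions, not a real API:

```python
# Hedged sketch of the "right model for the right task" router.
# ROUTES keys and model identifiers are illustrative, not a real API.

ROUTES = {
    "deep_reasoning": "gpt-5.2-pro",       # expert knowledge, hard science
    "expert_qa": "gpt-5.2-thinking",       # chain-of-thought mode
    "agentic_workflow": "glm-5",           # long-horizon tool use
    "bulk_generation": "glm-5",            # cost-sensitive volume work
}

def route(task_type: str) -> str:
    """Pick a model for a task; default to the cheap open-weights model."""
    return ROUTES.get(task_type, "glm-5")

print(route("deep_reasoning"))    # gpt-5.2-pro
print(route("agentic_workflow"))  # glm-5
```

In practice you'd route on a classifier or cost budget rather than a static dict, but the shape is the same: one dispatch point, per-task model choice.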

17/
The real headline isn't the benchmarks.
A sanctioned lab, on banned hardware, open-sourced a 744B frontier model at $1/M tokens the same week they went public.
The frontier is no longer an American lake.

Full breakdown:

#AI #LLM #OpenSource bit.ly/glm-5
