SHOCKING: Many neocloud executives we spoke with feel that NVIDIA will retaliate if they run non-NVIDIA networking gear in their clusters or if their cloud offers AMD GPUs or TPUs. They feel that retaliation includes withholding early allocation or no longer supporting a potential IPO or VC raise. (1/3)🧵
Note that this doesn't apply to hyperscalers, as they have more buying power. But for neoclouds, executives feel NVIDIA has been using high-pressure tactics to keep them NVIDIA-only. (2/3)
Some neocloud executives are even starting to consider offering TPUs or AMD GPUs quietly, to avoid pressure from NVIDIA and the retaliation they fear. (3/3)
One of the more uncomfortable observations in our AI Value Capture piece is internal: our token spend at SemiAnalysis now runs at roughly 30% of employee compensation, with employees pulling just under 5 billion tokens per month on average, over 5x more than Meta, and our top contributors clearing 100 billion. We wrote about it openly because every research firm, hedge fund, and law firm we know is heading toward a similar number, just on a delay. (1/4)🧵
The substitution math is the part to internalize. Tasks that used to need a junior analyst for several hours (converting a model to a dashboard, building chart packs from earnings, rebuilding a comp set) now resolve in minutes for a few dollars of tokens. The blended Opus 4.7 cost we observe is about $0.99 per million tokens against a $5/$25 sticker price, mostly because agentic workloads run 300:1 input-to-output ratios and cache hit rates above 90% pull the effective price down. That's a real change in the unit economics of professional services, not a 10% efficiency gain. (2/4)
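To make the blended-price arithmetic concrete, here is a minimal sketch. The $5/$25 sticker, the 300:1 ratio, and the 90%+ cache hit rate come from the thread; the assumption that cached input tokens bill at 0.1x the input price is ours, and cache-write surcharges are ignored.

```python
def blended_price_per_million(
    input_price: float,
    output_price: float,
    input_to_output_ratio: float,
    cache_hit_rate: float,
    cache_read_multiplier: float = 0.1,  # ASSUMPTION: cached input bills at 10% of sticker
) -> float:
    """Blended $ per million tokens across input and output.

    Simplification: cache-write surcharges are ignored.
    """
    cached_price = input_price * cache_read_multiplier
    # Average input price after mixing cached and uncached tokens.
    effective_input = (
        cache_hit_rate * cached_price + (1 - cache_hit_rate) * input_price
    )
    total_tokens = input_to_output_ratio + 1  # ratio parts of input per 1 part output
    return (input_to_output_ratio * effective_input + output_price) / total_tokens


for hit_rate in (0.0, 0.90, 0.91):
    price = blended_price_per_million(5.0, 25.0, 300, hit_rate)
    print(f"cache hit rate {hit_rate:.0%}: ${price:.2f} per million tokens")
```

Under these assumptions, a hit rate just above 90% lands close to the roughly $1 per million tokens quoted above, while the same workload with no caching stays near the $5 input sticker.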
The throughput math has gotten the most pushback in our reader notes, so it's worth being precise. On the same B300 running DeepSeek R1, baseline FP8 sits near 1,000 tokens/sec/GPU; adding wideEP plus disagg gets you to roughly 8,000; and layering MTP on top pushes it to about 14,000, a 14x gain from software alone. Factor in hardware too and the most optimized GB300 NVL72 hits about 17x the best H100 config in FP8, 32x in FP4. Once you accept that compression is real, model-lab gross margin expansion stops looking like a temporary pricing oddity and starts looking structural. (3/4)
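The software-only steps can be laid out as a quick calculation. This is only a sketch using the approximate throughput figures quoted above; the cost-per-token column assumes the GPU-hour price is held fixed, which is our simplification.

```python
# Approximate throughput figures quoted above for DeepSeek R1 on a B300 (tokens/sec/GPU).
STAGES = [
    ("baseline FP8", 1_000),
    ("+ wideEP + disagg", 8_000),
    ("+ MTP", 14_000),
]


def throughput_gains(stages):
    """Return (name, tokens_per_sec, gain_vs_previous, gain_vs_baseline) rows."""
    baseline = stages[0][1]
    previous = baseline
    rows = []
    for name, tps in stages:
        rows.append((name, tps, tps / previous, tps / baseline))
        previous = tps
    return rows


def relative_cost_per_token(gain_vs_baseline):
    """Cost per token relative to baseline, ASSUMING a fixed GPU-hour price."""
    return 1.0 / gain_vs_baseline


for name, tps, step, total in throughput_gains(STAGES):
    print(
        f"{name:<20} {tps:>6,} tok/s/GPU  step {step:.2f}x  total {total:.1f}x  "
        f"cost/token {relative_cost_per_token(total):.1%} of baseline"
    )
```

The step column shows that most of the gain comes from wideEP plus disagg (8x), with MTP adding a further 1.75x on top.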
H100 spot prices on the Ornn index are falling, now at $2.42 per hour, roughly 40% below the May peak. The ecosystem is concerned that this is a sign that compute demand, and by extension the appetite for AI, is waning. (1/5)🧵
The important signal is that this is a spot price index, not term pricing. In our neocloud survey, 1-year H100 contract prices have instead climbed from a trough of roughly $1.70 per hour late last year to about $2.65 per hour today. (2/5)
Spot and on-demand markets are where buyers run POCs, one-off evaluations, burst workloads, and capacity overflow. They can be useful when taken as part of a dataset but are not reflective of where production economics are set. Contract pricing is where sustained workloads show up: planned, recurring, revenue-bearing inference or training demand. (3/5)
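For readers who want the spot-versus-contract comparison as numbers, here is a rough sketch using only the figures quoted above. The implied May peak is back-solved from "roughly 40%", so treat it as approximate rather than a reported data point.

```python
SPOT_NOW = 2.42                # $/GPU-hour, spot index figure from the thread
SPOT_DECLINE_FROM_PEAK = 0.40  # "roughly 40%" below the May peak
CONTRACT_TROUGH = 1.70         # $/GPU-hour, 1-year contract trough late last year
CONTRACT_NOW = 2.65            # $/GPU-hour, 1-year contract price today


def implied_peak(current: float, decline: float) -> float:
    """Back out the earlier peak from the current price and fractional decline."""
    return current / (1 - decline)


def pct_change(old: float, new: float) -> float:
    """Fractional change from old to new."""
    return (new - old) / old


spot_peak = implied_peak(SPOT_NOW, SPOT_DECLINE_FROM_PEAK)
contract_move = pct_change(CONTRACT_TROUGH, CONTRACT_NOW)
contract_vs_spot = pct_change(SPOT_NOW, CONTRACT_NOW)

print(f"implied spot peak:       ${spot_peak:.2f}/hr (approximate)")
print(f"contract trough to now:  {contract_move:+.1%}")
print(f"contract vs spot today:  {contract_vs_spot:+.1%}")
```

The two series move in opposite directions: spot is down about 40% from its peak while 1-year contracts are up by more than half from their trough.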
AI demand is outstripping Moore's law in the short run
Moore's law drove import prices of computers and semiconductors down by 52% between 2001 and 2020. (1/4)🧵
AI demand has surged so high that import prices for computers and semiconductors rose 3.6% in May, now up 14.4% year-to-year. This is so far from anything in the historical record that 'fastest ever' doesn't do justice to it. (2/4)
Import prices are hedonically adjusted (accounting for chip speed and capacity) so Moore's law means they normally fall over time. (3/4)
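To put the two numbers on the same footing, the 2001 to 2020 decline can be converted to an average annual rate. A minimal sketch, assuming the 52% drop compounds evenly over 19 years (our simplification; the actual path was not smooth):

```python
def annualized_rate(total_change: float, years: int) -> float:
    """Constant annual rate that compounds to total_change over the given years."""
    return (1 + total_change) ** (1 / years) - 1


# ASSUMPTION: the 52% decline is spread evenly across 2001-2020 (19 years).
historical_trend = annualized_rate(-0.52, 2020 - 2001)
current_yoy = 0.144  # year-over-year figure quoted in the thread

print(f"historical trend: {historical_trend:+.1%} per year")
print(f"current:          {current_yoy:+.1%} per year")
print(f"gap vs trend:     {current_yoy - historical_trend:.1%} points")
```

On that basis, the historical trend works out to a decline of a little under 4% per year, which is the baseline the 14.4% increase should be read against.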
China is Mogging Western Auto, and that’s Bad for Semis, National Security & War
If you live anywhere outside the US, you've noticed it: the streets are filling up with cars you've never seen before. Chery? Jaecoo? Zeekr? Leapmotor? BYD? No, you didn't miss a decade of car launches. They're Chinese. And they're everywhere. (1/10)🧵
Israel is the perfect case study to understand what’s really going on: high car ownership, zero domestic production & no restrictions on auto imports from China. Here’s what the data shows:
China's share of Israel's auto import value:
• 2023: 23.7%
• 2024: 29.1%
• 2025: 36.6%
• 2026 YTD: 40.2%
(2/10)
China is now Israel's #1 auto supplier: $6.4B of imports since Jan 2023 — nearly 2x South Korea, almost 3x Japan.
The kicker: Israel's total auto imports SHRANK ~18% in 2025. China's still grew. That's pure share capture, not market growth. (3/10)
What's the better business model for an AI lab, subscription or API? (1/4)🧵
Recently, we purchased one of each Anthropic/OpenAI subscription plan and randomly ran long-horizon coding tasks until we exhausted the weekly limit. It's widely believed that a $200/month plan maxes out at ~$2000/month worth of tokens (assuming API pricing). However, we found that the subscriptions are actually far more generous. (2/4)
The margin on a subscription plan is a function of the average utilization. If we assume both companies have 75% API gross margins, this results in the following subscription margins. (3/4)
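The relationship between utilization and subscription margin can be sketched as follows. The $200 price and 75% API gross margin are from the thread; the $2000 token ceiling is the "widely believed" figure used here purely as a placeholder, since the thread says the real limits are more generous.

```python
def subscription_gross_margin(
    utilization: float,
    plan_price: float = 200.0,        # $/month, from the thread
    max_token_value: float = 2000.0,  # PLACEHOLDER: widely believed ceiling at API prices
    api_gross_margin: float = 0.75,   # assumption stated in the thread
) -> float:
    """Gross margin on a plan when the average user consumes `utilization` of the cap."""
    # Serving cost is the API-priced token value scaled by (1 - API gross margin).
    serving_cost = utilization * max_token_value * (1 - api_gross_margin)
    return 1 - serving_cost / plan_price


def breakeven_utilization(
    plan_price: float = 200.0,
    max_token_value: float = 2000.0,
    api_gross_margin: float = 0.75,
) -> float:
    """Average utilization at which the plan's gross margin hits zero."""
    return plan_price / (max_token_value * (1 - api_gross_margin))


for u in (0.10, 0.25, 0.40, 1.00):
    print(f"utilization {u:.0%}: margin {subscription_gross_margin(u):+.0%}")
print(f"break-even utilization: {breakeven_utilization():.0%}")
```

With these placeholder inputs the plan breaks even at 40% average utilization; a more generous token ceiling pushes that break-even point lower.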
OSATs are usually seen as “boring” semiconductor companies. But we’ve been, and remain, bullish on Amkor ($AMKR) and ASE ($3711.TW). Why?
Because both sides of the OSAT model, Assembly & Test, are starting to shift in a meaningful way. (1/10) 🧵
Let’s focus on the first for now: Assembly.
Historically, packaging = low-margin wire bonding. Not exciting.
ASE once made up ~40% of $KLIC’s wire bonder business. After the COVID boom, capacity flooded the market and growth stalled. (2/10)
But something is changing.
KLIC is now seeing:
• 90%+ utilization in China
And guiding to:
• H2’26 China growth +15–20% vs H1
At the Chipbook, we have been tracking wire bonder imports into China, which were up 108% YoY in March. (3/10)