More from @SemiAnalysis_

SemiAnalysis

@SemiAnalysis_

Jun 28

BREAKING NEWS: The Founder/CEO of LeptonAI has left only a year after LeptonAI’s acquisition. This is quite shocking, as Jensen reportedly spent $700M acquiring LeptonAI. What did he see? DGX Lepton flopped and got nowhere near the success Jensen expected. 1/7🧵

Initially, NVIDIA claimed that Lepton’s core software platform would be open-sourced by 2026. That has yet to happen. While we were skeptical, we wanted to believe that NVIDIA would open-source the core Lepton software platform, given that Lepton’s CEO is the co-creator of Caffe, ONNX, and PyTorch. 2/7🧵

One speculation for why Lepton’s CEO left is that Jensen ultimately changed his mind and did not approve open-sourcing Lepton. In acquisitions, standard practice is for vesting to happen over multiple years. 3/7🧵

Jun 27

One of the more uncomfortable observations in our AI Value Capture piece is internal: our token spend at SemiAnalysis now runs at roughly 30% of employee compensation, with employees pulling just under 5 billion tokens per month on average, over 5x more than Meta, and our top contributors clearing 100 billion. We wrote about it openly because every research firm, hedge fund, and law firm we know is heading toward a similar number, just on a delay. (1/4)🧵

The substitution math is the part to internalize. Tasks that used to need a junior analyst for several hours (converting a model to a dashboard, building chart packs from earnings, rebuilding a comp set) now resolve in minutes for a few dollars of tokens. The blended Opus 4.7 cost we observe is about $0.99 per million tokens against a $5/$25 sticker price, mostly because agentic workloads run 300:1 input-to-output ratios and cache hit rates above 90% pull the effective price down. That's a real change in the unit economics of professional services, not a 10% efficiency gain. (2/4)
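
The blended figure can be roughly reproduced with a few lines of arithmetic. This is a minimal sketch, not the authors' model: the $5/$25 sticker and the 300:1 ratio come from the thread, while the $0.50 cache-read price (10% of the input sticker) and the 91% hit rate are assumptions picked for illustration, and cache-write surcharges are ignored.

```python
def blended_price_per_million(input_price, output_price, cache_read_price,
                              input_to_output_ratio, cache_hit_rate):
    """Effective $ per million tokens, averaged over input and output tokens."""
    # Input tokens are a mix of cheap cache reads and full-price uncached reads.
    input_blend = (cache_hit_rate * cache_read_price
                   + (1 - cache_hit_rate) * input_price)
    # Weight input and output prices by their share of total tokens.
    total_tokens = input_to_output_ratio + 1
    return (input_to_output_ratio * input_blend + output_price) / total_tokens


# Sticker prices and ratio from the thread; cache price and hit rate are assumed.
price = blended_price_per_million(
    input_price=5.0,
    output_price=25.0,
    cache_read_price=0.50,      # assumption: 10% of the input sticker
    input_to_output_ratio=300,
    cache_hit_rate=0.91,        # assumption: "above 90%"
)
print(f"${price:.2f} per million tokens")
```

Under these assumptions the result lands within a few cents of the quoted $0.99, and it is quite sensitive to the hit rate: at exactly 90% the same formula gives about $1.03.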

The throughput math has gotten the most pushback in our reader notes, so it's worth being precise. On the same B300 running DeepSeek R1, baseline FP8 sits near 1,000 tokens/sec/GPU, adding wideEP plus disagg gets you to roughly 8,000, and layering MTP on top pushes it to about 14,000, a 14x gain from software alone. Factor in hardware too and the most optimized GB300 NVL72 hits about 17x the best H100 config in FP8, 32x in FP4. Once you accept that compression is real, model-lab gross margin expansion stops looking like a temporary pricing oddity and starts looking structural. (3/4)
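
To make the stacking explicit, the sketch below computes the step-by-step and cumulative gains from the three throughput figures quoted above. The stage labels and the helper name `stage_gains` are illustrative only.

```python
# Tokens/sec/GPU figures quoted in the thread for DeepSeek R1 on a B300.
stages = [
    ("FP8 baseline", 1_000),
    ("+ wideEP and disagg", 8_000),
    ("+ MTP", 14_000),
]


def stage_gains(stages):
    """Return (name, gain vs previous stage, cumulative gain vs baseline)."""
    baseline = stages[0][1]
    previous = baseline
    rows = []
    for name, throughput in stages:
        rows.append((name, throughput / previous, throughput / baseline))
        previous = throughput
    return rows


for name, step, total in stage_gains(stages):
    print(f"{name}: {step:.2f}x step, {total:.0f}x cumulative")
```

The 14x total is the product of an 8x step and a 1.75x step, so most of the software gain in these figures comes from wideEP plus disaggregation rather than from MTP.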

Jun 26

H100 Ornn index spot prices are falling, now at $2.42 per hour, roughly 40% below the May peak. The ecosystem is concerned that this is a sign that compute demand, and by extension the appetite for AI, is waning. (1/5)🧵

The important signal is that this is likely a spot price index, not term pricing. Our neocloud survey of 1-year H100 contract prices has instead climbed from a trough of roughly $1.70 per hour late last year to about $2.65 per hour today. (2/5)
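
A quick check of the two series, using only the figures quoted in this thread. The implied peak is back-solved from "roughly 40% below", so treat it as approximate.

```python
spot_now = 2.42          # $/GPU-hour, spot index today
spot_drop = 0.40         # "roughly 40% below the May peak"
contract_trough = 1.70   # $/GPU-hour, 1-year contract, late last year
contract_now = 2.65      # $/GPU-hour, 1-year contract, today

# Back out the spot peak implied by the quoted decline.
implied_spot_peak = spot_now / (1 - spot_drop)

# Percentage rise in contract pricing from trough to today.
contract_change = contract_now / contract_trough - 1

print(f"Implied spot peak: ${implied_spot_peak:.2f}/hr")
print(f"Contract price change: {contract_change:+.1%}")
```

On these numbers the spot index fell from roughly $4 per hour while 1-year contract pricing rose by more than half, which is the divergence the thread is pointing at.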

Spot and on-demand markets are where buyers run POCs, one-off evaluations, burst workloads, and capacity overflow. They can be useful as part of a broader dataset but do not reflect where production economics are set. Contract pricing is where sustained workloads show up: planned, recurring, revenue-bearing inference or training demand. (3/5)

Jun 22

AI demand is outstripping Moore's law in the short run
Moore's law drove import prices of computers and semiconductors down by 52% between 2001 and 2020. (1/4)🧵

AI demand has surged so high that import prices for computers and semiconductors rose 3.6% in May and are now up 14.4% year over year. This is so far from anything in the historical record that 'fastest ever' doesn't do justice to it. (2/4)

Import prices are hedonically adjusted (accounting for chip speed and capacity) so Moore's law means they normally fall over time. (3/4)
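
For scale, the 52% cumulative decline can be converted to an average annual rate and set against the current reading. This is a rough sketch that assumes the 2001 to 2020 span is 19 full years and that the decline compounded evenly, neither of which the thread states.

```python
total_decline = 0.52     # cumulative fall in import prices, 2001 to 2020
years = 2020 - 2001      # assumption: treat the span as 19 full years
current_yoy = 0.144      # latest year-over-year change quoted in the thread

# Constant annual rate that compounds to the cumulative decline.
annual_rate = (1 - total_decline) ** (1 / years) - 1

# Distance between today's reading and the historical trend, in points.
gap = current_yoy - annual_rate

print(f"Historical trend: {annual_rate:+.1%} per year")
print(f"Current reading:  {current_yoy:+.1%} per year")
print(f"Gap vs trend:     {gap * 100:.1f} percentage points")
```

Under those assumptions the historical trend is a fall of a little under 4% per year, so the current reading sits roughly 18 points above it.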

Jun 15

China is Mogging Western Auto, and that’s Bad for Semis, National Security & War

If you live anywhere outside the US, you've noticed it: the streets are filling up with cars you've never seen before. Chery? Jaecoo? Zeekr? Leapmotor? BYD? No, you didn't miss a decade of car launches. They're Chinese. And they're everywhere. (1/10)🧵

Israel is the perfect case study to understand what’s really going on: high car ownership, zero domestic production & no restrictions on auto imports from China. Here’s what the data shows:

China's share of Israel's auto import value: 2023: 23.7%; 2024: 29.1%; 2025: 36.6%; 2026 YTD: 40.2%. (2/10)

China is now Israel's #1 auto supplier: $6.4B of imports since Jan 2023 — nearly 2x South Korea, almost 3x Japan.

The kicker: Israel's total auto imports SHRANK ~18% in 2025. China's still grew. That's pure share capture, not market growth. (3/10)
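
The share-capture claim can be sanity-checked from the quoted shares and the ~18% market decline. A minimal sketch, assuming the 2024 and 2025 shares are full-year shares of import value and treating the approximate 18% figure as exact:

```python
def value_change(share_before, share_after, total_change):
    """Change in a supplier's import value, given its share of the total
    before and after, and the change in the total market."""
    return (1 + total_change) * share_after / share_before - 1


china_2024, china_2025 = 0.291, 0.366   # China's share of import value
market_change_2025 = -0.18              # total imports shrank ~18% in 2025

china = value_change(china_2024, china_2025, market_change_2025)
everyone_else = value_change(1 - china_2024, 1 - china_2025, market_change_2025)

print(f"China import value:      {china:+.1%}")
print(f"Non-China import value:  {everyone_else:+.1%}")
```

On these inputs China's import value rises by about 3% while everyone else's falls by roughly a quarter, consistent with growth that comes from share capture rather than from a growing market.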

Jun 10

What's the better business model for an AI lab, subscription or API? (1/4)🧵

Recently, we purchased one of each Anthropic/OpenAI subscription plan and randomly ran long horizon coding tasks until we exhausted the weekly limit. It's widely believed that a $200/month plan maxes out at ~$2000/month worth of tokens (assuming API pricing). However, we found that the subscriptions are actually far more generous. (2/4)

The margin on a subscription plan is a function of the average utilization. If we assume both companies have 75% API gross margins, this results in the following subscription margins. (3/4)
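
The relationship between utilization and subscription margin can be sketched as follows. This assumes serving cost is simply the API-priced value of the tokens used times (1 minus the API gross margin). The function names are hypothetical, and the usage levels are illustrative inputs, not the measured results from the thread.

```python
def subscription_gross_margin(plan_price, api_value_used, api_gross_margin=0.75):
    """Gross margin on a subscription, given the API-priced value of tokens
    a subscriber actually consumes in the billing period."""
    # If API pricing carries a 75% margin, serving cost is 25% of API value.
    serving_cost = api_value_used * (1 - api_gross_margin)
    return 1 - serving_cost / plan_price


def break_even_usage(plan_price, api_gross_margin=0.75):
    """API-priced usage at which the subscription margin hits zero."""
    return plan_price / (1 - api_gross_margin)


# Illustrative usage levels for a $200/month plan.
for usage in (200, 800, 2000):
    margin = subscription_gross_margin(200, usage)
    print(f"${usage} of API-priced tokens: {margin:+.0%} margin")

print(f"Break-even usage: ${break_even_usage(200):.0f}/month")
```

With a 75% API margin, a $200 plan breaks even at $800 of API-priced usage, and a subscriber who consumes the widely cited $2000 per month would put the plan at a -150% gross margin. Average utilization across all subscribers, not the maximum, is what sets the blended margin.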
