SemiAnalysis
Feb 26 9 tweets 4 min read
Everybody knows Alexander Fleming's accidental discovery of penicillin. A forgotten petri dish, a surprising mold contamination, and suddenly – life-saving modern medicine.

Well, the semiconductor industry has its own Alexander Fleming and its own "penicillin moment", and it came from a brief mix-up between an inkwell and a crucible of molten metal.

Meet Jan Czochralski. (1/8)🧵

It's 1916. Czochralski, a Polish chemist, is focused on understanding how metals crystallize. Labs back then were often cluttered, intense places with random objects and inventions scattered around.

One day, while writing notes in his Berlin lab, he goes to dip his pen in an inkwell and misses...

Instead of ink, his pen plunges into a crucible of molten tin sitting nearby. He slowly pulls it out and something bizarre happens.
A thin, shimmering filament of solidified metal emerges, perfectly straight and uniform. (2/8)
Feb 25 10 tweets 3 min read
Micron’s $100B megafab in NY is at risk of delay due to just 6 “concerned citizens” and their frivolous lawsuit. (1/10) 🧵

The project has already taken an absurd 1200 days between announcement and groundbreaking. Competitors overseas who began at the same time now have fabs built and running. (2/10)
Feb 22 4 tweets 2 min read
Occasionally we hear pitches for demand response for AI load, in which AI compute clusters reduce their workloads during "peak times" for the electric grid, reducing strain and requiring fewer new generators.

These programs are not new: typically they're tied to an electric bill credit. But to an AI cloud, it's...not worth the money. (1/4)🧵

Let's use rule-of-thumb figures: AI clusters make $12M/MW annually in revenue, and have an EBIT margin of around 20%.

At those numbers, a utility needs to incentivize the cluster at $0.35/kWh - 5-10x the price of electricity for a datacenter - for demand response to be worth the loss in revenue.

In capacity market terms, that's more than $6,500/MW-day, or 20x the price cap for PJM's most recent capacity auction. (2/4)
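
The capacity-market figure above can be roughly reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the incentive only has to replace forgone EBIT and that EBIT accrues evenly over every hour of a 365-day year; both assumptions are ours, not the thread's. The flat per-kWh result from this simple version comes out below the $0.35/kWh quoted above, so that figure presumably folds in assumptions that the thread does not spell out.

```python
# Rule-of-thumb inputs quoted in the thread.
REVENUE_PER_MW_YEAR = 12_000_000  # USD of revenue per MW per year
EBIT_MARGIN = 0.20                # share of revenue kept as EBIT

# Assumptions (not from the thread): EBIT is earned evenly over the year,
# and the incentive only needs to cover the EBIT lost while curtailed.
DAYS_PER_YEAR = 365
HOURS_PER_DAY = 24


def forgone_ebit_per_mw_day(revenue_per_mw_year: float, ebit_margin: float) -> float:
    """EBIT given up by curtailing 1 MW for one full day, in USD."""
    return revenue_per_mw_year * ebit_margin / DAYS_PER_YEAR


def forgone_ebit_per_kwh(revenue_per_mw_year: float, ebit_margin: float) -> float:
    """Same quantity expressed per kWh of curtailed energy, in USD."""
    per_mwh = forgone_ebit_per_mw_day(revenue_per_mw_year, ebit_margin) / HOURS_PER_DAY
    return per_mwh / 1000  # 1 MWh = 1000 kWh


if __name__ == "__main__":
    per_day = forgone_ebit_per_mw_day(REVENUE_PER_MW_YEAR, EBIT_MARGIN)
    per_kwh = forgone_ebit_per_kwh(REVENUE_PER_MW_YEAR, EBIT_MARGIN)
    print(f"Break-even incentive: ${per_day:,.0f}/MW-day (${per_kwh:.3f}/kWh)")
```

Swapping in a different revenue or margin assumption shows how sensitive the break-even incentive is to the cluster's economics.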
Feb 10 9 tweets 5 min read
Agent swarms are moving from prompting tricks to math.

Kimi K2.5 trains an orchestrator that spawns + schedules specialist sub-agents in parallel, and reports 3×–4.5× lower wall-clock on WideSearch, plus higher scores.

Anthropic also recently released agent teams, where multiple Claude Code instances work together. It is still experimental but has been used to write Claude's C compiler. (1/9) 🧵

We ran a small benchmark with WideSearch on Claude Teams: 30 unique English tasks, 2 trials each, 30-min timeout, GPT-5.2 judge. We compare it to an Opus 4.6 baseline.

Teams were more expensive and slower on average: Total $93 for baseline vs Total $131 for teams.

Completed runs were 46/60 for the baseline vs 47/60 for teams; the rest hit our 30-min cutoff (WideSearch tasks can run for 3+ hours with 100+ sources) (2/9)
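
To put the two configurations on the same footing, the totals above can be turned into a cost ratio and a cost per completed run. This is a minimal sketch, assuming each quoted total covers all 60 attempted runs (30 tasks × 2 trials); the `RunSummary` class and its field names are our own illustration, not part of the benchmark harness.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Aggregate results for one configuration (hypothetical helper)."""
    name: str
    total_cost_usd: float  # assumed to cover every attempted run
    completed: int
    attempted: int

    @property
    def completion_rate(self) -> float:
        return self.completed / self.attempted

    @property
    def cost_per_completed_run(self) -> float:
        return self.total_cost_usd / self.completed


# Figures quoted in the thread.
baseline = RunSummary("Opus 4.6 baseline", 93.0, 46, 60)
teams = RunSummary("Claude Teams", 131.0, 47, 60)

# How much more the teams configuration cost in total.
cost_ratio = teams.total_cost_usd / baseline.total_cost_usd

if __name__ == "__main__":
    for run in (baseline, teams):
        print(f"{run.name}: {run.completion_rate:.1%} completed, "
              f"${run.cost_per_completed_run:.2f} per completed run")
    print(f"Teams cost {cost_ratio:.2f}x the baseline in total")
```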
Jan 31 6 tweets 3 min read
Backend networking architecture is instrumental to networking ownership cost analysis. How does it affect ownership costs? In general, we observe the following trends:

🟠Hyperscalers typically get preferential terms on networking equipment compared to neoclouds;

🟠Infiniband-based networks are more expensive than Ethernet-based networks for the same networking type and cluster size;

🟠3-Layer networks would be more expensive than 2-Layer networks on a per rack server (1 rack server = 72 GPUs in the case of GB200 and GB300) basis; and

🟠All else constant, GB300 networks running on CX-8 NICs are expected to be more expensive than GB200 networks running on CX-7 NICs on a per rack server basis because of the doubling of per GPU bandwidth.
(1/6)🧵

Before we dive into several examples of how architecture influences networking costs, we define the following terminology used to describe networking clusters:

🟠Network layers: Number of switch layers required to connect all GPUs within the same cluster, typically 2 or 3 layers

🟠Rails: Number of pathways a server tray can be split into, which equals the number of GPUs per server tray (4 in the case of GB200 or GB300 deployments)

🟠Planes: Number of pathways a NIC can be split into

🟠Attach Rate: GPUs per networking component (or the reverse)
(2/6)
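
To make the "network layers" term concrete, here is a minimal sketch of how cluster size drives the layer count. It assumes a non-blocking fat-tree built from identical switches, one network endpoint per GPU, and a hypothetical 64-port switch; it deliberately ignores rails and planes, which change how far a given layer count can stretch. None of these parameters come from the thread except the 72 GPUs per rack.

```python
GPUS_PER_RACK = 72  # GB200 / GB300 rack, as stated in the thread


def max_endpoints(radix: int, layers: int) -> int:
    """Largest endpoint count of a non-blocking fat-tree with `layers` switch layers.

    The top layer uses every port facing down; each lower layer splits its
    ports half down, half up. This gives radix**2 / 2 for 2 layers and
    radix**3 / 4 for 3 layers.
    """
    if radix < 2 or radix % 2 or layers < 1:
        raise ValueError("radix must be even and >= 2, layers must be >= 1")
    return radix * (radix // 2) ** (layers - 1)


def layers_needed(endpoints: int, radix: int, max_layers: int = 4) -> int:
    """Smallest number of switch layers that can connect `endpoints`."""
    for layers in range(1, max_layers + 1):
        if max_endpoints(radix, layers) >= endpoints:
            return layers
    raise ValueError("cluster too large for the given radix and layer limit")


if __name__ == "__main__":
    radix = 64  # hypothetical switch port count
    for racks in (16, 64):
        gpus = racks * GPUS_PER_RACK
        print(f"{racks} racks ({gpus} GPUs): {layers_needed(gpus, radix)} layers")
```

Under these assumptions, the jump from 2 to 3 layers adds an entire tier of switches and optics, which is one way to see why 3-layer networks cost more per rack.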
Jan 29 9 tweets 5 min read
Ever wonder how an Nvidia GB200 NVL72 gets from the factory to the data center? You can’t just throw a state-of-the-art 3,000 lb rack (compared to ~500 lb for a CPU-based server) on a standard FedEx truck. Moving AI infrastructure is more like transporting a heart for transplant than shipping electronics. (1/9) 🧵

A fully loaded AI rack can weigh up to 3,700 lbs. That’s the weight of a Ford Explorer concentrated into a 2x4 foot footprint. Standard pallets would crush instantly. These AI racks require custom-engineered reinforced bases with shock-absorbent foam just to keep them from tearing through the floor. (2/9)
Jan 28 4 tweets 3 min read
When people think about semiconductor manufacturing, many picture highly automated robotic arms and overhead transport systems. However, the true unsung heroes behind high‑volume chip production are metrology and inspection. If you cannot see it, you cannot manufacture it; and if you cannot measure it, you cannot achieve yield. (1/4) 🧵

Although these two terms are often discussed together, they actually refer to two distinct domains.

Metrology:
The core of metrology is accuracy and consistency.
It involves ultra‑precise measurements of critical dimensions, film thickness, surface topology, and overlay alignment in advanced, highly scaled manufacturing. Metrology ensures that interconnects and billions of transistors conform precisely to their intended physical and design specifications.

Inspection:
The core of inspection is defect capture and yield improvement.
It relies on optical or e‑beam scanning to identify particles, scratches, pattern defects, or electrical anomalies on the wafer surface. Inspection determines whether a fab can catch defective wafers at the earliest possible stage—before hundreds of thousands of dollars of processed silicon turn into scrap. (2/4)
Why have metrology and inspection become so critical in advanced nodes?
Jan 22 6 tweets 2 min read
The U.S. wants 40% of chips made onshore. But quietly, the equipment that makes those chips is moving offshore fast. This gap matters just as much as the chips themselves. (1/6) 🧵

In 2022 we flagged Lam Research expanding in Malaysia. Today, most of its high-volume production is there, not the U.S. That trend has only accelerated. (2/6) newsletter.semianalysis.com/p/lam-research…
Jan 3 5 tweets 3 min read
Rolling into the new year, 2 of the Six Tigers quietly filed their IPO prospectuses and will start trading in early January if all goes well. We finally get a glimpse into audited financials of foundation model labs. TLDR: Building Machine God Ain't Cheap. (1/5)🧵

MiniMax (0100 HK) and Z.ai, aka Knowledge Atlas fka ZhiPu (2513 HK), both give a glimpse into the economics of an AI Lab, demonstrating strong product momentum as well as a flagrant disregard for profitability. (2/5) 🔥📉
Dec 23, 2025 4 tweets 1 min read
If you want to power a datacenter off the grid, a gas turbine is the "obvious" choice. But it might not be the best option! Many developers select reciprocating engines for a reason. (1/4)🧵

Recips are more modular than turbines, happier at partial loads, and more straightforward to maintain. You're mostly changing lubricants, whereas a turbine requires no maintenance...until it needs a massive overhaul. (2/4)
Dec 4, 2025 8 tweets 2 min read
Massive IT load growth. A transforming electric grid. Five-year lead times for turbines. Why not build more of them?

Well, GE and Siemens have seen this story before. (1/8)🧵

Back in the '90s, parts of the American electric grid were "deregulating." These reforms gave us commodity markets for electricity--aka ISOs and RTOs. INDEPENDENT POWER PRODUCERS (IPPs), often utilities from other states, could build and run their own power plants and make money on these new electricity markets. Their generator of choice? The COMBINED CYCLE GAS PLANT (CCGT), particularly the then-new F-CLASS. (2/8)
Dec 3, 2025 7 tweets 4 min read
A semiconductor is a material whose electrical conductivity lies between that of a conductor and an insulator. To achieve this property, doping is applied to a silicon wafer to adjust its electrical characteristics. (1/7)🧵

Before the 1970s, doping was performed through thermal diffusion in high-temperature furnaces.
Process steps:
⚆ Pre-deposition: An oxide-based dopant film is deposited on the wafer surface.
⚆ Oxidation: The dopant oxide is driven into the growing silicon dioxide layer.
⚆ Doped region formation: The doped area forms and reaches the desired concentration and depth.
⚆ Wet etching: The oxide layer is removed using a wet etching process. (2/7)
Nov 14, 2025 4 tweets 2 min read
The economics of AI has been a big question mark in many investors' minds - What does the value chain look like? How do you model out the ROIC of AI? What would the ROIC look like?

We built up an end-to-end economics stack to answer this question - how we go from a chip’s silicon cost, through full system integration, all the way down to the dollar cost per million inference tokens. (1/4)🧵

At the top of the stack, our accelerator analysis starts with the semiconductor bill of materials (transistors, packaging, HBM, and yield assumptions) to determine GPU provider content. From there, our BoM and ODM modeling breaks down every component inside the server. The network topology model then maps how these servers interconnect. (2/4)
Nov 4, 2025 7 tweets 3 min read
Qualcomm and MediaTek are in a race to reduce their dependency on the mature smartphone market. Both are still managing to beat unit growth in smartphones. But that won't last long. Investors are looking for their progress in non-smartphones. Qualcomm's non-smartphone chip business hit a $10B+ annual run-rate, contrasting with MediaTek's $8B+. (1/7) 🧵

Both have increased their investments to capture more revenue in consumer, networking, industrial and computing markets. Non-smartphones account for 30% of Qualcomm's semiconductor revenue and 48% of MediaTek's. Qualcomm has a target of $22B non-smartphone chip revenue by FY29 at a 5-year CAGR of 21%. Qualcomm built a strong moat in autos but made mixed progress in IoT (a collection of end markets including PC, consumer, networking and infrastructure). (2/7)
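
The FY29 target can be sanity-checked with the standard compound-growth formula. This is a minimal sketch that backs out the starting revenue implied by a $22B target at a 21% CAGR over 5 years; the thread does not state the base-year figure, so the result is only what the arithmetic implies, not a reported number.

```python
def implied_base(target: float, cagr: float, years: int) -> float:
    """Starting value that grows to `target` after `years` at rate `cagr`."""
    return target / (1 + cagr) ** years


def project(base: float, cagr: float, years: int) -> float:
    """Value of `base` after compounding at `cagr` for `years` years."""
    return base * (1 + cagr) ** years


# Figures quoted in the thread.
TARGET_FY29_USD_B = 22.0
CAGR = 0.21
YEARS = 5

if __name__ == "__main__":
    base = implied_base(TARGET_FY29_USD_B, CAGR, YEARS)
    print(f"Implied starting revenue: ${base:.2f}B")
    for year in range(1, YEARS + 1):
        print(f"  year {year}: ${project(base, CAGR, year):.2f}B")
```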
Oct 30, 2025 6 tweets 2 min read
AI workloads are characterized by elephant flows when all of the GPUs in a cluster exchange data through collective communication operations to synchronize data for distributed workloads. These flows can often lead to congestion and load balancing issues. (1/6)🧵

To solve this problem, Meta turned to the use of Disaggregated Scheduled Fabrics (DSFs). Being “Scheduled” means that a credit-based system is used to control flows and prevent congestion: before a node can send packets across the network, it must first send a credit request towards the receiving node to make sure that the receiving end has enough buffer to receive the packet. These packets also travel over a fabric that cellifies them, breaking each packet into smaller cells and spreading the cells across multiple routes in the fabric. (2/6)
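
The credit-request and cell-spraying steps described above can be sketched as a toy model. This is an illustration only, not Meta's implementation: the `Receiver` class, the 256-byte cell size, and the round-robin spraying policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field

CELL_BYTES = 256  # hypothetical cell size


def cellify(packet: bytes, cell_bytes: int = CELL_BYTES) -> list[bytes]:
    """Break a packet into fixed-size cells (the last one may be shorter)."""
    return [packet[i:i + cell_bytes] for i in range(0, len(packet), cell_bytes)]


@dataclass
class Receiver:
    buffer_cells: int                 # total buffer space, measured in cells
    reserved: int = 0                 # cells already promised to senders
    received: dict[int, bytes] = field(default_factory=dict)

    def request_credit(self, cells: int) -> bool:
        """Grant credit only if the whole packet fits in the free buffer."""
        if self.reserved + cells > self.buffer_cells:
            return False
        self.reserved += cells
        return True

    def accept(self, seq: int, cell: bytes) -> None:
        self.received[seq] = cell

    def reassemble(self) -> bytes:
        """Rebuild the packet in sequence order and release its buffer space."""
        data = b"".join(self.received[s] for s in sorted(self.received))
        self.reserved -= len(self.received)
        self.received.clear()
        return data


def send(packet: bytes, receiver: Receiver, num_links: int):
    """Send one packet; returns cells carried per link, or None if no credit."""
    cells = cellify(packet)
    if not receiver.request_credit(len(cells)):
        return None  # sender must hold the packet, so the fabric never overflows
    links = [[] for _ in range(num_links)]
    for seq, cell in enumerate(cells):
        links[seq % num_links].append((seq, cell))  # spray round-robin
    for link in links:  # cells may arrive out of order across links
        for seq, cell in link:
            receiver.accept(seq, cell)
    return [len(link) for link in links]


if __name__ == "__main__":
    rx = Receiver(buffer_cells=4)
    payload = bytes(1000)
    print("cells per link:", send(payload, rx, num_links=3))
    print("second send before drain:", send(payload, rx, num_links=3))
    print("reassembled ok:", rx.reassemble() == payload)
```

The point of the design is visible in the second `send`: a sender without credit waits at the edge instead of pushing traffic into a congested fabric.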
Oct 29, 2025 8 tweets 6 min read
CMP (Chemical-Mechanical Polishing) is a type of planarization process that uses a slurry to thin or polish the wafer surface to achieve a smooth, mirror-like finish. As early as 1980, CMP was developed by IBM specifically as a technique for dielectric planarization.
Alongside wafer edge grinding, etching, dielectric deposition, metal deposition and other thin-film steps, CMP is commonly used throughout the process.

There are several applications of CMP, including copper interconnects, removal of USG (undoped silicate glass) films formed during STI (shallow trench isolation), and polysilicon removal on DRAM surfaces. (1/8)🧵

Interestingly, the use of this technique for wafer surface planarization was initially unexpected. The reason is straightforward: in traditional semiconductor processing, direct contact with the wafer surface is strictly prohibited, as it can cause defects and particle contamination, which in turn reduce manufacturing efficiency and lower yield. However, it has now been proven that this technique not only enables surface planarization but also reduces defect density and improves yield. (2/8)
Oct 26, 2025 11 tweets 7 min read
Etching is a process used to remove material from the wafer surface to meet the design requirements of an integrated circuit (IC).
There are two types of etching: one is patterning etching, which removes material in specified areas, such as transferring patterns from a photoresist or hard mask layer onto the substrate film. Another type is blanket etching, which removes the entire surface film to meet process requirements, for example, backside wafer etching. (1/11) 🧵

Etching can also be categorized into two types based on characteristics: wet etching and dry etching. Wet etching is typically performed at room temperature, requiring no additional vacuum equipment, RF systems, or gas delivery setup. The process is relatively easy to control, making the equipment significantly cheaper than that used for dry etching. Below, we will introduce each in detail. (2/11)
Oct 18, 2025 7 tweets 4 min read
AWS believes that its custom K2v5/6 NIC with its in-house EFA protocol has better perf than NVIDIA ConnectX-7/8 NICs, but because NVIDIA racks are increasingly tightly integrated, it is becoming increasingly difficult for hyperscalers to use their own NICs. This is what led AWS's GB300 NVL72 to disaggregate its NICs from the compute tray into a NIC-only sidecar called "JBOK". Below we break down the decisions and constraints that led to this design. 👇 1/N 🧵

For GB200, AWS only supported GB200 NVL36x2 and NVL36. Connecting 2 NVL36 racks with NVLink ACC cables allowed up to 72 GPUs per NVLink domain while keeping each rack at 66kW of power with 2U compute trays. As many GCP & AWS customers have noticed, NVIDIA's driver & physical engineering support for NVL36x2 has been lackluster, with way more bugs than their standalone NVL72 design. Although AWS markets their NVL36x2 as "NVL72", it is not topologically equivalent to an actual NVL72. 2/N🧵
Oct 9, 2025 8 tweets 4 min read
China’s State Council on October 9 approved Order No. 61 of 2025, announcing export controls on certain overseas rare-earth items. This marks the fourth round of rare-earth export restriction efforts; the previous round was on April 8.
(1/8)🧵

China’s new rare earth export controls focus on the following key points:
⚆ Products containing Samarium (Sm), Dysprosium (Dy), or Gadolinium (Gd) originating from China that account for 0.1% or more of the item’s value must obtain a dual-use export license.
⚆ Rare earth materials are not permitted for military use.
⚆ Exports related to the R&D or production of sub-14 nm logic chips, 256-layer-plus memory chips, semiconductor equipment, or AI with potential military use will now require case-by-case approval.
(2/8)
Oct 8, 2025 9 tweets 3 min read
Looking closer at the Intel – NVIDIA partnership shows no vote of confidence in Intel Foundry! The deal primarily drives demand in Intel Products, with minimal NVIDIA IP fabbed on Intel nodes. While the deal is negative for ARM in datacenter and AMD in PC, Intel Foundry does not gain external revenue either. (1/9) 🧵

On datacenter chips: Intel will sell x86 CPUs to NVIDIA. NVIDIA will integrate them into superchips (such as the Grace Blackwell superchip board shown) and sell them in rack-scale NVL72 systems. Superchip means this is an alternative to Grace/Vera for enterprise customers who have to rely on x86. (2/9)
Oct 8, 2025 7 tweets 3 min read
At COMPUTEX this May, NVIDIA announced plans to establish its Constellation headquarters in Taiwan. However, the project now faces uncertainty. (1/7)🧵

The proposed site for the Taiwan HQ was the T17 and T18 plots in the Beitou-Shilin Technology Park. NVIDIA had signed a Memorandum of Understanding (MOU) with Shin Kong Life Insurance, a Taiwanese company with total assets exceeding USD 100 billion, but the MOU expired on September 30 and is no longer valid. (2/7)