SemiAnalysis
Jan 31 6 tweets 3 min read
Backend networking architecture is instrumental to networking ownership cost analysis. How does networking architecture matter for networking ownership costs? In general, we observe the following trends:

🟠Hyperscalers typically get preferential terms on networking equipment compared to neoclouds;

🟠Infiniband-based networks are more expensive than Ethernet-based networks for the same networking type and cluster size;

🟠3-Layer networks would be more expensive than 2-Layer networks on a per rack server (1 rack server = 72 GPUs in the case of GB200 and GB300) basis; and

🟠All else constant, GB300 networks running on CX-8 NICs are expected to be more expensive than GB200 networks running on CX-7 NICs on a per rack server basis because of the doubling of per GPU bandwidth.
(1/6)🧵 Before we dive into several examples of how architecture influences networking costs, we define the following terminology used to describe networking clusters:

🟠Network layers: Number of switch layers required to connect all GPUs within the same cluster, typically 2 or 3 layers

🟠Rails: Number of pathways into which a server tray can be split, which equals the number of GPUs per server tray (4 in the case of GB200 or GB300 deployments)

🟠Planes: Number of pathways into which a NIC can be split

🟠Attach Rate: GPUs per networking component (or the reverse)
(2/6)
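To make the terminology concrete, here is a minimal sketch of how layer count and attach rate fall out of cluster size. It assumes a non-blocking fat-tree built from identical switches, one network port per GPU, and a hypothetical 64-port switch radix. None of these numbers come from the thread, and the function names are illustrative only.

```python
import math

GPUS_PER_RACK = 72   # GB200 / GB300 rack server, per the thread
RAILS = 4            # GPUs per server tray, per the thread


def max_gpus(radix: int, layers: int) -> int:
    """Endpoints a non-blocking fat-tree of radix-port switches can host."""
    if layers == 2:
        return radix * radix // 2   # leaf-spine: radix leaves x radix/2 downlinks
    if layers == 3:
        return radix ** 3 // 4      # classic 3-tier fat-tree bound
    raise ValueError("only 2 or 3 layers are modeled")


def min_layers(num_gpus: int, radix: int) -> int:
    """Smallest number of switch layers that can connect num_gpus endpoints."""
    for layers in (2, 3):
        if num_gpus <= max_gpus(radix, layers):
            return layers
    raise ValueError("cluster too large for a 3-layer network at this radix")


def two_layer_switches(num_gpus: int, radix: int) -> int:
    """Leaf plus spine switch count for a non-blocking 2-layer network."""
    down = radix // 2                        # half the ports face the GPUs
    leaves = math.ceil(num_gpus / down)
    spines = math.ceil(leaves * down / radix)
    return leaves + spines


def attach_rate(num_gpus: int, num_components: int) -> float:
    """GPUs per networking component."""
    return num_gpus / num_components


if __name__ == "__main__":
    radix = 64                               # assumed switch radix
    gpus = 16 * GPUS_PER_RACK                # a 16-rack cluster
    switches = two_layer_switches(gpus, radix)
    print("trays per rack:", GPUS_PER_RACK // RAILS)
    print("layers needed:", min_layers(gpus, radix))
    print("GPUs per switch:", round(attach_rate(gpus, switches), 2))
```

Under these assumptions, a cluster that outgrows the 2-layer bound needs a third switch layer, which lowers the GPUs-per-switch attach rate. That is one way to read the claim that 3-layer networks cost more per rack server.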
Jan 29 9 tweets 5 min read
Ever wonder how an Nvidia GB200 NVL72 gets from the factory to the data center? You can’t just throw a state-of-the-art 3,000 lb rack (compared to ~500 lb for a CPU-based server) on a standard FedEx truck. Moving AI infrastructure is more like transporting a heart for transplant than shipping electronics. (1/9) 🧵
A fully loaded AI rack can weigh up to 3,700 lbs. That’s the weight of a Ford Explorer concentrated into a 2x4 foot footprint. Standard pallets would crush instantly. These AI racks require custom-engineered reinforced bases with shock-absorbent foam just to keep them from tearing through the floor. (2/9)
Jan 28 4 tweets 3 min read
When people think about semiconductor manufacturing, many picture highly automated robotic arms and overhead transport systems. However, the true unsung heroes behind high‑volume chip production are metrology and inspection. If you cannot see it, you cannot manufacture it; and if you cannot measure it, you cannot achieve yield. (1/4) 🧵
Although these two terms are often discussed together, they actually refer to two distinct domains.

Metrology:
The core of metrology is accuracy and consistency.
It involves ultra‑precise measurements of critical dimensions, film thickness, surface topology, and overlay alignment in advanced, highly scaled manufacturing. Metrology ensures that interconnects and billions of transistors conform precisely to their intended physical and design specifications.

Inspection:
The core of inspection is defect capture and yield improvement.
It relies on optical or e‑beam scanning to identify particles, scratches, pattern defects, or electrical anomalies on the wafer surface. Inspection determines whether a fab can catch defective wafers at the earliest possible stage—before hundreds of thousands of dollars of processed silicon turn into scrap. (2/4)
Why have metrology and inspection become so critical in advanced nodes?
Jan 22 6 tweets 2 min read
The U.S. wants 40% of chips made onshore. But quietly, the equipment that makes those chips is moving offshore fast. This gap matters just as much as the chips themselves. (1/6) 🧵
In 2022 we flagged Lam Research expanding in Malaysia. Today, most of its high-volume production is there, not the U.S. That trend has only accelerated. (2/6) newsletter.semianalysis.com/p/lam-research…
Jan 3 5 tweets 3 min read
Rolling into the new year, two of the Six Tigers quietly filed their IPO prospectuses and will start trading in early January if all goes well. We finally get a glimpse into audited financials of foundation model labs. TLDR: Building Machine God Ain't Cheap. (1/5)🧵
MiniMax (0100 HK) and Z.ai (aka Knowledge Atlas, fka ZhiPu; 2513 HK) both give a glimpse into the economics of an AI Lab, demonstrating strong product momentum as well as a flagrant disregard for profitability. (2/5) 🔥📉
Dec 23, 2025 4 tweets 1 min read
If you want to power a datacenter off the grid, a gas turbine is the "obvious" choice. But it might not be the best option! Many developers select reciprocating engines for a reason. (1/4)🧵
A recip is more modular than a turbine, happier at partial loads, and simpler to maintain. You're mostly changing lubricants, whereas a turbine requires no maintenance...until it needs a massive overhaul. (2/4)
Dec 4, 2025 8 tweets 2 min read
Massive IT load growth. A transforming electric grid. Five-year lead times for turbines. Why not build more of them?

Well, GE and Siemens have seen this story before. (1/8)🧵
Back in the '90s, parts of the American electric grid were "deregulating." These reforms gave us commodity markets for electricity--aka ISOs and RTOs. INDEPENDENT POWER PRODUCERS (IPPs), often utilities from other states, could build and run their own power plants and make money on these new electricity markets. Their generator of choice? The COMBINED CYCLE GAS PLANT (CCGT), particularly the then-new F-CLASS. (2/8)
Dec 3, 2025 7 tweets 4 min read
A semiconductor is a material whose electrical conductivity lies between that of a conductor and an insulator. To achieve this property, doping is applied to a silicon wafer to adjust its electrical characteristics. (1/7)🧵
Before the 1970s, doping was performed through thermal diffusion in high-temperature furnaces.
Process steps:
⚆ Pre-deposition: An oxide-based dopant film is deposited on the wafer surface.
⚆ Oxidation: The dopant oxide is driven into the growing silicon dioxide layer.
⚆ Doped region formation: The doped area forms and reaches the desired concentration and depth.
⚆ Wet etching: The oxide layer is removed using a wet etching process. (2/7)
Nov 14, 2025 4 tweets 2 min read
The economics of AI has been a big question mark in many investors' minds - What does the value chain look like? How do you model out the ROIC of AI? What would the ROIC look like?

We built up an end-to-end economics stack to answer this question - how we go from a chip’s silicon cost, through full system integration, all the way down to the dollar cost per million inference tokens. (1/4)🧵
At the top of the stack, our accelerator analysis starts with the semiconductor bill of materials (transistors, packaging, HBM, and yield assumptions) to determine GPU provider content. From there, our BoM and ODM modeling breaks down every component inside the server. The network topology model then maps how these servers interconnect. (2/4)
Nov 4, 2025 7 tweets 3 min read
Qualcomm and MediaTek are in a race to reduce their dependency on the mature smartphone market. Both are still managing to beat unit growth in smartphones. But that won't last long. Investors are looking for their progress in non-smartphones. Qualcomm's non-smartphone chip business hit a $10B+ annual run-rate, contrasting with MediaTek's $8B+. (1/7) 🧵
Both have increased their investments to capture more revenue in consumer, networking, industrial and computing markets. Non-smartphones account for 30% of Qualcomm's semiconductor revenue and 48% of MediaTek's. Qualcomm has a target of $22B non-smartphone chip revenue by FY29 at a 5-year CAGR of 21%. Qualcomm built a strong moat in autos but made mixed progress in IoT (a collection of end markets including PC, consumer, networking and infrastructure). (2/7)
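As a rough sanity check on the growth target quoted above, the starting revenue implied by a target and a compound annual growth rate can be backed out with the standard CAGR formula. This is only an arithmetic sketch: the thread does not state the base year or base figure, so the result is an implied value, not a reported one, and the function name is illustrative.

```python
def implied_base(target: float, cagr: float, years: int) -> float:
    """Starting value that grows to `target` after `years` at rate `cagr`."""
    return target / (1 + cagr) ** years


if __name__ == "__main__":
    # $22B target, 21% CAGR over 5 years (figures quoted in the thread)
    base = implied_base(22.0, 0.21, 5)
    print(f"implied starting revenue: ${base:.2f}B")
```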
Oct 30, 2025 6 tweets 2 min read
AI workloads are characterized by elephant flows when all of the GPUs in a cluster exchange data through collective communication operations to synchronize data for distributed workloads. These flows can often lead to congestion and load balancing issues. (1/6)🧵
To solve this problem, Meta turned to the use of Disaggregated Scheduled Fabrics (DSFs). Being “Scheduled” means that a credit-based system is used to control flows and prevent congestion – before a node can send packets across the network, it must first send a credit request towards the receiving node to make sure that the receiving end has enough buffer to receive the packet. These packets also travel over a fabric that cellifies the packets, breaking them into smaller cells and spreading them across multiple routes in the fabric. (2/6)
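The credit-then-send idea described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Meta's implementation: the 256-byte cell size, round-robin spraying, and every class and function name here are hypothetical choices made for the example.

```python
from dataclasses import dataclass

CELL_BYTES = 256  # assumed cell size, for illustration only


@dataclass
class Receiver:
    buffer_bytes: int   # total receive buffer
    reserved: int = 0   # bytes already promised to senders

    def request_credit(self, nbytes: int) -> bool:
        """Grant credit only if the buffer can absorb the whole packet."""
        if self.reserved + nbytes > self.buffer_bytes:
            return False
        self.reserved += nbytes
        return True

    def drain(self, nbytes: int) -> None:
        """Free buffer space once the receiver has consumed data."""
        self.reserved = max(0, self.reserved - nbytes)


def cellify(packet: bytes, cell_bytes: int = CELL_BYTES) -> list:
    """Break a packet into fixed-size cells (the last one may be shorter)."""
    return [packet[i:i + cell_bytes] for i in range(0, len(packet), cell_bytes)]


def spray(cells: list, num_links: int) -> list:
    """Spread sequence-tagged cells round-robin across the fabric links."""
    links = [[] for _ in range(num_links)]
    for seq, cell in enumerate(cells):
        links[seq % num_links].append((seq, cell))
    return links


def reassemble(links: list) -> bytes:
    """Restore the original packet by ordering cells on their sequence tag."""
    tagged = sorted((t for link in links for t in link), key=lambda t: t[0])
    return b"".join(cell for _, cell in tagged)


def send(packet: bytes, receiver: Receiver, num_links: int):
    """Send only after credit is granted; return None if the sender must wait."""
    if not receiver.request_credit(len(packet)):
        return None
    return spray(cellify(packet), num_links)


if __name__ == "__main__":
    rx = Receiver(buffer_bytes=1024)
    pkt = bytes(range(256)) * 3            # 768-byte packet
    links = send(pkt, rx, num_links=4)
    print("cells per link:", [len(link) for link in links])
    print("second send granted:", send(pkt, rx, 4) is not None)
```

In this toy model congestion is avoided at the source: the second packet is never injected into the fabric because the receiver cannot guarantee buffer space for it.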
Oct 29, 2025 8 tweets 6 min read
CMP (Chemical-Mechanical Polishing) is a type of planarization process that uses a slurry to thin or polish the wafer surface to achieve a smooth, mirror-like finish. As early as 1980, CMP was developed by IBM specifically as a technique for dielectric planarization.
Aside from wafer edge grinding, etching, dielectric deposition, metal deposition and other thin films, CMP is used commonly throughout the process.

There are several applications of CMP, including copper interconnects, removal of USG (undoped silicate glass) films formed during STI (shallow trench isolation), and polysilicon removal on DRAM surfaces. (1/8)🧵
Interestingly, the use of this technique for wafer surface planarization was initially unexpected. The reason is straightforward: in traditional semiconductor processing, direct contact with the wafer surface is strictly prohibited, as it can cause defects and particle contamination, which in turn reduce manufacturing efficiency and yield. However, it has now been proven that this technique not only enables surface planarization but also reduces defect density and improves yield. (2/8)
Oct 26, 2025 11 tweets 7 min read
Etching is a process used to remove material from the wafer surface to meet the design requirements of an integrated circuit (IC).
There are two types of etching: one is patterning etching, which removes material in specified areas, such as transferring patterns from a photoresist or hard mask layer onto the substrate film. The other is blanket etching, which removes the entire surface film to meet process requirements, for example, backside wafer etching. (1/11) 🧵
Etching can also be categorized into two types based on characteristics: wet etching and dry etching. Wet etching is typically performed at room temperature, requiring no additional vacuum equipment, RF systems, or gas delivery setup. The process is relatively easy to control, making the equipment significantly cheaper than that used for dry etching. Below, we will introduce each in detail. (2/11)
Oct 18, 2025 7 tweets 4 min read
AWS believes that its custom K2v5/6 NIC with its in-house EFA protocol performs better than NVIDIA ConnectX-7/8 NICs, but because NVIDIA racks are so tightly integrated, it is becoming increasingly difficult for hyperscalers to use their own NICs. This is what led AWS to disaggregate the NICs in its GB300 NVL72 from the compute tray into a NIC-only sidecar called "JBOK". Below we break down the decisions and constraints that led to this design. 👇 1/N 🧵
For GB200, AWS only supported GB200 NVL36x2 and NVL36, which allowed up to 72 GPUs per NVLink domain while keeping each rack at 66kW of power with 2U compute trays, by connecting two NVL36 racks with NVLink ACC cables. As many GCP & AWS customers have noticed, NVIDIA's driver & physical engineering support for NVL36x2 has been lackluster, with far more bugs than their standalone NVL72 design. Although AWS markets their NVL36x2 as "NVL72", it is not topologically equivalent to an actual NVL72. 2/N🧵
Oct 9, 2025 8 tweets 4 min read
China’s State Council on October 9 approved Order No. 61 of 2025, announcing export controls on certain overseas rare-earth items. This marks the fourth round of rare-earth export restriction efforts; the previous round was on April 8.
(1/8)🧵
China’s new rare earth export controls focus on the following key points:
⚆ Products containing Samarium (Sm), Dysprosium (Dy), or Gadolinium (Gd) originating from China that account for 0.1% or more of the item’s value must obtain a dual-use export license.
⚆ Rare earth materials are not permitted for military use.
⚆ Exports related to the R&D or production of sub-14 nm logic chips, 256-layer-plus memory chips, semiconductor equipment, or AI with potential military use will now require case-by-case approval.
(2/8)
Oct 8, 2025 9 tweets 3 min read
Looking closer at the Intel – NVIDIA partnership shows no vote of confidence in Intel Foundry! The deal primarily drives demand in Intel Products, with minimal NVIDIA IP fabbed on Intel nodes. While the deal is negative for ARM in datacenter and AMD in PC, Intel Foundry does not gain external revenue either. (1/9) 🧵 Image
On datacenter chips: Intel will sell x86 CPUs to NVIDIA. NVIDIA will integrate them into superchips (such as the Grace Blackwell superchip board shown) and sell them in rack-scale NVL72 systems. Superchip means this is an alternative to Grace/Vera for enterprise customers who have to rely on x86. (2/9) Image
Oct 8, 2025 7 tweets 3 min read
At COMPUTEX this May, NVIDIA announced plans to establish its Constellation headquarters in Taiwan. However, the project now faces uncertainty. (1/7)🧵
The proposed site for the Taiwan HQ was the T17 and T18 plots in the Beitou-Shilin Technology Park. NVIDIA had signed a Memorandum of Understanding (MOU) with Shin Kong Life Insurance, a Taiwanese company with total assets exceeding USD 100 billion, but the MOU expired on September 30 and is no longer valid. (2/7)
Oct 7, 2025 5 tweets 4 min read
Physical vapor deposition (PVD) is a deposition process that vaporizes solid materials through heating or sputtering; the resulting vapor condenses on the substrate surface to form a solid thin film. PVD plays a critical role in semiconductor metallization processes.
PVD films generally provide higher deposition quality, lower impurity concentration, and lower resistivity, while CVD films typically offer better step coverage.
The cost of PVD is generally lower than CVD, because PVD operates under milder process conditions (around 200–500 °C) and requires relatively simple equipment. In contrast, CVD requires high-temperature environments and more complex reaction control, resulting in higher equipment and process costs.
(1/5) 🧵
The PVD process typically uses two methods: evaporation and sputtering, with sputtering being the primary technique. This is because sputtering can deposit metal films with high purity and low resistivity, while also providing good uniformity and reliability.
Evaporation
In the early days of IC manufacturing, when aluminum was the only metal used for metallization, thermal evaporation was widely adopted for depositing aluminum films. However, since this process could affect transistors and circuits, it was later replaced by the more familiar electron-beam evaporation.
As shown in the figure, the process must be carried out in a vacuum environment of about 10^-6 Torr to reduce water and oxygen content, thereby preventing the formation of high-resistivity aluminum oxide from reactions with aluminum. A tungsten filament is heated by passing current through it, melting the aluminum and eventually vaporizing it. When the aluminum vapor reaches the wafer surface at the top, it condenses to form an aluminum thin film.
However, filament heating can contaminate the deposited aluminum film with sodium. Even trace amounts of sodium are enough to shift transistor threshold voltages and compromise reliability. As a result, this method is now rarely used outside of academic research institutions.
(2/5)Image
Oct 3, 2025 6 tweets 4 min read
Chemical vapor deposition (CVD) is a process that uses gaseous chemical precursors to undergo chemical reactions on the wafer surface, depositing a solid material as a thin film layer. It is widely utilized across the semiconductor and materials industries for depositing a diverse range of functional films, including:
⚆ Polycrystalline and Epitaxial Silicon Deposition
⚆ Dielectric Deposition: forming various insulating layers, such as oxides, oxynitrides, and low-k dielectrics.
⚆ Conductor Deposition: key metallic and conductive films, including W (Tungsten), Ti (Titanium), and Cu (Copper).
(1/6) 🧵
These steps are crucial for controlling the film's properties and uniformity.
⚆ Reactant Delivery: Gaseous precursors are introduced into the reaction chamber, typically mixed with an inert carrier gas (like Ar or N₂), to ensure uniform flow dynamics and deposition.
⚆ Diffusion to the Substrate: The reactants diffuse through the boundary layer and approach the substrate surface.
⚆ Surface Adsorption: The gaseous precursors are then adsorbed (physically or chemically bonded) onto the heated surface of the substrate.
⚆ Surface Migration: The adsorbed species migrate (move around) on the substrate surface.
(2/6)
Aug 27, 2025 4 tweets 2 min read
Umami, the "fifth taste," is the deep, savory flavor that gives broths, aged cheeses, and slow-cooked meats their mouth-coating depth. Whether you’re a Michelin-starred chef or a hobbyist home cook, maximizing the umami of your dishes is key to cooking delicious food. Umami is not just important in the kitchen – it is also the base of today’s high performance processors such as those in GPU servers. Click below to learn more about the seemingly unlikely relationship between the fifth taste and high performance chips.🧵
We are of course talking about Ajinomoto Build-up Film (‘ABF’). This is the dielectric insulator film that goes into the organic package substrate of most modern processors today. How did it come from Japanese seasonings heavyweight Ajinomoto?

Umami was first discovered in 1908 by Kikunae Ikeda while studying kombu broth. Umami’s secret lies in glutamic acid. This is the very compound that Japanese seasoning company Ajinomoto would later crystallize as MSG (monosodium glutamate).
Aug 11, 2025 14 tweets 5 min read
It's nice to see that OpenAI has updated their chart crime to accurately reflect the size of the 69% SWE-bench Verified score in their bar chart, and the achievement of GPT-5 at 74.9%

However, there is more to the story. OpenAI isn't running all 500 tests in SWE-bench Verified. 🧵Image
Image
What is 74.9% of 500? 374.5 of 500 correct? If we look at the subscript, OpenAI clearly says that they have only run 477 of the total 500 tests in the SWE-bench Verified dataset. Why? Image
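The arithmetic behind this question is easy to reproduce. The sketch below simply converts a reported percentage into an implied number of passing tests for a given test-set size. The function name is illustrative, and since a reported percentage is rounded, the implied count need not be a whole number.

```python
def implied_correct(score_pct: float, num_tests: int) -> float:
    """Number of passing tests implied by a percentage score."""
    return score_pct / 100 * num_tests


if __name__ == "__main__":
    for n in (500, 477):   # full SWE-bench Verified vs. the subset actually run
        print(f"74.9% of {n} = {implied_correct(74.9, n):.3f}")
```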