This key chart from @SemiAnalysis_ appears to have been the key source for claims of "50,000 Hoppers" and more detailed disclosure on their CapEx buildup analysis ("$1.3B").
But the table has errors/inconsistencies. More significantly, key assumptions don't pass sanity checks.
1⃣ First thing you might notice is that it says 60,000 in the Total column.
But the A100s aren't "Hoppers". So the 50,000 is just the last three columns.
Ok, so far so good.
This is where it starts to get confusing.
The Total column only seems to SUM the first three columns.
Also "ASP" and "per GPU" are average numbers, so you cannot just sum it up. You need to do a weighted average. So the Total figures make no sense.
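To make the weighted-average point concrete, here is a minimal sketch. The GPU counts and prices are made-up placeholders, not the figures from the SemiAnalysis table; the labels "A", "B", "C" are hypothetical too.

```python
# Hypothetical fleet: label -> (gpu_count, average selling price per GPU in USD).
# Placeholder numbers for illustration only, NOT taken from the chart.
fleet = {
    "A": (10_000, 10_000),
    "B": (20_000, 20_000),
    "C": (30_000, 30_000),
}


def naive_sum_of_asps(fleet):
    """What a bare SUM() across an ASP row produces: not a meaningful figure."""
    return sum(asp for _, asp in fleet.values())


def weighted_average_asp(fleet):
    """Blended ASP: total dollars spent divided by total units bought."""
    total_units = sum(count for count, _ in fleet.values())
    total_spend = sum(count * asp for count, asp in fleet.values())
    return total_spend / total_units


print(naive_sum_of_asps(fleet))     # sum of three averages
print(weighted_average_asp(fleet))  # spend-weighted blended price
```

The only number that belongs in a Total cell for a per-unit row is the second one: total spend over total units.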
While the top "# of GPUs" line sums up all four columns, down below only the first three columns are added together.
So the $1.3B capex figure doesn't include the H100s?
Careless formula error? Or do we read more into it?
And then this last line seems to be a sum of Server CapEx and "Cost to Operation".
But TCO (4y Ownership) implies that it should be a per year figure (the label says "$m/hr").
I think it should be the sum of the Server CapEx + Cost to operation divided by 4 but hard to say.
Anyway, I put together what I think is a corrected version of this chart. If we are counting all 60,000 chips, it is $1.6B of CapEx and $640M p.a. of TCO.
If we are only doing "Hoppers" then it is $1.4B of CapEx and $545M p.a. of TCO.
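A quick sketch of the TCO arithmetic, assuming my guess is right that the line should be (Server CapEx + cost to operate) / 4, and assuming the operating cost is a 4-year total. Both are assumptions on my part; the function names are just for illustration. Inverting the formula shows what operating cost the corrected figures would imply.

```python
def annual_tco(server_capex_m, operating_cost_m, years=4):
    """Per-year TCO in $M: (CapEx + operating cost over the period) / years."""
    return (server_capex_m + operating_cost_m) / years


def implied_operating_cost(server_capex_m, annual_tco_m, years=4):
    """Invert annual_tco to back out the operating cost over the whole period."""
    return annual_tco_m * years - server_capex_m


# Corrected-chart figures in $M: (CapEx, TCO p.a.)
all_chips = implied_operating_cost(1_600, 640)     # all 60,000 chips
hoppers_only = implied_operating_cost(1_400, 545)  # Hoppers only

print(all_chips, hoppers_only)
```

If the original table meant something else by that last line, these implied figures change accordingly.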
2⃣ Ok all these might just be chalked up to basic first-year analyst spreadsheet errors and may or may not impact the ultimate analysis.
More substantively ... how credible is the "50,000 Hoppers" estimate in the first place?
The article links to a proprietary "Accelerator Model" that is paywalled so difficult for me to confirm rationale here beyond pure speculation ...
... but what I can do is run a sanity check based on basic understanding of the economics of fund management.
Some people out there are saying High Flyer managed $8B in funds, implicitly assuming that those sums could support such a large CapEx number.
But that's not how the hedge fund business works.
What we know about High Flyer and DeepSeek:
▪️ High-Flyer was a quant fund with $8 billion in AUM.
▪️ DeepSeek was "self funded" by High-Flyer.
Can a $7B AUM hedge fund self fund $1.6B of capex? Highly unlikely.
The economics of a hedge fund are typically a management fee and performance fees.
1%/20% is typical.
So on $7B, High Flyer would generate an estimated $70M in management fees.
Performance fees are calculated only on gains. Note below High Flyer's performance of ~13% annualized since 2017.
However, note also how returns have basically been down since 2021. There are unlikely to have been significant performance fees since 2021.
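The fee math can be sketched roughly like this. It is a simplification: it ignores high-water marks, hurdle rates, and the fact that AUM moves during the year, and the 13% input in the second call is only there to show the mechanics, not a claim about any specific year.

```python
def fund_fee_revenue(aum_m, annual_return, mgmt_fee=0.01, perf_fee=0.20):
    """Return (management_fee, performance_fee) in $M for one year.

    Simplified model: management fee is charged on AUM, performance fee
    only on positive gains. No high-water mark or hurdle (assumption).
    """
    management = aum_m * mgmt_fee
    gains = max(aum_m * annual_return, 0.0)  # no performance fee on losses
    performance = gains * perf_fee
    return management, performance


# Flat year on $7B: management fee only
print(fund_fee_revenue(7_000, 0.0))

# Illustrative good year (13% return) on the same AUM
print(fund_fee_revenue(7_000, 0.13))
```

In a flat or down year the second number is zero, which is the whole point about the post-2021 period.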
So while strong fund performance through '21 (albeit likely on much lower AUM as it was ramping up) could arguably have funded the reported purchase of ~¥1B in GPUs in 2021, it is unlikely that the hedge fund itself could have continued self-funding that level of CapEx going forward.
$130M is already an extreme amount of CapEx for a $7B fund.
Blackstone, which has more than 100x the AUM (which drives revenue), has an annual capex budget of ~$250M.
Goldman Sachs generates close to 1,000x the revenue of High Flyer, and has an annual CapEx budget of ~$2.5B.
Similarly companies like Alibaba, Baidu and Bytedance generate tens of billions in revenue, orders of magnitude above High Flyer.
They can afford to spend billions buying nVidia chips and building out their own internal datacenters.
There is absolutely no way an $8 billion fund (with flat/negative returns over the period) could have "self-funded" another $1.6 billion in CapEx.
You know what it could have reasonably funded? A 2,048-H800 datacenter worth ~$70M ...
... and even here that is quite an extreme CapEx ratio for a fund generating a total of ~$70M (maybe) of management fees that need to pay for fund operations themselves.
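Putting the ratios side by side, as a rough sketch. The High Flyer revenue here is management fees only (~$70M), and the Goldman revenue is simply the "close to 1,000x" multiple applied to that figure, so it is an implied number from this thread rather than a reported one.

```python
def capex_to_revenue(capex_m, revenue_m):
    """CapEx as a multiple of annual revenue (both in $M)."""
    return capex_m / revenue_m


HIGH_FLYER_FEES_M = 70                    # ~1% management fee on ~$7B AUM
GOLDMAN_REVENUE_M = 1_000 * HIGH_FLYER_FEES_M  # implied by the ~1,000x multiple (assumption)

ratios = {
    "High Flyer, 2,048 H800 cluster (~$70M)": capex_to_revenue(70, HIGH_FLYER_FEES_M),
    "High Flyer, claimed buildout (~$1.6B)": capex_to_revenue(1_600, HIGH_FLYER_FEES_M),
    "Goldman Sachs annual CapEx (~$2.5B)": capex_to_revenue(2_500, GOLDMAN_REVENUE_M),
}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f}x revenue")
```

Even the small cluster is roughly a full year of management fees; the claimed buildout is more than twenty years' worth.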
So the only possible way that High Flyer could have funded the purchase of even just another 10,000 H800s would have been to raise secret outside funding for DeepSeek, which of course contradicts the article itself.
Of course, there are now rumors of that swirling around — maybe as people figure out the above math — but then we should just be up front that these estimates are based on pure unsubstantiated speculation and just leave it at that.
A model (even one riddled with basic formula errors) is only as good as its assumptions and it looks like the assumptions here of DeepSeek having access to "50,000 Hoppers" to build out v3 are built on an increasingly shaky foundation.
This is what it looks like with DeepSeek's actual reported cluster of 2,048 H800 GPUs.
These still seem high, but are at least within the realm of reason.
High-Flyer Quant Fund CEO Lu Zhenghe disclosed in an interview in 2020 that "70% of annual revenue is reinvested back into research and development" with strong implication that it is mostly production related, and not CapEx.
P.S. A very common Excel mistake is when you add a column and formula doesn't pick it up.
I suspect this is what happened here: Analyst added "H100" column, Total formula didn't pick it up + while top row is easy to spot-check, bottom ones were missed.
P.P.P.S. Call me when AGI can figure out Excel, amirite @abcampbell ?
P.P.P.P.S. This is just a very quick estimate of the lifetime revenue that High Flyer funds would have generated with accompanying assumptions.
~$400 million available for reinvestment into both R&D and CapEx.
As mentioned earlier, this estimated P&L would support the self-funded buildout of the initial data clusters (up to 10,000 A100s) through 2021, but it is hard to see how it could have self-funded anything close to the implied OoM increase in CapEx.
The 10,000 A100s bet was already an extraordinary bet for Liang / High Flyer, with parallels to Elon Musk investing nearly all his PayPal sale proceeds into Tesla + SpaceX.
It's also inconsistent with Quant Fund CEO's comments in 2020 of redirecting reinvestment efforts at R&D (a.k.a. smart people) instead of CapEx.
And yes it makes much more sense that DeepSeek rented from the bigger players and didn’t even own the “2,048 H800s” that they mentioned in the v3 paper.
The H800s that they owned would have been for limited R&D purposes, like trying to hack the PTX code.
So bottom line is I think we violently agree the 50,000 number makes no sense.
@YouJiacheng @angelusm0rt1s @blob_watcher Until they close that loophole
@FarazKh78685502 @dylan522p @SemiAnalysis_ Just to save you the suspense - no Liang doesn’t have a trust fund
Free cash flow is a measure after capital expenditures and incorporates fluctuations in working capital.
Since founding, BYD's modus operandi has been to re-allocate every dollar of operating cashflow + as much capital as it can raise — as non-dilutively as possible — to support the needs of a rapidly growing business.
Frankly, it is financially illiterate to describe re-investment back into a growing business as "losses". Negative cashflow is a cashflow item and — especially if related to CapEx and working capital fluctuations (which I will address below) — is conceptually different from "losses" which is an income statement term.
A better approach is to consider how much long-term capital the company has raised and compare it to the scale of operating capacity that capital has enabled.
We can look at this from BYD's latest balance sheet, which I have summarized here:
To date, BYD has taken in a total of ¥340B in debt and equity funding.
This number includes ~¥82B of ST/LT borrowings and ¥258B of equity (or equity-like) funding.
The equity funding includes ¥107B of "undistributed profit" which is similar in concept to retained earnings (we'll get back to this point in a bit).
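The funding breakdown above reduces to a few lines of arithmetic. A minimal sketch using only the figures quoted here (in ¥B); treating "undistributed profit" as internally generated capital is my reading of the retained-earnings comparison, so take that split as an assumption.

```python
# Figures quoted above, in billions of RMB
borrowings = 82             # ST/LT borrowings
equity = 258                # equity or equity-like funding
undistributed_profit = 107  # part of equity, similar to retained earnings

total_funding = borrowings + equity
external_funding = total_funding - undistributed_profit  # assumption: profit is internally generated
undistributed_share = undistributed_profit / total_funding

print(f"Total funding: ¥{total_funding}B")
print(f"Excluding undistributed profit: ¥{external_funding}B")
print(f"Undistributed profit share of total: {undistributed_share:.1%}")
```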
For all the flak about "lack of a social welfare safety net", China has one of the lowest pension/retirement ages in the world.
Further, it's hard to imagine China (a "loud and proud" socialist country) not investing significantly in its social welfare programs in the coming decades, especially as it has officially crossed the "high income" threshold.
Jonathon highlights what I thought was the most interesting point out of the recent communique.
I tend to look at things from a company/sector perspective, and for me this represented the CCP's effort to adapt the vast administrative bureaucracy to align with the operational realities of shifting sectoral priorities.
Property and infrastructure development were two of the key economic development priorities from the mid-2000s to the early 2020s.
Both property and infrastructure (especially "traditional" infrastructure like highways and bridges) were highly localized in nature. Land is central to both efforts, and land use falls under the jurisdiction of local governments.
Thus, it made sense for executive power to be decentralized to the local governments: Beijing simply cannot effectively manage land development in Guizhou.
This leads to a whole other set of issues, as there is a wide variation in local government competence. The manifestation of these issues has been widely discussed (e.g. LGFVs) but that is not the scope of this thread.
The question here is now that economic development priorities have shifted, how should the bureaucracy adapt from a centralization vs. de-centralization perspective?
And to do that, again, we need to understand how the differentiated nature of the new priority sectors maps against this question of centralized vs. de-centralized administration.
This is important because there is a group of people that insist on confusing/conflating demand with consumption in the China context.
These are meaningfully distinct terms: Consumption is just one component of demand, alongside gross capital formation. The distinction is driven by GDP accounting definitions.
To further clarify, this is what I mean about the distinction between demand (in the context of supply) and "consumption" (in the context of the GDP accounting-driven split between gross capital formation / "investment" and expenditures / "consumption").
I can see that folks are already starting to wildly misinterpret what this chart says, and this seems like another one of these Rorschach tests on China.
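A toy sketch of the distinction, with made-up numbers that are not actual statistics for China or anyone else. It only covers the domestic split described here (consumption plus gross capital formation) and leaves net exports aside.

```python
def domestic_demand(consumption, gross_capital_formation):
    """Domestic demand under GDP expenditure accounting: consumption + investment."""
    return consumption + gross_capital_formation


def consumption_share(consumption, gross_capital_formation):
    """Consumption as a fraction of domestic demand."""
    return consumption / domestic_demand(consumption, gross_capital_formation)


# Hypothetical economy, arbitrary units
consumption = 55
gross_capital_formation = 42

print(domestic_demand(consumption, gross_capital_formation))
print(f"{consumption_share(consumption, gross_capital_formation):.1%}")
```

A low consumption share says how demand is split between the two buckets; it does not by itself say demand is low.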
Let's nip this in the bud: this is IP share of services exports, which comes from Balance of Payments accounting.
That China does not license IP is not an "indictment", it's a statistical quirk that requires some deeper understanding of the BoP and how it maps against real-world trade and investment realities.
This was a complex/nuanced discussion on "overcapacity". Thanks for writing it @wstv_lizzi as it is an important topic.
It presented a number of interesting ideas which make sense on their own but I struggled to tie them together under a "grand unifying narrative" related to the "China Model".
The challenge of the "overcapacity" narrative is trying to use it to summarize the "China Model" into a neat, compact narrative. But summarizing something as complex as China's economy into a neat model is exceedingly difficult.
(as an aside, the piece read like a writer struggling to force-fit an article within pre-defined narratives/framing set by an editor)
Two key problems I've found in the "overcapacity" debate that I'll go into in more detail in this 🧵:
1⃣ Unclear/conflated definition of the term "overcapacity"
2⃣ As you drill down from the macro/national level to the individual sector level, you find many sector-specific idiosyncrasies that contradict core elements of the "grand unifying" theme around "overcapacity".
1⃣ Defining "overcapacity" itself
"Overcapacity" has become a loaded word, especially when described in the context of the broad "China Model" in the current geopolitical environment.
In regular industrial/manufacturing usage, overcapacity is simply a state/condition where capacity utilization is below a certain "normal" threshold. This threshold may vary by sector and under different operating conditions.
Standard capacity utilization is defined not only by physical capital stock, but also by an active labor force operating on a normal shift schedule (typically 2 shifts per day, 5 days per week, or 80 hours / week).
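Here is a minimal sketch of the industry definition. The 8-hour shift is what 2 shifts x 5 days = 80 hours implies; the plant's hourly rate and weekly output are hypothetical placeholders.

```python
HOURS_PER_SHIFT = 8  # implied by 2 shifts/day x 5 days/week = 80 hours/week


def standard_weekly_hours(shifts_per_day=2, days_per_week=5,
                          hours_per_shift=HOURS_PER_SHIFT):
    """Operating hours per week on a normal shift schedule."""
    return shifts_per_day * days_per_week * hours_per_shift


def capacity_utilization(actual_units, units_per_hour, weekly_hours=None):
    """Actual weekly output divided by output at the standard schedule."""
    if weekly_hours is None:
        weekly_hours = standard_weekly_hours()
    return actual_units / (units_per_hour * weekly_hours)


# Hypothetical plant: rated at 10 units/hour, produced 600 units this week
print(standard_weekly_hours())
print(f"{capacity_utilization(600, 10):.0%}")
```

Note that a plant running a third shift or weekends can exceed 100% on this measure, which is why the "normal" threshold depends on the shift convention you pick.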
But the way that it is being used in policy/economic/geopolitical discussion is in a much more undefined/amorphous way that goes well beyond the standard industry/mfg definition.
For instance, in this passage it is implicitly defined as production beyond domestic demand. But the implication here is that Chinese companies should not be able to have free access to global markets.
IOW how the term is used/defined appears to reflect implicit policy objectives of one particular side in the ongoing trade war.
Carefully inserted vocabulary like "deluge" and "sinister" peppered throughout the piece subtly signal how "overcapacity" is being normatively framed.