visited my uncle in shenzhen. he’s a gpu smuggler.
he handed me this modified 5090 turbo and said:
"future of AI inference. 32GB each, 8 cards, 256GB total VRAM, under $30k. huaqiangbei doesn’t wait for nvidia."
huaqiangbei is really wild.💀
here’s what he told me: HGX servers are built for training huge AI models. Power-hungry, liquid-cooled, and crazy expensive. But inference (actually running those models) is a different game:
→ you don’t need nearly as much compute
→ you just need enough VRAM to fit the model
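quick back-of-the-envelope on “enough VRAM” (my own rule of thumb, not his): weights ≈ params × bytes-per-param, plus headroom for KV cache and activations. A sketch in Python, assuming a flat ~20% overhead:

```python
# Rough VRAM estimate for holding a model during inference.
# The 1.2x overhead factor is an assumption (KV cache / activations);
# real usage depends on batch size and context length.
def vram_needed_gb(params_billion: float, bytes_per_param: float, overhead: float = 1.2) -> float:
    return params_billion * bytes_per_param * overhead

print(vram_needed_gb(70, 2))    # 70B @ FP16 -> ~168 GB: needs several cards
print(vram_needed_gb(70, 0.5))  # 70B @ FP4  -> ~42 GB: fits on two 24GB 4090s
```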
That’s why many AI infra builders use traditional x86 + PCIe servers:
• cheaper
• flexible
• easy to scale horizontally
But there’s a problem: consumer GPUs like 4090/5090 are big and awkward—2.5 to 4 slots wide.
Enter the blower-style card: double-slot, front-to-back airflow, server-friendly.
Each generation has one. But NVIDIA hates them.
Why? Because a rack full of 4090 blowers replaces an H100 server at 1/10 the cost.
NVIDIA cripples gaming cards on purpose:
🚫 No NVLink after the 3090
🚫 VRAM capped at 24GB on the 4090 / 32GB on the 5090, a fraction of a datacenter card
🚫 No official blower 4090/5090
So if you want dense GPU inference, you either go broke... or go underground.
In Huaqiangbei, engineers reverse-engineered the blower design.
Now they mass-produce 4090 blowers, unofficial and off-NVIDIA’s radar.
They're shipping globally, and account for 90%+ of all 4090 blowers in the wild.
This has accidentally made the 4090 the go-to choice for inference servers because it’s crazy cost-effective. Sure, it doesn’t have NVLink, but with some software wizardry you can still pool the VRAM across cards: 24GB × 8 = 192GB total, enough to run models under 200 billion parameters, or even bigger ones with FP4 quantization.
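the “software wizardry” is usually just tensor parallelism in an inference engine like vLLM: shard the weights across the cards and sync over PCIe instead of NVLink. A minimal sketch (the model id is a placeholder, not anything specific to these cards):

```python
# Sketch: pooling 8 x 24GB of VRAM via tensor parallelism in vLLM.
# Without NVLink the all-reduce between shards runs over PCIe, which is
# slower, but inference traffic is light enough that it usually works out.
from vllm import LLM, SamplingParams

llm = LLM(
    model="some-org/some-180b-model",  # placeholder model id
    tensor_parallel_size=8,            # shard weights across all 8 cards
)

outputs = llm.generate(
    ["Explain why LLM inference is VRAM-bound."],
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```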
Huaqiangbei takes it even further. They’ve figured out how to mod 4090s up to 48GB of VRAM, which means you can build an inference server with 384GB of VRAM for under $50k. And right now all those huaqiangbei GPU bros are busy producing blower-style 5090 cards, which my uncle believes will become the next big thing for affordable, high-performance inference servers.
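back-of-the-envelope on what those VRAM budgets buy (same rough ~20% overhead assumption as above, my math, not his):

```python
# Largest model (in billions of params) whose weights fit a VRAM budget,
# assuming ~20% overhead for KV cache / activations.
def max_params_billion(vram_gb: float, bytes_per_param: float, overhead: float = 1.2) -> float:
    return vram_gb / (bytes_per_param * overhead)

for vram in (192, 384):  # 8x24GB stock 4090s vs 8x48GB modded
    for name, bpp in (("FP16", 2), ("FP8", 1), ("FP4", 0.5)):
        print(f"{vram}GB @ {name}: ~{max_params_billion(vram, bpp):.0f}B params")
```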