visited my uncle in shenzhen. he’s a gpu smuggler.
he handed me this modified 5090 turbo and said:
"future of AI inference. 32GB each, 8 cards, 256GB total VRAM, under $30k. huaqiangbei doesn’t wait for nvidia."
huaqiangbei is really wild.💀
here’s what he told me: HGX servers are designed for training huge AI models: power-hungry, liquid-cooled, and crazy expensive. But for inference (running those models), it’s a different game:
→ You don’t need as much compute
→ You just need enough VRAM to fit the model
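quick sanity check on the “enough VRAM to fit the model” part. this is my own back-of-the-envelope sketch, not his numbers: i’m assuming fp16 weights at ~2 bytes per parameter, ~20% headroom for KV cache and activations, and the model sizes are just illustrative.

def fits_in_vram(params_billion, bytes_per_param, vram_gb, headroom=1.2):
    # weights: 1B params at 2 bytes (fp16) is roughly 2 GB; headroom covers KV cache/activations
    weights_gb = params_billion * bytes_per_param
    return weights_gb * headroom <= vram_gb

total_vram_gb = 8 * 32  # 8 modded 5090s, 32 GB each -> 256 GB pooled

print(fits_in_vram(70, 2.0, total_vram_gb))   # 70B fp16: ~140 GB * 1.2 = 168 GB -> True
print(fits_in_vram(180, 2.0, total_vram_gb))  # 180B fp16: ~360 GB * 1.2 = 432 GB -> False
print(fits_in_vram(180, 1.0, total_vram_gb))  # 180B int8: ~180 GB * 1.2 = 216 GB -> True

point being: 256GB of pooled VRAM comfortably holds a 70B model in fp16, and quantization gets you into 100B+ territory. that’s the whole pitch.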