May 14, 2025 • 8 tweets • 2 min read
visited my uncle in shenzhen. he’s a gpu smuggler.
he handed me this modified 5090 turbo and said:
"future of AI inference. 32GB each, 8 cards, 256GB total VRAM, under $30k. huaqiangbei doesn’t wait for nvidia."
huaqiangbei is really wild.💀
here’s what he told me: HGX servers are designed for training huge AI models: power-hungry, liquid-cooled, and crazy expensive. But for inference (running those models), it’s a different game:
→ You don’t need as much compute
→ You just need enough VRAM to fit the model
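a rough back-of-the-envelope sketch of the "enough VRAM to fit the model" point: weights-only math at a given precision, ignoring KV cache and activation overhead. the model size, precision, and card counts below are illustrative assumptions, not numbers from the thread.

```python
# rough weights-only estimate: does a model fit in aggregate VRAM for inference?
# (ignores KV cache / activations, which add real overhead in practice)
def fits_in_vram(params_billion: float, bytes_per_param: float,
                 cards: int, gb_per_card: float) -> bool:
    weights_gb = params_billion * bytes_per_param  # billions of params * bytes/param ~= GB
    total_gb = cards * gb_per_card
    print(f"weights ~{weights_gb:.0f}GB vs {total_gb:.0f}GB total VRAM")
    return weights_gb <= total_gb

# e.g. a 70B model in FP16 (2 bytes/param) on 8x 32GB modded cards: ~140GB <= 256GB
fits_in_vram(70, 2.0, 8, 32)
# a single 32GB card can't hold it on its own
fits_in_vram(70, 2.0, 1, 32)
```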