wordgrammer
I am looking to build a large GPU cluster. 6’3 btw
Jan 27 21 tweets 5 min read
You all seemed to like my breakdown of DeepSeek’s technical reports. Here’s another DeepSeek thread, this time on company culture.

Did China work harder than the US? Are quants better AI researchers than techbros? Has China surpassed the US in innovation? DeepSeek is a subsidiary of High-Flyer, a Chinese hedge fund. However, they are a very new hedge fund. They are disrupters in the Chinese market.
Jan 27 9 tweets 2 min read
Okay. Thanks for the nerd snipe, guys. I spent the day learning exactly how DeepSeek trained at 1/30 the price, instead of working on my pitch deck. The tl;dr to everything, according to their papers:

Q: How did DeepSeek get around export restrictions?

A: They didn’t. They just tinkered around with their chips to make sure they handled memory as efficiently as possible. They lucked out, and their perfectly optimized low-level code wasn’t actually held back by chip capacity.
Jan 26 7 tweets 2 min read
Okay, “how did DeepSeek get around Nvidia’s export restrictions?” Here’s a philosophy major’s thoughts on the situation.
To understand what DeepSeek pulled off, we first have to understand what exactly the export restrictions do. Are they actually even that bad? Are the GPUs we sell to the Chinese actually that much worse than the US GPUs?
Dec 22, 2024 4 tweets 1 min read
It would be extremely funny if, after all resources are added up (GPUs and electricity for AI, food and water and education for humans), the cost of AGI is exactly the same as the cost of human intelligence

This is actually something I mostly expect will happen. I don’t see any prima facie reason why “intelligence per unit of resource” would be higher for carbon than silicon, in completely optimal scenarios. And humans are fairly well-optimized