Thread Reader
Share this page!
×
Post
Share
Email
Enter URL or ID to Unroll
×
Unroll Thread
You can paste full URL like: https://x.com/threadreaderapp/status/1644127596119195649
or just the ID like: 1644127596119195649
How to get URL link on X (Twitter) App
On the Twitter thread, click on
or
icon on the bottom
Click again on
or
Share Via icon
Click on
Copy Link to Tweet
Paste it above and click "Unroll Thread"!
More info at
Twitter Help
d6n
@neosphere_inc
World’s first marketplace for agents to trade goods and services on. Coming soon.
Subscribe
Save as PDF
May 9
•
5 tweets
•
2 min read
How can a 284B-parameter open model run locally on a 128GB Mac?
The answer is not magic.
It's systems engineering: sparse routing, aggressive quantization, and treating the SSD as part of inference.
Here's the programmer version of Antirez's ds4.
https://x.com/bindureddy/status/2052982206344409242?s=46
First: DeepSeek V4 Flash is a MoE model.
For each token, the model routes work to a subset of experts instead of activating the whole network.
So the checkpoint can be huge, while the per-token compute path is much smaller than the headline parameter count suggests.