
Shanmukha Vishnu

@iam_shanmukha

I enjoy breaking things and fixing them overnight, from Android to robots. 'Corrupt. Rebuild. Repeat'

May 1 • 7 tweets • 2 min read

Just achieved 60 tokens/sec with Qwen3.6-35B-A3B (35B MoE) on an RTX 4070 12GB.
Full 128k context + Q4_K_M + running agents daily.
Here’s the complete step-by-step from scratch:

1. Build llama.cpp with CUDA

```
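cd ~/llama.cpp
git pull
# Start from a clean build directory
rm -rf build
# Configure with CUDA enabled (GGML_CUDA replaces the older LLAMA_CUDA flag)
cmake -B build -DGGML_CUDA=ON
# Compile in Release mode using all CPU cores
cmake --build build --config Release -j$(nproc)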
```
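Quick sanity check after the build, as a rough sketch: the `check_llama_build` helper below is hypothetical (not from the thread) and assumes the binaries land in `build/bin`, which is where a CMake build puts them. Pass a different directory if your layout differs.

```shell
# Hypothetical helper (not from the thread): confirm the build produced llama-server.
# Assumption: binaries are in build/bin (CMake layout); pass another directory as $1 otherwise.
check_llama_build() {
  bin_dir="${1:-$HOME/llama.cpp/build/bin}"
  if [ -x "$bin_dir/llama-server" ]; then
    # Binary exists and is executable
    echo "ok: $bin_dir/llama-server"
  else
    # Report the missing binary on stderr and signal failure
    echo "missing: $bin_dir/llama-server" >&2
    return 1
  fi
}
```

Run `check_llama_build` with no arguments to check the default location, or `check_llama_build /path/to/bin` for a custom one. A non-zero exit status means the binary was not found, so the build likely failed or went somewhere else.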
