Samip
working on generalization. https://t.co/zsptJBlblS
Mar 9
1/ We released NanoGPT Slowrun 10 days ago. Already at 8x data efficiency and improving fast, so we're doubling down.

Announcing Slowrun Research and Slowrun Cluster: our open research effort to collaborate with researchers who have crazy ideas, and a serious cluster to back it.

2/ Why? Compute scales; data doesn't. Current scaling laws require both to grow proportionally, and that's a big problem. We need fundamentally new learning algorithms for the limited-data, practically-infinite-compute setting.
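To make the "both must grow proportionally" point concrete, here is a minimal sketch of Chinchilla-style compute-optimal scaling (Hoffmann et al., 2022), not anything specific to Slowrun: under the common approximations C ≈ 6·N·D FLOPs and compute-optimal D ≈ 20·N tokens, the token budget grows as the square root of compute, so more compute always demands more data.

```python
# Illustrative sketch of Chinchilla-style compute-optimal scaling.
# Assumptions (not from the thread): compute C ≈ 6 * N * D FLOPs,
# and compute-optimal training uses roughly D ≈ 20 * N tokens.

def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Return (params N, tokens D) that are compute-optimal
    under C = 6 * N * D and D = tokens_per_param * N."""
    # C = 6 * N * (20 * N) = 120 * N^2  =>  N = sqrt(C / 120)
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

for c in (1e21, 1e23, 1e25):
    n, d = chinchilla_optimal(c)
    print(f"C={c:.0e} FLOPs -> N~{n:.2e} params, D~{d:.2e} tokens")
```

Under these assumptions, every 100x increase in compute calls for about 10x more training tokens, which is exactly the bind when data is fixed: past some point, extra compute has no fresh data to spend itself on unless the learning algorithm gets more data-efficient.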

Slowrun is already surfacing new data-efficient methods, but we're aiming for at least 100x data efficiency this year, and that will take a lot more exploration.