Christopher Potts
Stanford Professor of Linguistics and, by courtesy, of Computer Science. Member of technical staff @stanfordnlp and @StanfordAILab. Co-founder @ Bigspin AI.
Jun 1 · 17 tweets · 6 min read
We take for granted that larger models are better than smaller ones, but why is this so? Our new paper, led by Jing Huang and @EkdeepL, traces this to a data-induced competition for resources (neurons), using formal analysis, idealized tasks, and real pretraining.

[Image: Title card for a research paper. The title reads "Why Larger Models Learn More: Effects of Capacity, Interference, and Rare-Task Retention." Authors listed: Jing Huang, Daniel Wurgaft, Rachit Bansal, Laura Ruis, Naomi Saphra, David Alvarez-Melis, Andrew Lampinen, Christopher Potts, and Ekdeep Singh Lubana. A Goodfire logo appears below the names. Author affiliations: Stanford University, Kempner Institute at Harvard University, MIT, and Anthropic.]

Link to the paper: arxiv.org/abs/2605.29548