Physics PhD student @goetheuni, focused on complex systems physics and RL
Oct 4, 2022 • 7 tweets • 3 min read
Do #RL models have scaling laws like LLMs? #AlphaZero does, and the laws imply SotA models were too small for their compute budgets.
Check out our new paper: arxiv.org/abs/2210.00849
Summary 🧵(1/7):
We train AlphaZero MLP agents on Connect Four & Pentago and find three power-law scaling relations.
Playing strength scales as a power of parameter count or of compute when not bottlenecked by the other, and the optimal NN size scales as a power of the available compute budget. (2/7)
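The three relations above can be sketched in code. This is a minimal illustration of what a power-law scaling fit looks like, not the paper's actual fits: the exponents (`alpha_n`, `alpha_c`, `alpha_opt`) and prefactors are hypothetical placeholder values, and `elo_vs_params` / `optimal_params` are illustrative names, not from the paper.

```python
import math

# Hypothetical exponents for illustration only; the paper reports its own fitted values.
ALPHA_N = 0.6    # strength vs. parameters (compute not a bottleneck)
ALPHA_C = 0.5    # strength vs. compute (size not a bottleneck)
ALPHA_OPT = 0.7  # optimal parameter count vs. compute budget

def elo_vs_params(n_params, coeff=1.0):
    """Power law: playing strength ~ coeff * N**ALPHA_N."""
    return coeff * n_params ** ALPHA_N

def elo_vs_compute(compute, coeff=1.0):
    """Power law: playing strength ~ coeff * C**ALPHA_C."""
    return coeff * compute ** ALPHA_C

def optimal_params(compute, coeff=1.0):
    """Compute-optimal network size: N* ~ coeff * C**ALPHA_OPT."""
    return coeff * compute ** ALPHA_OPT

# Defining property of a power law: a fixed ratio in the input
# gives a fixed ratio in the output, regardless of scale.
for c in (1e12, 1e15):
    ratio = optimal_params(100 * c) / optimal_params(c)
    print(f"100x compute -> {ratio:.2f}x optimal params")
```

On a log-log plot each of these relations is a straight line with slope equal to its exponent, which is how such laws are typically identified empirically.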