Group Leader, NTT Research at Harvard University
CBS-NTT Program in "Physics of Intelligence" at Harvard
Dec 6, 2021
Q. What does Noether’s theorem tell us about the “geometry of deep learning dynamics”?
A. We derive Noether’s Learning Dynamics and show:
“SGD + momentum + BatchNorm + weight decay” = “RMSProp” due to symmetry breaking!
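A minimal sketch of the kind of symmetry involved, assuming it is the scale invariance that normalization introduces (the tweet itself does not name the symmetry). The toy `loss`, the helper `num_grad`, and the random data are all hypothetical illustrations, not the authors' setup. The check is that rescaling the weights leaves the loss unchanged, and that the gradient therefore has no component along the weight vector.

```python
import math
import random

def loss(w, X, y):
    # Toy scale-invariant loss: pre-activations are standardized across the
    # batch (BatchNorm without its learnable affine part), so multiplying w
    # by a positive constant does not change the output.
    z = [sum(wi * xi for wi, xi in zip(w, row)) for row in X]
    mean = sum(z) / len(z)
    std = math.sqrt(sum((zi - mean) ** 2 for zi in z) / len(z))
    z_hat = [(zi - mean) / std for zi in z]
    return sum((a - b) ** 2 for a, b in zip(z_hat, y)) / len(z)

def num_grad(f, w, eps=1e-6):
    # Central-difference gradient; a stand-in for autodiff to keep this stdlib-only.
    g = []
    for i in range(len(w)):
        up = w[:i] + [w[i] + eps] + w[i + 1:]
        down = w[:i] + [w[i] - eps] + w[i + 1:]
        g.append((f(up) - f(down)) / (2 * eps))
    return g

random.seed(0)
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(64)]
y = [random.gauss(0, 1) for _ in range(64)]
w = [random.gauss(0, 1) for _ in range(5)]

f = lambda v: loss(v, X, y)
g = num_grad(f, w)

# Symmetry: f(a * w) == f(w) for a > 0.
scale_gap = abs(f([3.0 * wi for wi in w]) - f(w))
# Consequence: the gradient is orthogonal to w (no radial component).
radial = abs(sum(wi * gi for wi, gi in zip(w, g)))
print(scale_gap, radial)
```

Both printed values should be zero up to floating-point and finite-difference error. Under this assumption, the gradient alone does not push the weight norm in or out; terms such as weight decay and the finite step size do, which is one way to read "symmetry breaking" here.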