Research Scientist at Samsung - SAIT AI Lab (SAIL). PhD @uoguelph_mlrg.
Oct 26, 2021 • 12 tweets • 5 min read
Do we still need SGD/Adam to train neural networks? Based on our #NeurIPS2021 paper, we are one step closer to replacing hand-designed optimizers with a single meta-model. Our meta-model can predict parameters for almost any neural network in just one forward pass. (1/n)
For example, our meta-model can predict all ~25M parameters of a ResNet-50, and this ResNet-50 achieves ~60% accuracy on CIFAR-10 without any training. During its own training, the meta-model never observed any network close to ResNet-50. (2/n)
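To make "predicting parameters in one forward pass" concrete, here is a toy sketch in NumPy. It is not the model from the paper: the layer descriptor, the chunked output, and all names (`describe`, `predict_parameters`, `W_meta`, `EMB`, `CHUNK`) are assumptions made up for illustration, and the meta-model here is random rather than meta-trained, so the predicted weights carry no useful signal. The only point is the data flow: a description of the architecture goes in, a full set of weights comes out, and the target network runs with them without a single gradient step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target architecture: a small MLP described by its layer shapes.
# (A real meta-model would consume the network's computational graph.)
target_layers = [(32, 16), (16, 16), (16, 10)]  # (fan_in, fan_out) per layer

EMB = 8      # size of the per-layer descriptor (assumption)
CHUNK = 64   # the toy meta-model emits weights in fixed-size chunks (assumption)

# Toy meta-model parameters. In practice these would be meta-trained on many
# architectures; here they are random, so the outputs are only illustrative.
W_meta = rng.normal(scale=0.1, size=(EMB + 1, CHUNK))


def describe(fan_in, fan_out, depth):
    """Hand-made descriptor of one layer (stand-in for a learned embedding)."""
    feats = np.array([fan_in, fan_out, depth, fan_in * fan_out], dtype=float)
    feats = np.log1p(feats)
    return np.resize(feats, EMB)  # repeat the features to fill EMB entries


def predict_parameters(layers):
    """One forward pass of the toy meta-model: one weight matrix per layer."""
    predicted = []
    for depth, (fan_in, fan_out) in enumerate(layers):
        n = fan_in * fan_out
        n_chunks = -(-n // CHUNK)  # ceiling division
        desc = describe(fan_in, fan_out, depth)
        # One input row per chunk: layer descriptor plus the chunk position.
        pos = np.linspace(0.0, 1.0, n_chunks)[:, None]
        inp = np.hstack([np.tile(desc, (n_chunks, 1)), pos])
        flat = np.tanh(inp @ W_meta).ravel()[:n]
        predicted.append(flat.reshape(fan_in, fan_out) / np.sqrt(fan_in))
    return predicted


def forward(x, weights):
    """Run the target network (ReLU MLP) with the predicted weights."""
    for w in weights[:-1]:
        x = np.maximum(x @ w, 0.0)
    return x @ weights[-1]


weights = predict_parameters(target_layers)        # no SGD/Adam on the target
logits = forward(rng.normal(size=(4, 32)), weights)
n_params = sum(w.size for w in weights)
```

Changing `target_layers` changes the shapes that come out of `predict_parameters`, which is the sense in which a single meta-model can serve many architectures; whether those weights are any good depends entirely on how the meta-model was trained.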