Zachary Nado
Research engineer @googlebrain. Past: software intern @SpaceX, ugrad researcher in @tserre lab @BrownUniversity. All opinions my own.
Jan 19, 2023 10 tweets 4 min read
Excited to announce our Deep Learning Tuning Playbook, a writeup of tips & tricks we employ when designing DL experiments. We use these techniques to deploy numerous large-scale model improvements and hope formalizing them helps the community do the same! github.com/google-researc…

Many of the practices we describe may seem obvious, but we have found time and time again when talking to colleagues that they missed one or more of the insights we discuss. Also, we haven’t seen them all formalized in one place before.
May 27, 2021 10 tweets 4 min read
A thread on our latest optimizers work! We tune Nesterov/Adam to match the performance of LARS/LAMB on their more commonly used workloads. We (@jmgilmer, Chris Shallue, @_arohan_, @GeorgeEDahl) do this to provide more competitive baselines for large-batch training speed measurements.

We are **not** trying to prove that any optimizer is better than any other (more on that later). However, we believe that well-tuned baselines are very important, especially in optimization, where there are so many confounding factors.
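As a rough illustration of what "well tuned" can mean in practice, below is a minimal sketch of random search over Nesterov momentum hyperparameters using optax. The search ranges are illustrative only, and `train_and_evaluate` is a hypothetical helper standing in for a full training run; neither is taken from the paper.

```python
# Minimal sketch: random search over Nesterov SGD hyperparameters.
# `train_and_evaluate` is a hypothetical helper that trains a model with the
# given optimizer and returns validation accuracy at the target step budget.
import numpy as np
import optax

rng = np.random.default_rng(0)
best_params, best_acc = None, -np.inf

for _ in range(50):  # number of tuning trials (illustrative)
    # Sample hyperparameters on log scales, a common choice for tuning.
    lr = 10 ** rng.uniform(-3.0, 0.5)                 # peak learning rate
    momentum = 1.0 - 10 ** rng.uniform(-2.0, -0.3)    # e.g. roughly 0.5–0.99

    optimizer = optax.sgd(learning_rate=lr, momentum=momentum, nesterov=True)

    # Hypothetical helper, not part of optax or the paper's released code.
    val_acc = train_and_evaluate(optimizer)

    if val_acc > best_acc:
        best_params, best_acc = {"lr": lr, "momentum": momentum}, val_acc

print("best hyperparameters:", best_params, "val acc:", best_acc)
```

The same loop applies to Adam (or LARS/LAMB) by swapping the optimizer constructor and sampling its additional hyperparameters; the point is that each optimizer in a comparison gets an equivalent tuning budget.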