Zachary Nado
Research engineer @googlebrain. Past: software intern @SpaceX, ugrad researcher in @tserre lab @BrownUniversity. All opinions my own.
Jan 19, 2023 10 tweets 4 min read
Excited to announce our Deep Learning Tuning Playbook, a writeup of tips & tricks we employ when designing DL experiments. We use these techniques to deploy numerous large-scale model improvements and hope formalizing them helps the community do the same! github.com/google-researc…

Many of the practices we describe may seem obvious, but we have found time and time again when talking to colleagues that they missed one or more of the insights we discuss. Also, we haven’t seen them all formalized in one place before.
May 27, 2021 10 tweets 4 min read
A thread on our latest optimizers work! We tune Nesterov/Adam to match the performance of LARS/LAMB on their more commonly used workloads. We (@jmgilmer, Chris Shallue, @_arohan_, @GeorgeEDahl) do this to provide more competitive baselines for large-batch training speed measurements.

We are **not** trying to prove that any optimizer is better than any other (more on that later). However, we believe that well-tuned baselines are very important, especially in optimization, where there are so many confounding factors.
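As a rough illustration of what "well tuned" can mean in practice, below is a minimal sketch of random search over Nesterov momentum hyperparameters using optax. The search ranges are illustrative only, and `train_and_evaluate` is a hypothetical helper standing in for a full training run; neither is taken from the paper.

```python
# Minimal sketch: random search over Nesterov SGD hyperparameters.
# `train_and_evaluate` is a hypothetical helper that trains a model with the
# given optimizer and returns validation accuracy at the target step budget.
import numpy as np
import optax

rng = np.random.default_rng(0)
best_params, best_acc = None, -np.inf

for _ in range(50):  # number of tuning trials (illustrative)
    # Sample hyperparameters on log scales, a common choice for tuning.
    lr = 10 ** rng.uniform(-3.0, 0.5)                 # peak learning rate
    momentum = 1.0 - 10 ** rng.uniform(-2.0, -0.3)    # e.g. roughly 0.5–0.99

    optimizer = optax.sgd(learning_rate=lr, momentum=momentum, nesterov=True)

    # Hypothetical helper, not part of optax or the paper's released code.
    val_acc = train_and_evaluate(optimizer)

    if val_acc > best_acc:
        best_params, best_acc = {"lr": lr, "momentum": momentum}, val_acc

print("best hyperparameters:", best_params, "val acc:", best_acc)
```

The same loop applies to Adam (or LARS/LAMB) by swapping the optimizer constructor and sampling its additional hyperparameters; the point is that each optimizer in a comparison gets an equivalent tuning budget.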