CEO @Recursiveai. Former @DeepMind.
Generalization and robustness in AI.
Occasional writer, cook, generative modeller.
Jul 7, 2022 • 14 tweets • 4 min read
Research Engineers are the people training and tuning state-of-the-art models like GPT-3, DALL-E, Imagen, AlphaFold, etc.
I've spent 100s of hours coaching and mentoring junior and mid-career REs.
Here’s how you become an RE at a top-tier institution (Google, Meta, OpenAI,…) 🧵
REs need to be jacks of all trades, masters of most.
They are the glue between theory papers and the implementations of those algorithms running on advanced chips in the cloud.
Jun 16, 2022 • 16 tweets • 5 min read
The next big breakthrough in AI will come from hardware, not software.
Training giant models like PaLM already requires 1000s of chips consuming several MW, and we will probably want to keep scaling these up by several orders of magnitude. How can we do it? A 🧵
All computations done by a neural network are ultimately a series of floating point operations.
To do a floating point operation, two (or three) numbers need to be loaded from memory into a circuit that performs the calculation, and the result needs to be stored back in memory.
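To make the load/compute/store pattern concrete, here is a minimal Python sketch that counts memory traffic for a run of fused multiply-adds (the "three numbers" case: two inputs plus a running sum). The function name `naive_fma_traffic` is made up for illustration, and the numbers rest on two loud assumptions: 4-byte (fp32) values, and a worst case where nothing is reused from registers or cache, so every operand is reloaded and every result is written back.

```python
def naive_fma_traffic(n_fma: int, bytes_per_value: int = 4) -> dict:
    """Count FLOPs and bytes moved for n_fma fused multiply-adds.

    Assumption (illustrative only): no register or cache reuse, so each
    FMA loads all three operands from memory and stores its result.
    """
    loads = 3 * n_fma    # a, b, and the running accumulator
    stores = 1 * n_fma   # result written back to memory every time
    flops = 2 * n_fma    # one multiply plus one add per FMA
    bytes_moved = (loads + stores) * bytes_per_value
    return {
        "flops": flops,
        "bytes_moved": bytes_moved,
        "flops_per_byte": flops / bytes_moved,
    }


stats = naive_fma_traffic(1000)
print(stats)
```

Under these assumptions, every 2 floating point operations come with 16 bytes of memory traffic. Real chips reuse operands in registers and caches, so treat this as an upper bound on data movement, not a measurement.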