https://t.co/mcuQvV7YOC
proud father of 16 A100s & 16 H100s
flirting with LLMs, tensor core maximalist
x @GoogleDeepMind @Microsoft
Apr 16, 2023 • 7 tweets • 3 min read
[🤖 This is BIG!] The best truly open-source ChatGPT alternative just arrived: OpenAssistant! In a user study, OpenAssistant's replies were rated on par with ChatGPT's (48.3% vs 51.7% preference)! 🤯
1/ 👇🧵
Additionally, I assume the model will suffer far less from "corporate speech", which should make it way more fun to use! Let me know how you like it once you try it, down in the comments. :))
2/
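If you want to try it locally, here's a minimal sketch of how one might load an OpenAssistant checkpoint with Hugging Face transformers. The exact checkpoint name and the <|prompter|>/<|assistant|> prompt format are assumptions on my part - check the model card of whichever release you actually use.

```python
# Minimal sketch: chatting with an OpenAssistant checkpoint via Hugging Face transformers.
# The checkpoint name and the <|prompter|>/<|assistant|> prompt format are assumptions --
# verify them against the model card of the OpenAssistant release you use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenAssistant/oasst-sft-1-pythia-12b"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # needs `accelerate`

prompt = "<|prompter|>Explain tensor cores in one paragraph.<|endoftext|><|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```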
Apr 15, 2023 • 4 tweets • 2 min read
Insightful new blog post by @chipro covers a broad set of LLM-related topics fairly concisely.
* What to do with LLMs' output issues due to the inherent ambiguity
1/ 🧵👇
of natural language (e.g. your output format might be violated, your outputs could vary without you changing the input, etc.)
* Prompt versioning (similar to how you version code, data...) and prompt optimization (CoT - chain-of-thought prompting, the self-consistency technique, etc.) - a minimal self-consistency sketch below.
2/
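For the self-consistency part, here's a minimal sketch of the idea (not from the blog post itself): sample several chain-of-thought completions at non-zero temperature and majority-vote the final answers. `call_llm` and `extract_answer` are hypothetical stand-ins for your own LLM client and answer-parsing logic.

```python
# Minimal self-consistency sketch: sample multiple CoT completions, majority-vote the answers.
from collections import Counter

def call_llm(prompt: str, temperature: float = 0.7) -> str:
    # Hypothetical stand-in -- plug in whatever model/client you actually use.
    raise NotImplementedError("plug in your own LLM call here")

def extract_answer(completion: str) -> str:
    # Assumes the prompt asks the model to end its reasoning with "Answer: <value>".
    return completion.rsplit("Answer:", 1)[-1].strip()

def self_consistent_answer(prompt: str, n_samples: int = 5) -> str:
    # Sample n completions at temperature > 0 and return the most common final answer.
    answers = [extract_answer(call_llm(prompt)) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]
```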
Apr 15, 2023 • 10 tweets • 3 min read
[IMPORTANT ❗] I feel a sense of duty to warn people about the social media posts they'll be seeing in AI over the upcoming months.
Web3/crypto/salesy vibes have come to AI big time. Huge monetary incentives are at play, so many of those folks have flocked to the space...
1/ 🧵👇
(not judging, it's just a fact - and as always, there are exceptions!).
I'm a techno-optimist, but there is hype, and then there is hyping so hard that you're basically lying and spreading misinformation.
A concrete example:
People saying AutoGPT, a project that started...
2/
Mar 14, 2023 • 16 tweets • 5 min read
BIG LIFE ANNOUNCEMENT: I'm leaving @DeepMind to start my own company. I'm 28 now. This is the start of a new life chapter.
I'm both happy and sad.
❤️ I'm happy because I've been planning on starting my own company ever since I graduated from college back in 2017.
1/ (MEGA 🧵)
In the long run, I always felt that would be the best way to maximize my positive impact on the world.
My plan was to gain some real-world experience in the top tech companies before I go on to start my own thing.
That moment has finally come.
2/
Dec 28, 2022 • 25 tweets • 10 min read
[🤖 Build time! 🧠] I'm so excited to announce my new project: Andrew Huberman podcast transcripts! 🎉🥳
[16:55 - 58:00] They introduced a prototype of their humanoid robot - Optimus. Only a concept last year - and now a reality. The progress was incredibly fast!
1/
Throughout the presentation, they stressed that there are so many parallels between building a humanoid robot and building a self-driving car. That's why the progress was so fast - they could reuse the supply chain, the training infra, etc.
2/
Sep 1, 2022 • 4 tweets • 3 min read
If you want to understand how @StableDiffusion works behind the scenes, I just made a deep-dive video walking you through the codebase and papers step by step.
1. First-stage autoencoder training (with KL regularization)
2. Latent Diffusion Model training (UNet + conditioning model) - a toy sketch of both stages below
2/
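To make the two stages concrete, here's a toy sketch of the recipe on random data - not the actual Stable Diffusion code. The architectures, noise schedule, and hyperparameters are purely illustrative, and the conditioning model is omitted.

```python
# Toy sketch of the two-stage latent diffusion recipe, on random data so it runs anywhere.
# Stage 1: train an autoencoder with a reconstruction loss + KL regularization on the latents.
# Stage 2: freeze it and train a denoiser (a tiny conv net standing in for the UNet) to
# predict the noise added to the latents. Conditioning is omitted; everything is illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyAutoencoder(nn.Module):
    def __init__(self, latent_ch=4):
        super().__init__()
        self.enc = nn.Conv2d(3, 2 * latent_ch, 4, stride=4)   # outputs mean and logvar
        self.dec = nn.ConvTranspose2d(latent_ch, 3, 4, stride=4)

    def encode(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=1)
        return mu, logvar

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        return self.dec(z), mu, logvar

ae = ToyAutoencoder()
denoiser = nn.Sequential(nn.Conv2d(4, 64, 3, padding=1), nn.SiLU(), nn.Conv2d(64, 4, 3, padding=1))
opt_ae = torch.optim.Adam(ae.parameters(), lr=1e-3)
opt_dn = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

images = torch.rand(8, 3, 64, 64)  # stand-in for a real image dataset

# Stage 1: reconstruction + (small) KL term.
for _ in range(10):
    recon, mu, logvar = ae(images)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    loss = F.mse_loss(recon, images) + 1e-4 * kl
    opt_ae.zero_grad()
    loss.backward()
    opt_ae.step()

# Stage 2: freeze the autoencoder, noise the latents, train the denoiser to predict the noise.
for p in ae.parameters():
    p.requires_grad_(False)
for _ in range(10):
    with torch.no_grad():
        z, _ = ae.encode(images)                    # use the latent mean as the latent code
    noise = torch.randn_like(z)
    alpha = torch.rand(z.shape[0], 1, 1, 1)         # crude stand-in for a proper noise schedule
    noisy_z = alpha.sqrt() * z + (1 - alpha).sqrt() * noise
    loss = F.mse_loss(denoiser(noisy_z), noise)     # epsilon-prediction objective
    opt_dn.zero_grad()
    loss.backward()
    opt_dn.step()
```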
Aug 30, 2022 • 4 tweets • 2 min read
[💥 Open-sourcing Stable Diffusion scripts 💥] Folks, if you missed this one: I open-sourced a script that should make it super easy to get started playing with Stable Diffusion!
1/
It supports generating a diverse set of images and interpolating in the latent space, thus creating (mostly) smooth transitions in image space (see the sketch below)!
The image you see above was generated using the prompt:
"a painting of an ai robot having an epiphany moment" 🤖🤖🤖
2/
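The latent-space interpolation boils down to something like the sketch below: spherically interpolate (slerp) between two initial noise latents and decode each intermediate latent with your sampler of choice. `generate_image_from_latent` is a hypothetical stand-in for your actual sampling call, and the latent shape is just an example.

```python
# Minimal sketch of latent-space interpolation between two starting noise tensors.
import torch

def slerp(t: float, z0: torch.Tensor, z1: torch.Tensor) -> torch.Tensor:
    """Spherical linear interpolation between two latent tensors."""
    z0_flat, z1_flat = z0.flatten(), z1.flatten()
    omega = torch.arccos(torch.clamp(
        torch.dot(z0_flat / z0_flat.norm(), z1_flat / z1_flat.norm()), -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < 1e-6:                      # nearly parallel -> fall back to plain lerp
        return (1.0 - t) * z0 + t * z1
    return (torch.sin((1.0 - t) * omega) / so) * z0 + (torch.sin(t * omega) / so) * z1

z_start = torch.randn(1, 4, 64, 64)          # initial noise for image A (example shape)
z_end = torch.randn(1, 4, 64, 64)            # initial noise for image B
frames = [slerp(t, z_start, z_end) for t in torch.linspace(0, 1, steps=30).tolist()]
# for z in frames: generate_image_from_latent(z, prompt="a painting of an ai robot ...")
```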
Aug 29, 2022 • 5 tweets • 4 min read
[🤯 Stable Diffusion 💥] If you wanted to get started with Stable Diffusion this video is for you!
1/ 👇🧵
This is what perplexity vs wall-clock time looks like when training LLMs. 😅
You can almost taste that suffering
2/
Aug 3, 2022 • 5 tweets • 2 min read
In 2020 I started logging my ML journey - the one that eventually led to me landing a job at DeepMind - and I'm so happy I did!
For multiple reasons:
* I forced myself to distill everything I've learned, and that compression/reflection solidified my knowledge
1/ 👇🧶
* I feel I helped others on a similar path, as well as my future self (the logs are fairly meta - more of a general learning blueprint)!
* It's a nice historical document and a public artifact that I am proud of.
2/
Aug 1, 2022 • 6 tweets • 2 min read
[learning machine learning 🧠] Don't fall into the same trap as many others - trying to overengineer your curriculum when you're just getting started (and later as well).
You'll just end up with decision-making paralysis and eventually give up...
1/ 👇🧶
- which is the only bad outcome (unless it comes from a place of deep self-awareness).
You ask yourself questions like:
What's the best ML course out there? Should I do X, Y, or Z? That reputable guy on Reddit said Y, @ylecun said Z, and my professor said W.
2/
Jul 31, 2022 • 12 tweets • 3 min read
I get asked a lot about what it takes to land a job at DeepMind or any other world-class AI industry lab.
For those of you who aren't aware, I wrote a detailed blog post on that topic and shared my personal journey here: link.medium.com/dV0H7fay6rb
If I could summarize..
1/ 🧶
..my tips, the main ones would be:
1) Have a lot of tenacity - it takes a lot of hard work, patience, and consistency. The good thing is: this can be learned/practiced! For me personally, I built this part of my personality through sports (calisthenics, running, martial arts..
2/
Jul 30, 2022 • 8 tweets • 2 min read
I wish I had learned how to learn while I was still a kid. For some reason they don't teach us this in school, and everyone is left to figure it out on their own - which is sad, as most people never take the time to learn it.
I strongly recommend you read it. Taking a step back from "actual learning" to boost your learning efficiency is time well spent.
IMHO, things that will benefit everyone should be...
2/
Jul 29, 2022 • 7 tweets • 3 min read
If you truly want to become proficient with machine learning (I really don't like the word "expert"), try to get out of the "going through the newest courses and books" phase as soon as possible.
Too many people keep on reading the newest books that come out...
1/
(and the same for courses), thinking they are now up to date with the ML world, whereas in reality they are "light years" behind (things move fast around here 😇).
Try to get into the paper reading and replication phase as fast as you can without skipping the necessary steps.
2/
Jun 8, 2022 • 4 tweets • 3 min read
Very much enjoyed doing this podcast with @LeiserNeil from AI Stories, discussing what makes one a good ML engineer, imposter syndrome, my career path and challenges I faced, my path to @DeepMind, how to learn ML, and much more!
Watch here:
1/ 🧵
It's my first podcast ever, and it was long overdue! :))
Neil reached out back in January this year, but since I had so much going on (moving to London, starting a new job + a bunch of personal things), I had to postpone it - but now it's up!
2/
Jun 8, 2022 • 10 tweets • 3 min read
[🧠 Getting started with biology 🧠] I just finished the best MOOC I've done in my life: "Introduction to Biology - The Secret of Life", offered on edX by the famous @eric_lander (Human Genome Project guy).
I've collected a ton of notes over the last month or so.
1/ 🧵👇
My idea is to start sharing my learnings and notes over the next weeks - do let me know down in the comments whether you'd find that useful!
A bit about the course:
* You'll learn the fundamentals of biochemistry, genetics/genomics, and more - enough to understand...
2/
Jun 7, 2022 • 12 tweets • 3 min read
[🧠 Interesting read 🧠] "Can a Biologist Fix a Radio? or, What I Learned while Studying Apoptosis" paper. So why is it interesting?
The state of AI seems to be somewhere in the middle between the experimental biologist's approach, as described in this paper, and...
1/
...the classical engineering approach.
The author observes and describes the funny (and all too familiar?) boom-and-bust cycles that happen in biology when researchers try to understand complex phenomena, with the promise of discovering a miracle drug that will solve all our problems.
2/
Apr 8, 2022 • 7 tweets • 4 min read
Everyone is hyped up about @OpenAI's DALL-E 2 model atm, and most people have noticed the cryptic "signature" code in the corner of their images - but how many of you understand what it stands for?
I did some research and found the answer! 🔎🧠 I couldn't believe it.
Thread 👇🧵1/
I've decided to do some deciphering. Is it a simple color code? What are @sama et al. trying to tell us?
I opened up my Photoshop and selected the color picker tool.
I first extracted 5 RGB tuples (R, G, B) from the 5 colors of the signature.
2/
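If you'd rather skip Photoshop, the same extraction can be done programmatically - a minimal sketch with Pillow, where the file name and pixel coordinates are made up for illustration (you'd inspect your own DALL-E 2 output to find where the signature sits):

```python
# Sample one pixel from each of the 5 colored squares of the signature strip.
# The file name and pixel coordinates are illustrative placeholders.
from PIL import Image

img = Image.open("dalle2_output.png").convert("RGB")     # hypothetical file name
signature_pixels = [(950, 1010), (960, 1010), (970, 1010), (980, 1010), (990, 1010)]
rgb_tuples = [img.getpixel(xy) for xy in signature_pixels]
print(rgb_tuples)   # five (R, G, B) tuples, one per color of the signature
```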
Jan 9, 2022 • 7 tweets • 3 min read
[🧠 Paper Summary 📚] An interesting paper was recently published on arXiv: "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" (although it originally appeared in May 2021).
The main idea is this: 1/ 🧵
If you have an overparametrized neural network (more params than the # of data points in your dataset) and you train it way past the point where it has memorized the training data (as indicated by a low training loss and a high val loss), all of a sudden the network will...
2/
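A toy sketch of the kind of setup the paper studies: modular addition, an overparametrized network, strong weight decay, and training far past the point of memorization while logging validation accuracy. All hyperparameters here are illustrative, not the paper's.

```python
# Toy grokking-style setup: overparametrized net on a small algorithmic dataset (modular addition),
# trained with heavy weight decay far beyond the point where it memorizes the training split.
import torch
import torch.nn as nn
import torch.nn.functional as F

P = 97  # modulus; the task is predicting (a + b) mod P
pairs = torch.tensor([(a, b) for a in range(P) for b in range(P)])
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
split = len(pairs) // 2                       # train on only half the addition table
train_idx, val_idx = perm[:split], perm[split:]

embed = nn.Embedding(P, 128)
mlp = nn.Sequential(nn.Linear(256, 512), nn.ReLU(), nn.Linear(512, P))
params = list(embed.parameters()) + list(mlp.parameters())
opt = torch.optim.AdamW(params, lr=1e-3, weight_decay=1.0)   # strong weight decay

def forward(idx):
    x = embed(pairs[idx]).flatten(1)          # concatenate the two operand embeddings
    return mlp(x)

for step in range(100_000):                   # keep going long after train loss hits ~0
    opt.zero_grad()
    loss = F.cross_entropy(forward(train_idx), labels[train_idx])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            val_acc = (forward(val_idx).argmax(-1) == labels[val_idx]).float().mean()
        print(f"step {step}: train loss {loss.item():.4f}, val acc {val_acc.item():.3f}")
```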