Jeremy Howard
🇦🇺 Co-founder: @FastDotAI ; Hon Professor: @UQSchoolITEE ; :wq
May 31 11 tweets 3 min read
I teamed up with philosopher @sethlazar and AI impacts researcher @random_walker to investigate the "Statement on AI Risk" that proposes:

"Mitigating the risk of extinction from AI should be a global priority".

tl;dr: We're not convinced.🧵… One thing I haven't seen mentioned elsewhere: the original request for people to sign the letter had the subject line "Invitation to join Hinton, Bengio & Amodei".

That's pretty powerful social status signaling being used to attract signatories.
May 4 11 tweets 5 min read
There's a new programming language in town - it's Mojo! I'm more than a little excited about it. It's Python, but with none of Python's problems.

You can write code as fast as C, and deploy small standalone applications like C.

My post is below, and a 🧵… Python is the language that I have used for nearly all my work over the last few years. It is a beautiful language. It has an elegant core on which everything else is built.

But it comes with a downside: performance. It's thousands of times slower than C.
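A quick way to feel that gap without leaving Python: compare a pure-Python loop against the same reduction done by a builtin that runs in C inside CPython. (A rough sketch, not a rigorous benchmark.)

```python
import timeit

data = list(range(1_000_000))

def py_sum(xs):
    # Pure-Python loop: every iteration runs through the interpreter.
    total = 0
    for x in xs:
        total += x
    return total

# sum() performs the same loop, but in C inside the interpreter.
t_py = timeit.timeit(lambda: py_sum(data), number=5)
t_c = timeit.timeit(lambda: sum(data), number=5)
print(f"pure Python: {t_py:.3f}s, C builtin: {t_c:.3f}s")
```

Even this understates the gap, since `sum()` still pays per-element Python object overhead; a true C loop over a raw array is faster again by a large factor.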
Apr 28 16 tweets 3 min read
I'm seeing a lot of people confused about this - asking: what exactly is the problem here? That's a great question!

Let's use this as a learning opportunity and dig in. 🧵 First, I've seen that one of the most common responses is that anyone criticising the original post clearly doesn't understand it and is ignorant of how language models work.

Aidan Gomez is an author of the Transformers paper, and is CEO of Cohere. I think he understands fine.
Apr 28 4 tweets 1 min read
Sometimes it feels like NLP papers prior to 2020 don't exist...

(Bidirectional autoregressive models have been common for many years, and were for instance used in ULMFiT.) AFAIK the first bidirectional RNN was from 1997. (Although it was popularised in Alex Graves's classic 2013 paper "Generating Sequences With Recurrent Neural Networks", I think.)
Apr 5 11 tweets 6 min read
Our new course, "From Deep Learning Foundations to Stable Diffusion", is finally done after 8 months of work!!!

With >30 hours of video content (all free, no ads!), you'll learn how to create and train a Stable Diffusion model starting from pure Python 🧵… The field was moving rapidly as we built and taught the course, so many lessons include a walk-through of a paper that had just been released.

We also implement key papers that aren't in Stable Diffusion, such as Karras et al. (2022).
Apr 3 25 tweets 7 min read
There's a lot of folks under the misunderstanding that it's now possible to run a 30B param LLM in <6GB, based on this GitHub discussion.

This is not the case. Understanding why gives us a chance to learn a lot of interesting stuff! 🧵… The background is that the amazing @JustineTunney wrote this really cool commit for @ggerganov's llama.cpp, which modifies how llama models are loaded into memory to use mmap…
Nov 21, 2022 8 tweets 3 min read
Intriguing new study from the amazing Adriaan Bax and team suggests that most covid deaths resulted from (preventable) snoring droplets rather than (unpreventable) microaspiration. This could be a game changer.

No time for the paper? Then read this 🧵!… Infection of the lung with SARS-CoV-2 is a two-step process: first the nose / throat, then the lungs. The postulated, but physically implausible, mechanism for step 2 is “microaspiration”
Oct 24, 2022 7 tweets 4 min read
After just 2 weeks of the new @fastdotai course, our students are already making research advances in Stable Diffusion.

@sebderhy developed a novel yet simple modification to classifier-free guidance that gives better results (previous approach on left, new approach on right). I think in this case there's room to improve the results even further. The basic idea being tackled is that the "old way" of doing guidance actually increased the scale of the update (especially if the difference between conditional and unconditional embeddings is large)
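For context (the exact modification isn't spelled out in the tweet, so treat this as a hypothetical sketch): standard classifier-free guidance extrapolates from the unconditional toward the conditional prediction, and when the two differ a lot the guided prediction's magnitude blows up. One simple normalisation-style remedy is to rescale the guided prediction back to the conditional prediction's norm:

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cfg(uncond, cond, g):
    # Standard classifier-free guidance update: extrapolate past cond.
    return [u + g * (c - u) for u, c in zip(uncond, cond)]

def cfg_rescaled(uncond, cond, g):
    # Hypothetical variant: same direction as standard CFG, but rescaled
    # so the guided prediction keeps the conditional prediction's norm.
    guided = cfg(uncond, cond, g)
    scale = norm(cond) / norm(guided)
    return [x * scale for x in guided]

uncond, cond = [0.0, 1.0], [1.0, 0.0]
print(norm(cfg(uncond, cond, 7.5)))           # much larger than norm(cond)
print(norm(cfg_rescaled(uncond, cond, 7.5)))  # back to ~norm(cond)
```

With a guidance scale of 7.5 and very different embeddings, the plain update is nearly 10x the magnitude of the conditional prediction; the rescaled variant keeps the direction while fixing the scale.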
Oct 20, 2022 8 tweets 3 min read
I got a special surprise for you all...

We just released the first 5.5 hours of our new course "From Deep Learning Foundations to Stable Diffusion", for free!… Lesson 9 starts with a tutorial on how to use pipelines in the Diffusers library to generate images. We show some nifty tweaks like guidance scale and textual inversion.

The second half of the lesson shows the key concepts involved in Stable Diffusion.
Sep 27, 2022 9 tweets 4 min read
Awesome news: 6 chapters of our (with @GuggerSylvain) fastai book are now freely available online, beautifully formatted thanks to @quarto_pub.

The chapters I've selected will give you a great understanding of the foundations of deep learning.🧵 Chapter 1 is an introduction to the amazing world of neural networks and deep learning, and shows how to train and use a neural net for many applications across vision, language, tabular analysis, and collaborative filtering.…
Sep 16, 2022 8 tweets 4 min read
Big news: we're launching a new course in <4 weeks. "From Deep Learning Foundations to Stable Diffusion".

Bigger news: for this course, we're teaming up with @StabilityAI!

AFAIK, this is the 1st course that covers every method used in Stable Diffusion.… To take stable diffusion to the next level, you need to deeply understand what’s under the hood. Then you can craft your own loss functions, initialization methods, multi-model mixups, and more, to create totally new applications that have never been seen before
Aug 26, 2022 7 tweets 4 min read
Have you ever tried to use @ProjectJupyter with git/@github? Did it drive you crazy because nothing worked right?

🔥It drove *us* crazy, so we fixed it.🔥
1/🧵… Notebooks are a powerful tool-an ideal environment for exploring data and code, writing programs, and documenting the results.

But when collaborating with git, this goes up in smoke, because git makes notebooks unusable. Literally.
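The root cause: a .ipynb file is JSON that records outputs and execution counters alongside the code, so merely re-running a notebook rewrites the file and pollutes the git diff even when no code changed. A minimal illustration using only the stdlib:

```python
import difflib
import json

def cell(source, count):
    # A stripped-down code cell in the .ipynb JSON format.
    return {"cell_type": "code", "execution_count": count,
            "source": [source], "outputs": []}

before = json.dumps({"cells": [cell("x = 1", 1)]}, indent=1)
# Re-running the same code bumps execution_count: nothing meaningful
# changed, but the file (and hence the git diff) changes anyway.
after = json.dumps({"cells": [cell("x = 1", 2)]}, indent=1)

diff = list(difflib.unified_diff(before.splitlines(), after.splitlines()))
print("\n".join(diff))
```

Multiply this by every cell's outputs and metadata, and two people re-running the same notebook produce merge conflicts on files whose code is identical.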
Jul 28, 2022 10 tweets 8 min read
Our biggest launch in years: nbdev2, now boosted with the power of @quarto_pub!

Use @ProjectJupyter to build reliable and delightful software fast. A single notebook creates a python module, tests, @github Actions CI, @pypi/@anacondainc packages, & more… What can you create with #nbdev with @quarto_pub? Well, for starters, check out our beautiful new website.

Created with nbdev, of course!
Jul 28, 2022 10 tweets 4 min read
My keynote at JuliaCon, "Standing Out - What makes a programming language successful", is now available!

In it, I describe why I want Julia to succeed, and what is going to be needed to make this happen.
I've coded in dozens of languages over the last few decades, and I'm starting to get a sense of what features make some languages more successful than others.
Jul 23, 2022 4 tweets 2 min read
I've just published over 20 hours of tutorials and live coding showing how to: install python the right way; set up a terminal; write shell scripts; use vim; use a remote Jupyter server; use git, github, tmux, and ssh; use the python debugger; and more! 🧵… Every session includes a forum discussion and also has youtube timestamps so you can see what's covered and jump to whatever interests you. Here are the forum links:…
Jul 21, 2022 12 tweets 6 min read
After 2 years, Practical Deep Learning for Coders v5 is finally ready! 🎊

This is a from-scratch rewrite of our most popular course. It has a focus on interactive explorations, & covers @PyTorch, @huggingface, DeBERTa, ConvNeXt, @Gradio & other goodies 🧵 For details on what's in this new course, check out the launch post:…
Jul 8, 2022 14 tweets 7 min read
I led the team that studied mask efficacy in early 2020 and published our results in the Proceedings of the National Academy of Sciences.

I spent three months earlier this year revisiting this topic, and today I'm publishing my notes and links here:… An admission: these notes were meant to be the basis of another academic paper, and I gave up on it. In Jan 2022 when I finished this research, I looked around, and it seemed like no one much cared about avoiding COVID any more.

So I figured it wasn't worth spending time on.
Jul 7, 2022 5 tweets 2 min read
If you use any fastai library, then you'll want to install the latest fastcore, because it's >3x faster to import!

For instance, that means that GitHub Actions runs in an nbdev repo are now ~3x faster 😀

I used a neat but simple trick to speed it up... Things got slow if you imported `fastcore.xtras`, a module that wraps a bunch of python stdlib functionality into some convenient interfaces. It's used by `fastcore.parallel` and `fastcore.utils` (among others), so it comes up a lot.
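The tweet doesn't name the trick, but one classic way to make a package several times faster to import (offered here only as a plausible candidate, not as fastcore's confirmed implementation) is deferring expensive imports into the functions that need them, so importing the package itself stays cheap:

```python
import sys

def parse_plist(data: bytes):
    # Deferred import: plistlib is loaded on the first call, not when
    # this module is imported, so module import time stays low.
    import plistlib
    return plistlib.loads(data)

# Before any call, the deferred module may not be loaded at all:
print("plistlib" in sys.modules)
```

The first call pays a one-time import cost; every later call finds the module already cached in `sys.modules`, so there is no per-call penalty.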
Jun 27, 2022 15 tweets 6 min read
One of my fave chapters of "Practical Deep Learning for Coders", co-written with @GuggerSylvain, is chapter 8. I've just made the whole thing available as an executable notebook on Kaggle!

It covers a lot of critical @PyTorch & deep learning principles 🧵… The chapter looks at the "matrix completion" problem at the heart of recommendation systems -- e.g what would you guess are the missing values in this matrix showing what rating users gave movies?
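The idea in miniature (made-up factor values; in the chapter these are learned with SGD): represent each user and each movie by a small vector of latent factors, and predict a rating as their dot product. A missing cell in the matrix is filled by the same formula:

```python
# Tiny illustration: 2 latent factors per user and per movie.
# (Hypothetical numbers; real factors are learned from known ratings.)
users = {"alice": [0.9, 0.1], "bob": [0.1, 0.8]}
movies = {"skywalker": [1.0, 0.0], "casablanca": [0.0, 1.0]}

def predict(user, movie):
    # Predicted rating = dot product of the two latent-factor vectors.
    return sum(u * m for u, m in zip(users[user], movies[movie]))

print(predict("alice", "skywalker"))   # 0.9 (alice's factors align with it)
print(predict("bob", "skywalker"))     # 0.1 (bob's don't)
```

Training consists of nudging these vectors with gradient descent until the dot products match the ratings we do know; the same dot products then fill in the cells we don't.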
Jun 25, 2022 5 tweets 3 min read
Often we want to predict more than one dependent variable in a neural network. For instance in the current @kaggle "Paddy Doctor" competition there's both paddy disease and rice variety provided for each image.

It's easy w/ fastai DataBlock! Notebook &🧵:… In the pic in the previous tweet, you can see that each image is associated with two outputs: disease, and variety.

Here's all the code needed to create DataLoaders which provide that data to a model (see the notebook for details on what every line does)
Jun 19, 2022 5 tweets 4 min read
Big release of fastai today - @huggingface Accelerate is now supported for distributed training thanks to @TheZachMueller and @GuggerSylvain. That means you can now do distributed training in a notebook! 1/🧵 Here's all you need to train imagewoof with xresnet50 and mixup augmentation, on multiple GPUs. Run with `accelerate launch`…