Python is the language that I have used for nearly all my work over the last few years. It is a beautiful language. It has an elegant core on which everything else is built.
But it comes with a downside: performance. It can be thousands of times slower than C.
Python programmers learn to avoid Python for performance-critical sections, instead using Python wrappers over code written in C, Fortran, Rust, etc.
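To make the gap concrete, here's a toy comparison (my own illustration, not from the thread): summing squares with a pure-Python loop versus handing the same loop to numpy's compiled C internals.

```python
import time
import numpy as np

xs = list(range(10_000_000))
arr = np.arange(10_000_000, dtype=np.int64)

t0 = time.perf_counter()
total = sum(x * x for x in xs)      # every multiply goes through the interpreter
t1 = time.perf_counter()
total_np = int((arr * arr).sum())   # the loop runs inside numpy's compiled C code
t2 = time.perf_counter()

assert total == total_np
print(f"pure Python: {t1 - t0:.2f}s   numpy: {t2 - t1:.2f}s")
```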
But this “two-language” approach has serious downsides. It's complex, hard to debug or profile, and hard to deploy.
Also, it leaves a lot of performance on the table. That's why @PyTorch, @TensorFlow, and #jax don't use Python for anything fast: they use separate compilers for Python DSLs or subsets. pytorch.org/tutorials/inte…
Mojo is "syntax sugar for MLIR". It has a small foundation which basically provides a simple way to access MLIR from a Python-like language, and then everything else is written on top of that. mlir.llvm.org
A Mojo trick is to opt in to a faster “mode” by using “fn” instead of “def”; as a result, Mojo can create optimised machine code to implement your function. Similarly, use “struct” instead of “class” to pack your attributes tightly in memory and avoid pointer chasing.
Mojo isn't finished - but what's there is already mind-blowing, and it has been created by a very small team in a very short time. This shows the benefits of using carefully architected foundations, based on @clattner_llvm's years of experience with Clang, LLVM, and Swift.
There are lots of other great approaches to getting high performance along with the benefits of an elegant programming language, including @JuliaLanguage, #cython, and @numba_jit.
These all have their place, but they're not perfect. For example, here are my thoughts on Julia.
In particular, Mojo is the first to solve deployment.
A Mojo app can be compiled into a small, standalone, fast-starting binary. This is a game changer! Think about what you could do if you could create small, fast tools quickly and easily, and distribute them in a single file.
Mojo is *far more* than a language for AI/ML applications. It’s actually a version of Python that allows us to write fast, small, easily-deployed applications that take advantage of all available cores and accelerators! modular.com/mojo
If you want to know more or have any questions, check out the full blog post, which has much more detail than this thread: fast.ai/posts/2023-05-…
First, one of the most common responses I've seen is that anyone criticising the original post clearly doesn't understand it and is ignorant of how language models work.
Aidan Gomez is an author of the Transformer paper and the CEO of Cohere. I think he understands fine.
So why haven't we seen clear explanations of why the "checking for sudden drops in the loss function and suspending training" comment is so ludicrous?
Well, the problem is that it's such a bizarre idea that it's not even wrong. It's nonsensical, which makes it hard to refute.
Sometimes it feels like NLP papers prior to 2020 don't exist...
(Bidirectional autoregressive models have been common for many years, and were for instance used in ULMFiT.)
AFAIK the first bidirectional RNN was from 1997. (Although I think it was popularised by Alex Graves's classic 2013 paper "Generating Sequences With Recurrent Neural Networks".) ieeexplore.ieee.org/document/650093
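For readers who haven't met the idea: a bidirectional RNN runs over the sequence in both directions and concatenates the two hidden states at each step. A minimal PyTorch sketch (my own illustration, with arbitrary sizes):

```python
import torch
import torch.nn as nn

# bidirectional=True runs one LSTM left-to-right and another
# right-to-left, concatenating their hidden states per timestep.
rnn = nn.LSTM(input_size=16, hidden_size=32, bidirectional=True, batch_first=True)
x = torch.randn(4, 10, 16)   # (batch, seq_len, features)
out, _ = rnn(x)
print(out.shape)             # torch.Size([4, 10, 64]) == 2 * hidden_size
```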
@NguynTu24128917 might be worth updating your paper with some extra citations and background around this?
Our new course, "From Deep Learning Foundations to Stable Diffusion", is finally done after 8 months of work!!!
With >30 hours of video content (all free, no ads!), you'll learn how to create and train a Stable Diffusion model starting from pure Python 🧵 fast.ai/posts/part2-20…
The field was moving rapidly as we developed and taught the course, so many lessons include a walk-through of a paper that had just been released.
We also implement key papers that aren't part of Stable Diffusion itself, such as Karras et al. (2022) arxiv.org/abs/2206.00364
I wouldn't have been able to keep up with all this research without the fantastic help of folks from @StabilityAI, @huggingface, and the generative AI community. @iScienceLuvr and @johnowhitaker even joined me to teach some lessons together, which was a blast!
There are a lot of folks under the misunderstanding that it's now possible to run a 30B-param LLM in <6GB of RAM, based on this GitHub discussion.
This is not the case. Understanding why gives us a chance to learn a lot of interesting stuff! 🧵 github.com/ggerganov/llam…
The background is that the amazing @JustineTunney wrote this really cool commit for @ggerganov's llama.cpp, which modifies how llama models are loaded into memory to use mmap github.com/ggerganov/llam…
Prior to this, llama.cpp (like most deep learning frameworks) loaded the weights of a neural network by reading the weights file and copying its contents into RAM. This is wasteful, since a lot of bytes have to move around before you can even use the model.
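llama.cpp is C++, but the same idea is easy to sketch in Python (weights.bin is a hypothetical file of float16 weights, not an actual llama.cpp artifact):

```python
import mmap
import numpy as np

# The old approach: read() copies every byte out of the page cache
# into a fresh buffer before the weights can be used.
with open("weights.bin", "rb") as f:
    weights_copy = np.frombuffer(f.read(), dtype=np.float16)

# The mmap approach: map the file into the process's address space;
# the OS faults pages in lazily, with no up-front copy, and the same
# physical pages can be shared between processes. (The file and map
# must stay open for as long as the array view is in use.)
f = open("weights.bin", "rb")
mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
weights_view = np.frombuffer(mm, dtype=np.float16)  # zero-copy view
```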
Intriguing new study from the amazing Adriaan Bax and team suggests that most covid deaths resulted from (preventable) snoring droplets rather than (unpreventable) microaspiration. This could be a game changer.
Infection of the lung with SARS-CoV-2 is a two-step process: first the nose and throat, then the lungs. The postulated, but physically implausible, mechanism for step 2 involves "microaspiration".
Microaspiration during sleep is the accepted "hand-waving" mechanism for the transfer of microbes from the oral cavity into the lung.
After just 2 weeks of the new @fastdotai course, our students are already making research advances in Stable Diffusion.
@sebderhy developed a novel yet simple modification to classifier-free guidance that gives better results (previous approach on left, new approach on right)
@fastdotai @sebderhy I think in this case there's room to improve the results even further. The basic idea being tackled is that the "old way" of doing guidance actually increases the scale of the update (especially if the difference between the conditional and unconditional embeddings is large).
So the trick is to add the guidance without changing the scale.
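To make that concrete, here's a minimal numpy sketch of the idea. Note this is my own illustration of norm-preserving guidance, not necessarily @sebderhy's exact modification; cond and uncond stand for the conditional and unconditional model predictions.

```python
import numpy as np

def cfg_standard(cond, uncond, scale=7.5):
    # Classic classifier-free guidance. When (cond - uncond) is large,
    # this update also inflates the magnitude of the prediction.
    return uncond + scale * (cond - uncond)

def cfg_norm_preserving(cond, uncond, scale=7.5, eps=1e-8):
    # One way to "add the guidance without changing the scale":
    # apply the guided update, then rescale the result so its overall
    # norm matches the conditional prediction's norm.
    guided = uncond + scale * (cond - uncond)
    return guided * np.linalg.norm(cond) / (np.linalg.norm(guided) + eps)
```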