Tanishq Mathew Abraham, Ph.D.
May 31, 2022 13 tweets 9 min read
Have you seen #dalle2 and #Imagen and wondered how they work?

Both models utilize diffusion models, a new class of generative models that have overtaken GANs in terms of visual quality.

Here are 10 resources to help you learn about diffusion models ⬇ ⬇ ⬇
1. "What are Diffusion Models?" by @ari_seff
Link →

This 3blue1brown-esque YouTube video is a great introduction to diffusion models!
2. "Introduction to Diffusion Models for Machine Learning" by @r_o_connor
Link → assemblyai.com/blog/diffusion…

This article provides a great deep dive into the theoretical foundations of diffusion models.
3. "What are Diffusion Models?" by @lilianweng
Link → lilianweng.github.io/posts/2021-07-…

Admittedly much more mathematically dense and a little harder to approach, but still an awesome resource!
4. "Generative Modeling by Estimating Gradients of the Data Distribution" by @YSongStanford
Link → yang-song.github.io/blog/2021/scor…

This *awesome* blog post (+Colab notebook) is a tutorial about the general class of score-based generative models, which includes diffusion models.
5. "An introduction to Diffusion Probabilistic Models" by @dasayan05
Link → ayandas.me/blog-tut/2021/…

This is another great blog post reviewing diffusion models and related score-based generative models.
6. "Diffusion Models as a kind of VAE" by @AngusTurner9
Link → angusturner.github.io/generative_mod…

Diffusion models have connections to multiple types of generative models. The previous resources cover the connection to score-based models; this one connects diffusion models to VAEs.
7. "Diffusion models are autoencoders" by @sedielem
Link → benanne.github.io/2022/01/31/dif…

This blog post provides a great review of diffusion models while also detailing the connection between diffusion models and *denoising* autoencoders.
8. "The new contender to GANs: score matching with Langevin Sampling" by @jm_alexia
Link → ajolicoeur.wordpress.com/the-new-conten…

Back in 2020, Alexia was already excited about diffusion models and provided us with a great blog post on the topic.
9. "Diffusion-based Deep Generative Models" by @jmtomczak
Link → jmtomczak.github.io/blog/10/10_ddg…

As part of his amazing "Introduction to deep generative modeling" blog series, Dr. Jakub Tomczak provides a great intro and code examples of diffusion models.
10. "Denoising Diffusion Probabilistic Models" by @hojonathanho et al.
Link → arxiv.org/abs/2006.11239

After going through the introductory resources shared here, reading the original papers will be quite informative too!

The DDPM paper was the breakout diffusion model paper.
I have tried to include a diversity of resources that provide different perspectives on diffusion models. Hopefully they give you different ways of thinking and learning about the topic!
If you like this thread, please share! 🙏

I am also working on my own blog post about diffusion models 👀

Follow me to stay tuned!🙂 → @iScienceLuvr
