Divam Gupta
Nov 1, 2022, 12 tweets, 4 min read
DreamBooth is becoming popular for creating custom Stable Diffusion models from your own images.

Here is a beginner-friendly thread on how it works: 🧵
First, what does DreamBooth do?

- It takes a few images of a particular subject

- Then it teaches the model to generate more images of that subject in different styles.

For example, you give it 20 ordinary photos of yourself and get a funky painting of yourself in return.

(ft. @ylecun)
Models like Stable Diffusion already have strong priors for generating many kinds of subjects and combining them with many styles.

All you have to do is somehow add one additional subject to the model.

This is done by fine-tuning a pre-trained model on your images.
Let’s use the token ‘X’ as a unique identifier to represent our subject.

Along with the training images, you also need the class name of the subject.

So, if you are training the model on images of yourself, the class name will be “Person”, since you are a person.
Rather than using the prompt “An image of X”, it is better to use the prompt “An image of X person”.

This lets the model reuse what it already knows about a generic person when generating images of you.
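A minimal sketch of how such a prompt could be put together. The function name `instance_prompt` and its parameters are made up for illustration; they are not part of any DreamBooth codebase.

```python
from typing import Optional


def instance_prompt(identifier: str, class_name: Optional[str] = None) -> str:
    """Build the text prompt paired with your own training images.

    identifier: the unique token standing in for the subject ('X' in the thread).
    class_name: the subject's class, e.g. 'Person'. Hypothetical parameter name.
    """
    if class_name is None:
        # Prompt without the class name: the model gets no hint about what X is.
        return f"An image of {identifier}"
    # Prompt with the class name: X is tied to the generic 'person' concept.
    return f"An image of {identifier} {class_name.strip().lower()}"


print(instance_prompt("X"))            # An image of X
print(instance_prompt("X", "Person"))  # An image of X person
```

The second form is the one recommended above, since the class word points the model at what it already knows about people.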
We only have a few images of ‘X’, so we don’t want the model to forget what it knows about other things while training on ‘X’.

This is fixed by also training on a lot of images from the parent class of ‘X’.

So, in our example, the model will also be trained on many images of people in general.
These extra images are used for the prior-preservation loss, which preserves the model’s semantic knowledge of the class.

It encourages the model to keep generating diverse instances of the subject’s class.
In our example, the model will be trained on two objectives:

1) Given the prompt “A X person”, generate images of X. → using your images

2) Given the prompt “A person”, generate images of a person. → using general images of people.
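The two objectives can be sketched as one combined loss. This is a toy version under stated assumptions: predictions and targets are plain lists of floats, the error is mean squared error, and `prior_weight` is a hypothetical name for the factor that balances the two terms. In real training these values would come from the diffusion model, not hand-written lists.

```python
def mse(pred, target):
    """Mean squared error between two equal-length lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred, instance_target,
                    class_pred, class_target, prior_weight=1.0):
    """Sum of the two objectives from the thread.

    instance_*: model output and target for the "A X person" prompt (your images).
    class_*:    model output and target for the "A person" prompt (general images).
    prior_weight: hypothetical knob for how strongly the class prior is preserved.
    """
    instance_loss = mse(instance_pred, instance_target)  # objective 1
    prior_loss = mse(class_pred, class_target)           # objective 2
    return instance_loss + prior_weight * prior_loss


# Toy numbers only, chosen so the arithmetic is easy to follow.
loss = dreambooth_loss([0.0, 1.0], [0.0, 0.0],
                       [1.0, 1.0], [1.0, 3.0], prior_weight=0.5)
print(loss)  # 0.5 + 0.5 * 2.0 = 1.5
```

Setting `prior_weight` to 0 in this sketch would drop the second objective entirely, which is the situation where the model can forget what a generic person looks like.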
Now the question is, what should ‘X’ actually be?

DreamBooth uses a sequence of rare tokens in place of the subject ‘X’.

These rare tokens are very unlikely to appear in ordinary prompts, so the identifier won’t collide with words the model already associates with something else.
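To make "rare token" concrete, here is a toy sketch that picks the least frequent short tokens from a frequency table. The counts are invented, and `pick_rare_tokens` is a hypothetical helper; this shows the idea, not necessarily the exact procedure DreamBooth uses.

```python
def pick_rare_tokens(token_counts, k=1, max_len=3):
    """Return the k least frequent tokens that are at most max_len characters.

    token_counts: dict mapping token -> how often it appears in some corpus.
    The max_len filter is an illustrative assumption that keeps identifiers short.
    """
    candidates = [(count, tok) for tok, count in token_counts.items()
                  if len(tok) <= max_len]
    candidates.sort()  # lowest count first, ties broken alphabetically
    return [tok for _, tok in candidates[:k]]


# Made-up counts for illustration only.
toy_counts = {"the": 500000, "dog": 90000, "art": 40000,
              "person": 70000, "sks": 12, "zwx": 3}

print(pick_rare_tokens(toy_counts, k=2))  # ['zwx', 'sks']
```

Common words like "dog" or "art" would be poor identifiers because the model already has strong associations for them.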
To produce high-resolution images, DreamBooth fine-tunes a standard super-resolution model on the input images.

The fine-tuned model is then used to increase the resolution of the generated images.
Originally, DreamBooth was implemented with Imagen, but open-source versions are implemented with Stable Diffusion:

github.com/XavierXiao/Dre…
Here are some more examples of DreamBooth in action. (Three images in the original thread.)
