🧨 diffusers has supported LoRA adapter training and inference for a while now. We've made multiple quality-of-life improvements to our LoRA API, so training LoRAs and performing inference with them should now be much more robust.
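For instance, loading a trained LoRA into a pipeline is a one-liner. A minimal sketch (the model id, LoRA path, and adapter name below are placeholders; swap in your own):

```python
import torch
from diffusers import DiffusionPipeline

# Load a base model (placeholder checkpoint id; use the one your LoRA targets).
pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Load a LoRA adapter on top of the base weights; naming the adapter lets it
# be toggled or combined with others later.
pipe.load_lora_weights("path/to/lora_weights", adapter_name="my_style")

image = pipe("a watercolor painting of a lighthouse at dusk").images[0]
image.save("lora_sample.png")
```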
We're excited to release an experimental version of the Flux Control fine-tuning scripts.
Flux Control from @BlackForestLabs is, by far, the strongest alternative to ControlNets while being computationally far more efficient.
The idea of Flux Control is simple yet elegant:
1. Instead of maintaining a separate auxiliary module like ControlNet or T2I Adapter, increase the number of input channels for the image latents in the pretrained Flux DiT.
2. Compute the latents of the structural input (a depth map, for example) with the VAE and concatenate them with the actual latents you start the denoising process with.
3. During training, only the original image latents are noised; the structural latents are then concatenated to them before being fed to the denoiser.
4. We start from the pretrained text-to-image Flux DiT, widen its input layer to accommodate the additional channels, and train further on a ControlNet-style dataset (see the sketch after this list)!
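Here's an illustrative sketch of that input construction in plain PyTorch. Shapes and names are simplified for clarity: the real Flux DiT operates on packed latent sequences, and the released scripts handle those details.

```python
import torch

# Illustrative latent shapes (simplified; not the actual Flux dimensions).
batch, latent_channels, h, w = 1, 16, 64, 64

image_latents = torch.randn(batch, latent_channels, h, w)    # VAE latents of the target image
control_latents = torch.randn(batch, latent_channels, h, w)  # VAE latents of the depth map

# 1) Only the image latents are noised (rectified-flow-style interpolation).
noise = torch.randn_like(image_latents)
t = torch.rand(batch).view(-1, 1, 1, 1)  # timestep in [0, 1]
noisy_image_latents = (1.0 - t) * image_latents + t * noise

# 2) The clean control latents are concatenated along the channel dimension,
#    doubling the input channels the denoiser sees.
denoiser_input = torch.cat([noisy_image_latents, control_latents], dim=1)
print(denoiser_input.shape)  # torch.Size([1, 32, 64, 64])

# 3) The pretrained DiT's input projection is widened to accept the new
#    channels; initializing the new columns to zero means training starts
#    from the original text-to-image behavior.
old_proj = torch.nn.Linear(latent_channels, 3072)  # stand-in for the pretrained projection
new_proj = torch.nn.Linear(latent_channels * 2, 3072)
with torch.no_grad():
    new_proj.weight.zero_()
    new_proj.weight[:, :latent_channels].copy_(old_proj.weight)
    new_proj.bias.copy_(old_proj.bias)
```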
So, no auxiliary models are trained here.
During inference, only a single model is invoked at each step of the iterative denoising loop, as opposed to running both the auxiliary module and the denoiser at every step, as is typical with ControlNets.
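Inference with a Flux Control checkpoint then looks like regular text-to-image generation plus a control image. A minimal sketch, assuming the depth variant from the Hub (adjust the model id, prompt, and control image to your setup):

```python
import torch
from diffusers import FluxControlPipeline
from diffusers.utils import load_image

# Depth variant of Flux Control; swap in the checkpoint you trained or use.
pipe = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Depth-dev", torch_dtype=torch.bfloat16
).to("cuda")

control_image = load_image("depth_map.png")  # precomputed depth map of the scene

image = pipe(
    prompt="a cozy wooden cabin in a snowy forest, golden hour",
    control_image=control_image,
    num_inference_steps=30,
    guidance_scale=10.0,
).images[0]
image.save("flux_control_sample.png")
```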
Custom Diffusion fine-tunes only the cross-attention layers of the UNet (specifically, the key and value projections) and can also be combined with textual inversion, which keeps learning new concepts feasible on consumer hardware.
As a result, with just 250 steps, we can get pretty good results, depending on the new subject being learned.
Since we train only a limited set of layers, WITHOUT using any adapters like LoRA, the resulting parameters total only ~300 MB.
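To get a feel for how few parameters that is, here's a rough sketch that freezes the UNet and unfreezes only the cross-attention key/value projections. This is not the actual training script (which uses dedicated attention processors), and the model id is just an example:

```python
from diffusers import UNet2DConditionModel

# Example base model; use the checkpoint you intend to fine-tune.
unet = UNet2DConditionModel.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="unet"
)
unet.requires_grad_(False)  # freeze everything ...

trainable = 0
for name, param in unet.named_parameters():
    # ... then unfreeze only the cross-attention ("attn2") K/V projections.
    if "attn2.to_k" in name or "attn2.to_v" in name:
        param.requires_grad = True
        trainable += param.numel()

print(f"trainable params: {trainable / 1e6:.1f}M "
      f"(~{trainable * 4 / 1e6:.0f} MB in fp32)")
```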