Stability AI
Nov 24 · 9 tweets · 4 min read
We are excited to announce the release of Stable Diffusion Version 2!

Stable Diffusion V1 changed the nature of open source AI & spawned hundreds of other innovations all over the world. We hope V2 also provides many new possibilities!

Link → stability.ai/blog/stable-di…
Text-to-Image

The V2.0 release includes robust text-to-image models trained using a new text encoder (OpenCLIP), developed by LAION with support from Stability AI, which greatly improves the quality of the generated images compared to earlier V1 releases.
The text-to-image models can generate images with default resolutions of both 512x512 & 768x768.

The models are trained on an aesthetic subset of LAION-5B, further filtered with LAION’s NSFW filter.

[Images: examples produced with V2.0 at 768x768 resolution]
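
To make this concrete, here is a minimal sketch of running the new text-to-image models with the Hugging Face diffusers library. The pipeline class and the "stabilityai/stable-diffusion-2" / "stabilityai/stable-diffusion-2-base" checkpoint ids reflect how the release was published on the Hub and are assumptions on top of this thread, not something it spells out.

# Minimal text-to-image sketch using Hugging Face diffusers.
# Assumes the "stabilityai/stable-diffusion-2" checkpoint (768x768 model);
# "stabilityai/stable-diffusion-2-base" is the 512x512 variant.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2",
    torch_dtype=torch.float16,  # half precision so it fits on a single consumer GPU
).to("cuda")

image = pipe(
    "a photograph of an astronaut riding a horse",
    height=768,
    width=768,  # this checkpoint's default resolution
).images[0]
image.save("astronaut.png")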
Upscaling

V2.0 also includes an Upscaler model that enhances the resolution of images by 4x. Here’s an example of our model upscaling a 128x128 generated image to 512x512. Combined with our text-to-image models, V2.0 can now generate images with resolutions of 2048x2048 or higher.
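
For readers who want to try this, a sketch of the 4x upscaler via diffusers follows; the StableDiffusionUpscalePipeline class and the "stabilityai/stable-diffusion-x4-upscaler" checkpoint id are how the model was later distributed, and the file names here are hypothetical.

# 4x upscaling sketch: a 128x128 input comes out at 512x512.
# Assumes the "stabilityai/stable-diffusion-x4-upscaler" checkpoint id.
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.float16,
).to("cuda")

low_res = Image.open("generated_128.png").convert("RGB")  # hypothetical 128x128 input
upscaled = pipe(
    prompt="a white cat",  # the upscaler is text-guided, so pass the original prompt
    image=low_res,
).images[0]
upscaled.save("upscaled_512.png")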
Depth-to-Image

We are also releasing a depth-guided Stable Diffusion model, depth2img. It infers the depth of an input image (using MiDaS, an existing depth-estimation model) and then generates new images using both the text and depth information.
depth2img can offer all sorts of new creative applications, delivering transformations that look radically different from the original but still preserve the coherence and depth of that image.
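
As a rough sketch of how depth2img can be driven from code (using diffusers; the StableDiffusionDepth2ImgPipeline class and the "stabilityai/stable-diffusion-2-depth" checkpoint id are assumptions based on how the model was later published):

# depth2img sketch: generation is conditioned on the inferred depth of an input image.
# Assumes the "stabilityai/stable-diffusion-2-depth" checkpoint id.
import torch
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

init = Image.open("photo.png").convert("RGB")  # hypothetical input image
image = pipe(
    prompt="a fantasy landscape, matte painting",
    image=init,
    strength=0.8,  # higher strength departs further from the original look
).images[0]  # depth is estimated internally, so the scene layout is preserved
image.save("depth2img.png")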
Inpainting

Finally, we also include a new text-guided inpainting model, fine-tuned on the new Stable Diffusion 2.0 base text-to-image model, which makes it super easy to switch out parts of an image intelligently and quickly.
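
A minimal inpainting sketch with diffusers, assuming the "stabilityai/stable-diffusion-2-inpainting" checkpoint id (how the model was later published) and hypothetical input files:

# Inpainting sketch: only the masked (white) region is regenerated.
# Assumes the "stabilityai/stable-diffusion-2-inpainting" checkpoint id.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("photo.png").convert("RGB")  # hypothetical input image
mask = Image.open("mask.png").convert("RGB")    # white = repaint, black = keep
result = pipe(prompt="a red leather sofa", image=image, mask_image=mask).images[0]
result.save("inpainted.png")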
We’ve worked hard to optimize the models to run on a single GPU, making them accessible to as many people as possible from the very start!

The models from this release will hopefully serve as a foundation for countless applications and enable an explosion of new creative potential.
For more details about accessing the model, please check our GitHub repository: github.com/Stability-AI/s…

These models will be available at @DreamStudioAI in the coming days. Devs can check our API Platform for docs and more info (platform.stability.ai).

More from @StabilityAI

Nov 10
Our very own @RiversHaveWings has trained a latent diffusion-based upscaler.

What does this mean and how does it work? (1/5)
The upscaler is itself also a diffusion model. It was trained on a high-resolution subset of the LAION-2B dataset. Being a 2x upscaler, it can take the usual 512x512 images obtained from Stable Diffusion and upscale them to 1024x1024. (2/5)
Like Stable Diffusion, the upscaler is a latent diffusion model: a diffusion model that operates in a compressed "latent" space, which is then "decoded" into a full-resolution image. The upscaler uses the same encoder/decoder and therefore works in the same latent space. (3/5)
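
For context, here is a sketch of chaining such a latent upscaler after a base Stable Diffusion pass. The StableDiffusionLatentUpscalePipeline class and the "stabilityai/sd-x2-latent-upscaler" checkpoint id come from how this model was later published for diffusers; the thread itself does not name them.

# Sketch: run the 2x latent upscaler on the base model's latents,
# so both models work in the same shared latent space.
# Assumes the "stabilityai/sd-x2-latent-upscaler" checkpoint id.
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionLatentUpscalePipeline

base = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")
upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
    "stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16
).to("cuda")

prompt = "a portrait of a robot, studio lighting"
# Keep the 512x512 result as latents instead of decoding it to pixels.
low_res_latents = base(prompt, output_type="latent").images

upscaled = upscaler(
    prompt=prompt,
    image=low_res_latents,
    num_inference_steps=20,
    guidance_scale=0,  # the upscaler is usually run without classifier-free guidance
).images[0]
upscaled.save("robot_1024.png")  # 2x: 512x512 -> 1024x1024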
Oct 29
We recently released two new fine-tuned decoders (trained by the amazing @rivershavewings) that improve the quality of images generated by @StableDiffusion.

Read on to see what this means and how you can try it out yourself! ↓
What does "fine-tuned decoders" even mean? Well, basically, Stable Diffusion actually is a diffusion model that operates in a compressed space, which is then "decoded" into a full-resolution image. This decoder itself is also a trained neural network.
So "fine-tuned decoder" simply means that this is a new set of decoder neural network parameters (i.e. changing the part circled in blue, diagram from @huggingface) after further training on a new dataset.
