We’re happy to release Stable Diffusion, Version 2.1!
Version 2.0 delivered fantastic results with new ways of prompting, and 2.1 supports that new prompting style while also bringing back many of the old prompts!
The differences are more data, more training, and less restrictive filtering of the dataset. The dataset for v2.0 was filtered aggressively with LAION’s NSFW filter, making it a bit harder to get good results when generating people.
We listened to our users & adjusted the filters. Adult content is still stripped out, but less aggressively, which cuts down on false positives. We fine-tuned v2.0 with this updated setting, giving us a model that hopefully captures the best of both worlds!
The model also has the power to render non-standard resolutions. That helps you do all kinds of awesome new things, like working with extreme aspect ratios that give you beautiful vistas and epic widescreen imagery. #StableDiffusion2
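To make the idea concrete, here’s a minimal sketch using the Hugging Face diffusers library (our choice for illustration; the thread doesn’t prescribe a toolkit, and the prompt and resolution values below are made up):

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the v2.1 checkpoint published on the Hugging Face Hub.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

# Width and height just need to be multiples of 8, so an extreme
# aspect ratio like 1024x384 yields a panoramic, widescreen shot.
image = pipe(
    "an epic widescreen vista of snow-capped mountains at sunset",
    width=1024,
    height=384,
).images[0]
image.save("vista.png")
```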
The community noticed that “negative prompts” worked wonders with 2.0 & they work even better in 2.1! Negative prompts allow a user to tell the model what not to generate.
Negative prompts are now supported in @DreamStudioAI by appending “|<negative prompt>:-1” to the prompt.
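Outside DreamStudio, the same idea is exposed in the diffusers library as a negative_prompt argument; the sketch below is illustrative, not taken from the thread:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

# The DreamStudio equivalent would be the single prompt string:
#   "a portrait of an astronaut | blurry, low quality, deformed hands: -1"
image = pipe(
    "a portrait of an astronaut",
    negative_prompt="blurry, low quality, deformed hands",
).images[0]
```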
For more details about accessing and downloading v2.1, please check out the release notes on our GitHub: github.com/Stability-AI/s…
We know open is the future of AI and we’re committed to developing current & future versions of @StableDiffusion in the open. Expect more models and releases to come fast and furious, along with some amazing new capabilities, as generative AI gets more and more powerful in the new year.
The leading architecture magazine @Dezeen interviewed Stability AI’s Creative Director Bill Cusick @williamcusick about the uses of AI image generation in architecture. Here are the highlights:
“AI is the foundation for the future of creativity.”
AI is enabling new forms of creativity for architectural design. Cusick likens designing with AI to the experience of playing chess, in that it takes a short amount of time to learn but far longer to master.
AI has great potential as a tool for the early stages of a project. Cusick created architectural sketches using DreamStudio to illustrate the approach best articulated by Andrew Kudless, @MatsysDesign:
"It's meant to capture a vision of a project quickly."
We are excited to announce the release of Stable Diffusion Version 2!
Stable Diffusion V1 changed the nature of open source AI & spawned hundreds of other innovations all over the world. We hope V2 also provides many new possibilities!
The V2.0 release includes robust text-to-image models trained using a new text encoder (OpenCLIP) developed by LAION with support from Stability AI, which greatly improves the quality of the generated images compared to earlier V1 releases.
The text-to-image models can generate images with default resolutions of both 512x512 & 768x768.
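The two default resolutions correspond to two separate checkpoints on the Hugging Face Hub. A rough sketch of loading each with diffusers (checkpoint names as published for this release):

```python
from diffusers import StableDiffusionPipeline

# The 512x512 base model...
pipe_512 = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-base"
)

# ...and the 768x768 model.
pipe_768 = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2"
)

image = pipe_768(
    "a photo of a lighthouse on a rocky coast", height=768, width=768
).images[0]
```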
The models are trained on an aesthetic subset of LAION-5B, further filtered with LAION’s NSFW filter.
Examples of images produced using V2.0, at 768x768 image resolution:
The upscaler is itself also a diffusion model. It was trained on a high-resolution subset of the LAION-2B dataset. Being a 2x upscaler, it can take the usual 512x512 images obtained from Stable Diffusion and upscale them to 1024x1024. (2/5)
Like Stable Diffusion, the upscaler is a latent diffusion model: a diffusion model that operates in a compressed “latent” space, which is “decoded” into a full-resolution image. The upscaler uses the same encoder/decoder & therefore works in the same latent space. (3/5)
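A minimal sketch of chaining the two models with diffusers, assuming its StableDiffusionLatentUpscalePipeline and the sd-x2-latent-upscaler checkpoint on the Hugging Face Hub; because the models share a latent space, the text-to-image pipeline can hand its latents straight to the upscaler:

```python
import torch
from diffusers import (
    StableDiffusionPipeline,
    StableDiffusionLatentUpscalePipeline,
)

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-base", torch_dtype=torch.float16
).to("cuda")
upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
    "stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16
).to("cuda")

prompt = "a detailed oil painting of a harbor at dawn"

# Keep the 512x512 result in latent space; no decode/re-encode needed
# because both models work in the same latent representation.
low_res_latents = pipe(prompt, output_type="latent").images

# 2x latent upscale: 512x512 -> 1024x1024.
image = upscaler(
    prompt=prompt, image=low_res_latents, num_inference_steps=20
).images[0]
image.save("harbor_1024.png")
```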
We recently released two new fine-tuned decoders (trained by the amazing @rivershavewings) that improve the quality of images generated by @StableDiffusion.
Read on to see what this means and how you can try it out yourself! ↓
What does “fine-tuned decoder” even mean? Well, basically, Stable Diffusion is actually a diffusion model that operates in a compressed space, which is then “decoded” into a full-resolution image. This decoder is itself also a trained neural network.
So a “fine-tuned decoder” simply means a new set of decoder neural-network parameters (i.e. changing the part circled in blue, diagram from @huggingface), obtained by further training on a new dataset.
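In practice, swapping in a fine-tuned decoder just means loading a different autoencoder and plugging it into an otherwise unchanged pipeline. A sketch with diffusers, assuming the sd-vae-ft-mse checkpoint published on the Hugging Face Hub:

```python
from diffusers import AutoencoderKL, StableDiffusionPipeline

# Load the fine-tuned autoencoder (i.e. the new decoder parameters).
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

# Plug it into an otherwise unchanged Stable Diffusion pipeline;
# the diffusion model itself keeps its original weights.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", vae=vae
).to("cuda")

image = pipe("a watercolor painting of a fox in a forest").images[0]
```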