We are thrilled to announce Imagen, a text-to-image model with unprecedented photorealism and deep language understanding. Explore imagen.research.google and Imagen!
A large rusted ship stuck in a frozen lake. Snowy mountains and beautiful sunset in the background. #imagen
A plush toy koala bear relaxing on a lounge chair and working on a laptop. The chair is beside a rose flower pot. There is a window on the wall beside the flower pot with a view of snowy mountains. #imagen
Imagen uses a large pre-trained language model (T5-XXL) as a text encoder, and a cascade of diffusion models for 1024x1024 image generation. Imagen outperforms all existing techniques on the MS-COCO benchmark by a considerable margin.
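To make the "cascade" idea concrete, here is a minimal sketch that only traces what each stage is conditioned on and the resolution it produces. The stage resolutions (64, 256, 1024) follow the Imagen paper; `encode_text`, `Stage`, and `run_cascade` are hypothetical names for illustration, and no diffusion sampling is actually performed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stage:
    name: str
    out_size: int  # side length of the square image this stage outputs


# Resolutions as reported in the Imagen paper: 64 -> 256 -> 1024.
CASCADE: List[Stage] = [
    Stage("base text-conditional diffusion", 64),
    Stage("super-resolution diffusion 1", 256),
    Stage("super-resolution diffusion 2", 1024),
]


def encode_text(prompt: str) -> List[str]:
    """Hypothetical stand-in for the T5-XXL encoder: returns tokens, not embeddings."""
    return prompt.lower().split()


def run_cascade(prompt: str) -> List[Tuple[str, Tuple[str, ...], str]]:
    """Trace (stage name, conditioning inputs, output resolution) for each stage."""
    if not encode_text(prompt):
        raise ValueError("prompt must contain at least one token")
    trace = []
    size = None
    for stage in CASCADE:
        # The first stage sees only the text; later stages also see the previous image.
        conditioning = ("text",) if size is None else ("text", f"{size}x{size} image")
        size = stage.out_size
        trace.append((stage.name, conditioning, f"{size}x{size}"))
    return trace


if __name__ == "__main__":
    for name, conditioning, resolution in run_cascade("A plush toy koala bear"):
        print(f"{name}: conditioned on {conditioning} -> {resolution}")
```

Each super-resolution stage upsamples by 4x, so the expensive text-to-image step only has to run at 64x64.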
We introduce DrawBench, a comprehensive and challenging benchmark dataset for evaluating text-to-image models. Imagen outperforms all recent techniques on DrawBench.
A small thread analysing the text rendering capability of #imagen.
Starting with a simple example: Imagen is able to reliably render text without any need for rigorous cherry-picking. For example, these are non-cherry-picked samples for writing "Text-to-Image" on a storefront.
Imagen is also capable of writing text in a variety of interesting settings. Here are some examples (1 sample picked out of 8 for each prompt).