Discover and read the best of Twitter Threads about #Imagen

Most recent (20)

Exciting news: #Parti and #Imagen teamed up to create a hybrid system, with Parti creating 256x256 images which then receive Imagen super resolution to produce 1024x1024 images! See the diagram below for how it works.

See thread for more info and new images with this system!
As @savvyRL noted in this great thread on #dalle2, #imagen and #parti, Parti had a small super resolution model. This definitely reduced Parti's visual impact. As Imagen is a cascaded model, it was a natural pairing to use Imagen's SR on Parti's outputs!
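
As a rough illustration of the pairing described above, here is a minimal Python sketch of a two-stage cascade: a base generator produces a 256x256 image and a super-resolution stage enlarges it to 1024x1024. The function names `base_generate` and `super_resolve` are hypothetical, and nearest-neighbor upsampling merely stands in for a learned super-resolution model; only the two resolutions come from the thread.

```python
def base_generate(size=256):
    """Hypothetical stand-in for the base generator: a size x size grid of pixel values."""
    return [[(x + y) % 256 for x in range(size)] for y in range(size)]


def super_resolve(image, factor=4):
    """Stand-in for a super-resolution stage: nearest-neighbor upsampling.

    A learned SR model would add detail; this only repeats pixels so that
    the shapes match the 256 -> 1024 step described in the thread.
    """
    upscaled = []
    for row in image:
        # Repeat every pixel `factor` times horizontally.
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # Repeat the widened row `factor` times vertically.
        upscaled.extend(list(wide_row) for _ in range(factor))
    return upscaled


low_res = base_generate(256)
high_res = super_resolve(low_res, factor=4)
print(len(low_res), len(low_res[0]))    # 256 256
print(len(high_res), len(high_res[0]))  # 1024 1024
```

A 4x increase per side means 16x as many pixels, which is why the quality of the super-resolution stage matters so much for the final visual impact.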

This required experimentation and tweaking: big props to Imagen's @mo_norouzi, @Chitwan_Saharia and @wchan212 for working on this with Parti's @alex_y_ku and @JHYUXM.

Parti's beaver librarians (fig 19, arxiv.org/abs/2206.10789) now enjoy greater detail! (Note combined watermark.)
Read 15 tweets
That was fast. News sites are already using DALL-E (mini) to generate fake headline images. This move seems questionable to say the least @NextShark
This is paving the way for a very dangerous practice by media organizations. DALL-E mini is pretty obvious, but once models at the level of #dalle2 and #imagen are widely available, we are in trouble.
Kudos to @OpenAI for making this type of generation impossible with their model and API. That will slow things down a bit...

In the meantime, it is important to solidify journalistic standards on this, and for distribution mechanisms (like Yahoo and Google News) to enforce them.
Read 4 tweets
It's time to talk about my side project:

👉🏻 Introducing Prompt Press (prompt.press) - AI generated artwork inspired by current events.

#aiart #generativeart #aiartcommunity #AiArtwork
Over the past few months I've found myself absolutely fascinated with #AIart or #generativeart - whatever you want to call it.

You may have heard of #dalle2, #imagen or #parti. These are all amazing models (with limited access) that generate images of whatever you can describe.
"whatever you describe", the prompt, is key to navigating latent space and generating an interesting image.

This is such a skill that there's a whole field emerging called Prompt Engineering.
Read 9 tweets
Good Morning!

I tried to use text-to-image models to combine historical architecture with other locations around the world.

Here is “The Great Wall of San Francisco” by #Imagen

🧵Thread👇🏽
“The Great Wall of Stanford” generated using #Imagen

Accurate:
Let’s take this somewhere more exotic.

Here’s “The Great Wall of Bali” by #Imagen
Read 13 tweets
A quick thread on "How DALL-E 2, Imagen and Parti Architectures Differ" with breakdown into comparable modules, annotated with size 🧵
#dalle2 #imagen #parti

* figures taken from corresponding papers with slight modification
* parts used for training only are greyed out
By now we know that
- DALL-E & Imagen = diffusion; Parti = autoregressive
- Imagen & Parti use generic text encoders; DALL-E uses the CLIP encoder

But in fact, one version of Imagen also used CLIP, and one version of DALL-E also had an AR prior. So there are more connections than it seemed.
If we break each architecture down into *modules*, the similarity/comparability is even clearer.

First of all, they all have a "text encoder", but differ in types and sizes:
- DALL-E uses CLIP text encoder
- Imagen uses T5-XXL
- Parti uses a generic transformer
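
The comparison so far can be written down as a small lookup table. This is only a sketch in Python that records what the thread states (text encoder and generator family per model); the `ModelSummary` class, its field names, and the `models_using` helper are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSummary:
    text_encoder: str
    generator: str  # "diffusion" or "autoregressive"


# Entries taken from the thread; names of fields are illustrative only.
MODELS = {
    "DALL-E": ModelSummary(text_encoder="CLIP text encoder", generator="diffusion"),
    "Imagen": ModelSummary(text_encoder="T5-XXL", generator="diffusion"),
    "Parti": ModelSummary(text_encoder="generic transformer", generator="autoregressive"),
}


def models_using(generator):
    """Return the names of models whose generator family matches."""
    return sorted(name for name, m in MODELS.items() if m.generator == generator)


print(models_using("diffusion"))       # ['DALL-E', 'Imagen']
print(models_using("autoregressive"))  # ['Parti']
```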
Read 6 tweets
We are excited to share our work on our Pathways Autoregressive Text-to-Image model, Parti! #Parti achieves high-fidelity photorealistic image generation and supports content-rich synthesis involving complex compositions and world knowledge.

parti.research.google
Wombats are #Parti's spirit animal! The prompt for the above image is "A scholarly wombat wearing a vest and a bowtie. He is reading a book in a coffee shop. There is an espresso in a small glass cup on the table. dslr."

Here it is w/ “anime illustration” rather than “dslr”.
Parti is a two-stage model, like the original DALL-E and CogView models, that quantizes images using our ViT-VQGAN model (ai.googleblog.com/2022/05/vector…) and then translates language tokens to image tokens using a 20 billion parameter encoder-decoder Transformer.
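
To make the first stage more concrete, here is a toy Python sketch of vector quantization: each image patch vector is replaced by the index of its nearest codebook entry, giving a sequence of discrete image tokens. The codebook, the 2-D patch vectors, and the `quantize` helper are all invented for illustration; ViT-VQGAN learns its encoder and codebook, and this sketch does not reproduce it.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(patches, codebook):
    """Map each patch vector to the index of its nearest codebook entry."""
    tokens = []
    for patch in patches:
        distances = [squared_distance(patch, code) for code in codebook]
        tokens.append(distances.index(min(distances)))
    return tokens


# Hypothetical 4-entry codebook of 2-D vectors.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# Hypothetical patch embeddings for one tiny "image".
patches = [(0.1, 0.2), (0.9, 0.1), (0.2, 0.8), (0.95, 0.9)]

tokens = quantize(patches, codebook)
print(tokens)  # [0, 1, 2, 3]
```

In a two-stage model of this kind, the second stage would then be a sequence-to-sequence model that predicts such image token indices from the text tokens.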
Read 12 tweets
Today we're launching @clipdropapp Image Upscaler on @ProductHunt, the easiest way to upscale your images 2x or 4x with AI.

⚡️ Upscale, denoise and enhance in 1 click
🔍 Sharpen blurry edges
✨ Remove JPEG compression artifacts
🔌 Public API

→ producthunt.com/posts/clipdrop…
@clipdropapp @ProductHunt Once again, our team has used a couple of our favorite tricks to push the quality and speed of 2x and 4x image upscaling models.

It's an ongoing effort, but the result is already quite spectacular 🤓
@clipdropapp @ProductHunt We've been using it a lot lately to improve the images produced by image diffusion models such as #dalle2 #minidalle #imagen or more recently CogView2 - cc @hardmaru

Here's an example of @shashj's amazing Mughal helicopters
Read 5 tweets
Good Morning!

“Professional photograph of bears in sports gear in a triathlon in Kyoto” made using both #Imagen and #Dalle
Triathlon Bears in Kyoto
#Imagen #Dalle
The real trophies are the friends we make along the way. #Imagen #Dalle
Read 8 tweets
Tried to use #Imagen to generate collectable Japanese postage stamps about VR cats. I love these results!

“Ukiyo-e painting of a cat hacker wearing VR headsets, on a postage stamp” ❤️
These metaverse cats come in all shapes and sizes.

#Imagen
Don’t worry, these VR cats are only slightly sentient 🙃
Read 7 tweets
In the #hospital, a person experiences admission as the intermission of a play that will resume after discharge. But sometimes that intermission runs long because, after the play, the stage will change and the actors will change roles. As if a second part were beginning in their #LifeStory.
And there I met her; I had been called to assess a small #injury on her knee. But after we exchanged a few sentences, her #gaze and her #words let me see that there were more #wounds to heal. Her tears gave way to a hidden pain that came out that day against her will.
There are #tears that are hard to #share; we are ashamed to have our eyes full of them, thinking that what makes them well up does not justify them. In a hospital, it seems that crying over what appear to be small things must be done unseen.
Read 12 tweets
A small thread on analysing text rendering capability of #imagen.

Starting with a simple example, Imagen is able to reliably render text without any need for rigorous cherry-picking, e.g. these are non-cherry-picked examples for writing "Text-to-Image" on a storefront.
Imagen is also capable of writing text in a variety of interesting settings. Here are some examples (1 sample picked out of 8 for each prompt).
Some more examples of interesting settings...
Read 6 tweets
Good Morning!

Here’s an “Oriental painting of a dragon programming on a laptop in the Song dynasty” produced by #Imagen
More dragon coders from #Imagen🐲
“Oriental painting of a dragon programming on a laptop in the Song dynasty” by #Dalle
Read 6 tweets
Have you seen #dalle2 and #Imagen and wondered how they work?

Both models utilize diffusion models, a new class of generative models that have overtaken GANs in terms of visual quality.

Here are 10 resources to help you learn about diffusion models ⬇ ⬇ ⬇
1. "What are Diffusion Models?" by @ari_seff
Link →

This 3blue1brown-esque YouTube video is a great introduction to diffusion models!
2. "Introduction to Diffusion Models for Machine Learning" by @r_o_connor
Link → assemblyai.com/blog/diffusion…

This article provides a great deep dive into the theoretical foundations of Diffusion Models.
Read 13 tweets
One takeaway for me from (#dalle2, #imagen, #flamingo) is there's no one "golden algorithm" to unlock these new transfer learning capabilities. Contrastive, AR, Freezing, Priors, they all can work. You almost can't stop these models from exhibiting these new types of behavior...
...It reminds me a lot of early DL days, when people used to think you needed sparsity regularization to learn nice Gabor filters in NNs, but then it turned out that almost any model with convolution and enough natural data would learn them on its own...
...We shifted our attention to different parts of the problem, as the features of visual transfer learning with pretrained convnets were just "a given" to be nice representations regardless of the architecture and dataset (kind of crazy when you think about it)...
Read 4 tweets
We are thrilled to announce Imagen, a text-to-image model with unprecedented photorealism and deep language understanding. Explore imagen.research.google and Imagen!

A large rusted ship stuck in a frozen lake. Snowy mountains and beautiful sunset in the background. #imagen
A plush toy koala bear relaxing on a lounge chair and working on a laptop. The chair is beside a rose flower pot. There is a window on the wall beside the flower pot with a view of snowy mountains. #imagen
Imagen uses a large pre-trained language model (T5-XXL) as a text encoder, and a cascade of diffusion models for 1024x1024 image generation. Imagen outperforms all existing techniques on the MS-COCO benchmark by a considerable margin.
Read 5 tweets
With the release of #Imagen from @GoogleAI yesterday, here's a quick follow-up thread on the progress of compositionality in vision-language models.🧵 1/11
A few weeks ago DALL-E 2 was unveiled. It exhibits both very impressive success cases and clear failure cases – especially when it comes to counting, relative position, and some forms of variable binding. Why? 2/11
Under the hood, DALL-E 2 uses a frozen CLIP model to encode captions into embeddings. CLIP's contrastive training objective leads it to learn only the features of images people tend to describe online (e.g., common objects/relations and aesthetic style) 3/11
Read 11 tweets
For the next few days, our timelines are gonna be full of cutesy images made by a new Google AI called #Imagen.

What you won't see are any pictures of Imagen's ugly side. Images that would reveal its astonishing toxicity. And yet these are the real images we need to see. 🧵
How do we know about these images? Because the team behind Imagen has acknowledged this dark side in a technical report, which you can read for yourself here. Their findings and admissions are troubling, to say the least.
gweb-research-imagen.appspot.com/paper.pdf
First, the researchers did not conduct a systematic study of the system's potential for harm. But even in their limited evaluations they found that it "encodes several social biases and stereotypes."
Read 17 tweets
#PasiónPorElArte
Dogs have accompanied #man for a little over 12,000 years. Long gone are the domesticated European wolves that approached humans in search of food and bones.
Today more than 800 breeds are raised, many of them unable to survive in the #wild
#Egypt adored domestic animals. They were mummified, and their loss caused great distress in the family (their owners shaved their heads as a sign of mourning)
A sighthound that served as a royal guard was given a ceremonial burial (Giza necropolis) as a reward for its service.
Greece and #Rome respected animals that were used for hunting, herding or guarding. Some Roman houses displayed the sign "Cave canem" (beware of the dog), aware of the need to warn future and unexpected visitors.

The Renaissance would bring their consecration...
Read 28 tweets
The Comunidad Homosexual Argentina expresses its #repudiation of the dismissal of actor @ChrisSanchook from the underwear brand #Lody, for which he worked as a model.
We believe this action is a #discriminatory #act that attacks #sexualorientation as much as #freedom of #thought
The dismissal came after #Sancho appeared on the cover of Revista Caras, where he declared "I'm betting on #sexualdiversity" while expressing his ideas about conducting relationships freely in his personal life:
Read 13 tweets
Let's look together at one of the most cryptic works of medieval art.

#HilodeArte for lovers of the secrets hidden in stained-glass windows.
12th century. The milestone (or myth?) of the creation of the #Gothic was being carried out by Abbot Suger (or Sugerius): the complete renovation of the abbey of Saint-Denis according to a symbolic program full of iconographic freshness and utterly innovative techniques.
The #stainedglass window in question cannot be understood without the rest of the project: numerous figurative stained-glass windows, sumptuary arts and, above all, the architectural structure itself (because yes, friends, #iconography exists in architecture). But today we can't talk about all that.
Read 50 tweets
