Inspired by the amazing work of @HvnsLstAngel I've been experimenting with a "color-quantized VQGAN"
Essentially, I introduced a codebook of possible colors and applied quantization in RGB space.
It's always fascinating how removing entropy can make samples more interesting...
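For readers wondering what "quantization in RGB space" looks like in practice, here is a minimal sketch: every pixel is snapped to its nearest entry in a small color codebook. The palette, the function names and the toy image are all hypothetical, and the thread doesn't say how the step is wired into the VQGAN pipeline, so this only illustrates the nearest-color lookup itself.

```python
# Hypothetical 4-color codebook (RGB, 0-255). The thread uses Matplotlib
# colormaps as codebooks; these values are made up for illustration.
PALETTE = [(0, 0, 0), (255, 255, 255), (220, 40, 40), (40, 80, 200)]


def nearest_color(pixel, palette=PALETTE):
    """Return the palette entry with the smallest squared RGB distance to `pixel`."""
    return min(palette, key=lambda c: sum((p - q) ** 2 for p, q in zip(pixel, c)))


def quantize(image, palette=PALETTE):
    """Snap every pixel of `image` (rows of RGB tuples) to its nearest palette color."""
    return [[nearest_color(px, palette) for px in row] for row in image]


# Toy 2x2 "image" standing in for a generated sample.
image = [
    [(10, 5, 0), (250, 240, 245)],
    [(200, 60, 50), (30, 90, 180)],
]
quantized = quantize(image)
```

On real images you would vectorize this with an array library rather than loop over pixels, but the logic stays the same: fewer allowed colors means less entropy in the output.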
"Inception"
"The ancient temple of time"
"Daydreaming"
I'm using some of the default colormaps from Matplotlib here, any pointers to more esthetically pleasing colormaps would def be appreciated 😋🙏
"Garden of Eden, the new fragrance"
I continued exploring #stablediffusion's latent space over the weekend and oh my; there's still a LOT of treasure to be discovered inside this magnificent neural universe!
Here's a quick thread with some of my personal favorites and how I found them..
The fact that all this visual splendor is compressed in just 4 GB of neural network weights totally blows my mind. Call it compression, call it emergence, it's just 🤯🤯
Getting bored by a StyleGAN model after looking at samples for 20 minutes seems like a very distant past now..
Reminiscent of cut-up poetry, one cool trick I implemented is to:
1. Start with a list of great, proven prompts
2. Chunk the prompts into word groups of ~2-5 words
3. Randomly recombine multiple word groups into new 'pseudo-prompts'
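The three steps above can be sketched in a few lines of Python. This is not the author's code: the function names, the comma joiner and the number of chunks per pseudo-prompt are assumptions, and the prompt list just reuses captions from this thread as placeholders.

```python
import random


def chunk_prompt(prompt, rng, min_words=2, max_words=5):
    """Split a prompt into consecutive word groups of min_words..max_words words.

    The final group may be shorter if the prompt runs out of words.
    """
    words = prompt.split()
    chunks = []
    i = 0
    while i < len(words):
        size = rng.randint(min_words, max_words)
        chunks.append(" ".join(words[i:i + size]))
        i += size
    return chunks


def pseudo_prompt(prompts, rng, n_chunks=3):
    """Pool the word groups of all prompts and join a random sample of them."""
    pool = [chunk for prompt in prompts for chunk in chunk_prompt(prompt, rng)]
    return ", ".join(rng.sample(pool, min(n_chunks, len(pool))))


# Placeholder prompts (captions from this thread, not the author's actual list).
prompts = [
    "The ancient temple of time",
    "Garden of Eden, the new fragrance",
    "a dystopian city",
    "Just a liquid reality",
]

rng = random.Random(0)  # seeded so runs are repeatable
print(pseudo_prompt(prompts, rng))
```

Each call produces a different recombination, so generating a few hundred candidates and rendering them is a cheap way to explore the space around prompts that already work.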
Ok, so first of all, #stablediffusion did not come with code to make videos, so I came up with a way to interpolate between encoded prompt vectors (no worries if you don't know what that means) and thereby create video sequences from prompt sequences (1/n)
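If "interpolating between encoded prompt vectors" sounds abstract, here is a rough sketch of the idea. It assumes plain linear interpolation and uses tiny made-up vectors in place of real text-encoder outputs; the thread doesn't say which interpolation scheme was actually used (spherical interpolation is another common choice), and all names are hypothetical.

```python
def lerp(a, b, t):
    """Linearly interpolate between vectors a and b, with t in [0, 1]."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def interpolate_sequence(embeddings, steps_per_pair):
    """Return one vector per video frame, walking through consecutive prompt pairs."""
    frames = []
    for a, b in zip(embeddings, embeddings[1:]):
        for i in range(steps_per_pair):
            frames.append(lerp(a, b, i / steps_per_pair))
    frames.append(list(embeddings[-1]))  # finish exactly on the last prompt
    return frames


# Toy 2-D stand-ins for three encoded prompts (real embeddings are far larger).
embeddings = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
frames = interpolate_sequence(embeddings, steps_per_pair=4)
print(len(frames))
```

In a full pipeline, each vector in `frames` would be handed to the diffusion model as its conditioning to render one frame of the video.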
Next, I had to come up with a visual narrative that would work well with the style of the Diffusion interpolations. You can't just tell any story here: like with any medium, you have to work within the constraints of the technology. (2/n)
Once I settled on the "evolution" narrative, I wrote about a thousand different prompts, containing many variations on the narrative sequence I wanted. I then rendered all the corresponding stills with multiple seeds over roughly two nights of GPU time. (3/n)
"Voyage through Time"
is my first art piece using #stablediffusion and I am blown away by the possibilities...
We're crossing a threshold where generative AI is no longer just about novel aesthetics, but evolving into an amazing tool to build powerful, human-centered narratives
This video was created using 36 consecutive phrases that define the visual narrative.
To find the best possible sequence, I tried over a thousand different prompts and seeds and applied many "prompt engineering" tricks in my code, to figure out what works and what doesn't
The way this model "interpolates" between the meaning of two sentences (in semantic rather than visual latent space) is a huge gamechanger for storytelling, and this is only just the beginning of a MASSIVE revolution in digital content creation powered by generative AI..
I discovered a bug in my own Diffusion + CLIP pipeline and suddenly the samples are unreal.. 🤯
Here's "Just a liquid reality..." #AIart #notdalle2 #Diffusion #clip
This is a "3D-diffusion" video created using a combination of four different AI models🤯
Welcome to the metaverse! 🌌😎
There's such incredible potential here that I want to explain how I made this, so here's a thread! (1/n)
The two main models that draw the pixels are a diffusion model and @OpenAI's CLIP model, which guides the diffusion with a language prompt.
This idea was introduced by @advadnoun and later refined by many other creatives. My talk at @Kikk_Festival further explains this:
The diffusion model (I integrated code from @RiversHaveWings and @Somnai_dreams for this) generates images by iteratively denoising noisy images. Every time you run this from different noise, you get a different image, guided by the language prompt:
Finally playing around with CLIP + diffusion models.
12 GPU hours in I gotta say I'm pretty impressed with the difference in esthetics compared to VQGAN👌
Big thanks to @RiversHaveWings & @Somnai_dreams for providing great starting code!
"a dystopian city"
"The real problem of humanity is that we have Paleolithic emotions, medieval institutions and godlike technology"