I discovered a bug in my own Diffusion + CLIP pipeline and suddenly the samples are unreal... 🤯
Here's
"Just a liquid reality..." #AIart#notdalle2#Diffusion#clip
"The magnificent portal of mother Gaia"
"Framing reality"
"Gathering at the great elder sphere"
"Why such a rush? It's all twisting and bending anyway"
"My hair is a living creature"
Caveat: all these pieces are the result of months of coding and parameter tuning, careful selection of initialization images, prompt engineering, and cherry-picking.
#dalle2 is incredible at compositionality and realism, but I haven't seen it do this yet.
This is a "3D-diffusion" video created using a combination of four different AI models 🤯
Welcome to the metaverse!
There's such incredible potential here that I want to explain how I made this, so here's a thread! (1/n)
The two main models that draw the pixels are a diffusion model and @OpenAI's CLIP model, which guides the generation with a language prompt.
This idea was introduced by @advadnoun and later refined by many other creatives. My talk at @Kikk_Festival further explains this:
The diffusion model (I integrated code from @RiversHaveWings and @Somnai_dreams for this) generates images by iteratively denoising noisy pixel images. Every time you run it from different noise, you get a different image, guided by the language prompt:
Finally playing around with CLIP + diffusion models.
12 GPU hours in, I gotta say I'm pretty impressed with the difference in aesthetics compared to VQGAN.
Big thanks to @RiversHaveWings & @Somnai_dreams for providing great starting code!
"a dystopian city"
"The real problem of humanity is that we have Paleolithic emotions, medieval institutions and godlike technology"
TLDR: 1. Replaces the CNN encoder and decoder with a vision transformer ("ViT-VQGAN"), leading to significantly better speed-quality tradeoffs compared to CNN-VQGAN.
2. Vanilla VQ-VAE often learns rarely used / "dead" codebook vectors, leading to wasted capacity. Here, they add a linear projection of the code vectors into a lower-dimensional "lookup" space. This factorization of embedding / lookup consistently improves reconstruction quality.
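My reading of that factorization, as a toy PyTorch sketch (class name and dimensions are made up, not the paper's code): features and codebook are l2-normalized in a small lookup space, and only the selected code gets projected back up for the decoder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedVectorQuantizer(nn.Module):
    """Toy VQ layer with a factorized, l2-normalized lookup space (dims are made up)."""
    def __init__(self, num_codes=8192, code_dim=256, lookup_dim=32):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, lookup_dim)  # codes live in the small space
        self.project_in = nn.Linear(code_dim, lookup_dim)    # encoder features -> lookup space
        self.project_out = nn.Linear(lookup_dim, code_dim)   # selected codes -> decoder space

    def forward(self, z):                                    # z: (batch, tokens, code_dim)
        z_low = F.normalize(self.project_in(z), dim=-1)      # l2-normalize the features
        codes = F.normalize(self.codebook.weight, dim=-1)    # l2-normalize the codebook
        idx = (z_low @ codes.t()).argmax(dim=-1)             # nearest code by cosine similarity
        z_q = F.normalize(self.codebook(idx), dim=-1)
        z_q = z_low + (z_q - z_low).detach()                 # straight-through gradient trick
        return self.project_out(z_q), idx

# usage: out, idx = FactorizedVectorQuantizer()(torch.randn(2, 196, 256))
```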
Inspired by the amazing work of @HvnsLstAngel I've been experimenting with a "color-quantized VQGAN"
Essentially, I introduce a codebook of possible colors and apply quantization in RGB space.
It's always fascinating how removing entropy can make samples more interesting...
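For the curious, a rough sketch of what that RGB-space quantization step could look like; the `color_quantize` helper and the random palette are illustrative stand-ins, not my actual code:

```python
import torch

def color_quantize(img, palette):
    """img: (3, H, W) tensor in [0, 1]; palette: (K, 3) tensor of allowed RGB colors."""
    c, h, w = img.shape
    pixels = img.permute(1, 2, 0).reshape(-1, 3)    # flatten to (H*W, 3)
    dist = torch.cdist(pixels, palette)             # distance from each pixel to each palette color
    nearest = palette[dist.argmin(dim=-1)]          # snap every pixel to its closest color
    return nearest.reshape(h, w, 3).permute(2, 0, 1)

# e.g. an 8-color palette sampled at random:
palette = torch.rand(8, 3)
quantized = color_quantize(torch.rand(3, 256, 256), palette)
```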