Inspired by the amazing work of @HvnsLstAngel, I've been experimenting with a "color-quantized VQGAN".
Essentially, I introduce a codebook of possible colors and apply quantization in RGB space.
It's always fascinating how removing entropy can make samples more interesting...
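The core idea can be sketched in a few lines of NumPy. This is my reconstruction of the concept, not the author's actual code: a small codebook of RGB colors, with every pixel snapped to its nearest entry. In the thread the palette comes from Matplotlib colormaps; here a hand-made 8-color palette keeps the sketch dependency-free.

```python
import numpy as np

# Hypothetical color codebook (K, 3); in practice this could be sampled
# from a Matplotlib colormap instead.
palette = np.array([
    [0.0, 0.0, 0.0], [1.0, 1.0, 1.0],
    [1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0], [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0], [1.0, 0.0, 1.0],
])

def color_quantize(img, palette):
    """Replace each RGB pixel with its nearest palette color (squared L2)."""
    flat = img.reshape(-1, 3)                                        # (N, 3)
    dists = ((flat[:, None, :] - palette[None, :, :]) ** 2).sum(-1)  # (N, K)
    nearest = dists.argmin(axis=1)                                   # closest code per pixel
    return palette[nearest].reshape(img.shape)

img = np.random.default_rng(0).random((16, 16, 3))  # dummy image in [0, 1]
quantized = color_quantize(img, palette)
```

In the generative setting, this quantization step would sit inside the sampling loop so the optimization only ever "sees" images drawn from the restricted palette.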
"Inception"
"The ancient temple of time"
"Daydreaming"
I'm using some of the default colormaps from Matplotlib here; any pointers to more aesthetically pleasing colormaps would def be appreciated 😋🙏
"Garden of Eden, the new fragrance"
TLDR: 1. Replaces the CNN encoder and decoder with a vision transformer ("ViT-VQGAN"), leading to significantly better speed-quality tradeoffs compared to a CNN-based VQGAN.
2. A vanilla VQVAE often learns rarely used / "dead" codebook vectors, wasting capacity. Here, they add a linear projection of the code vectors into a lower-dimensional "lookup" space. This factorization of embedding and lookup consistently improves reconstruction quality.
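The factorized-codebook trick in point 2 can be sketched as follows. This is a simplified illustration with assumed names and shapes (the projection would be learned jointly in the real model): nearest-neighbor matching happens in a low-dimensional lookup space, while the decoder still receives full-dimension code vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim, lookup_dim, num_codes = 256, 32, 512  # illustrative sizes

codebook = rng.normal(size=(num_codes, embed_dim))  # full-dim code vectors
proj = rng.normal(size=(embed_dim, lookup_dim))     # linear projection (learned in practice)

def lookup(z):
    """Quantize encoder outputs z (B, embed_dim) via the low-dim lookup space."""
    z_low = z @ proj                                 # (B, lookup_dim)
    codes_low = codebook @ proj                      # (num_codes, lookup_dim)
    d = ((z_low[:, None, :] - codes_low[None, :, :]) ** 2).sum(-1)  # (B, num_codes)
    idx = d.argmin(axis=1)                           # nearest code in lookup space
    return codebook[idx], idx                        # decoder sees full-dim codes

z = rng.normal(size=(4, embed_dim))  # dummy encoder outputs
z_q, idx = lookup(z)
```

Because every code now competes in a compact space, fewer codebook entries end up unreachable, which is the mechanism behind the improved codebook usage.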