1/ I created this with Stable Diffusion using image inpainting and “walking through the latent space”
No tweening was used: every frame is generated from an interpolated embedding and a variable denoising strength, so keeping continuity between frames was tricky
See 🧵for process
2/ First off, finding the right combination of prompt, seed and denoising strength for an #img2img in-painting is a roll of the dice
Luckily it's easy to script large batches and cherry-pick the best results
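A minimal sketch of what such a batch sweep can look like. The seeds, strengths, prompt, and filename scheme here are illustrative, and the commented-out `pipe(...)` call assumes a 🤗 Diffusers inpainting pipeline is already loaded; only the grid construction is shown concretely.

```python
from itertools import product

# Hypothetical sweep values -- tune to taste.
seeds = [1234, 5678, 9012]
strengths = [0.4, 0.5, 0.6, 0.7, 0.8]
prompts = ["a smartphone on a desk, product photo"]

# Build every (prompt, seed, strength) combination up front, then run
# them all in one batch and save each result under a descriptive
# filename so the keepers are easy to cherry-pick afterwards.
grid = list(product(prompts, seeds, strengths))

for prompt, seed, strength in grid:
    out_name = f"seed{seed}_str{strength:.2f}.png"
    # image = pipe(prompt=prompt, image=init_image, mask_image=mask,
    #              strength=strength,
    #              generator=torch.Generator().manual_seed(seed)).images[0]
    # image.save(out_name)
```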
3/ The first and last pairs were just regular #img2img, ramped through a denoising-strength range of 0 to 0.8
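Ramping the strength is just a linear schedule per frame. A small sketch, with a made-up frame count (the thread only gives the 0 to 0.8 endpoints):

```python
# Linear ramp of denoising strengths across N frames: 0 leaves the init
# image essentially untouched, 0.8 lets the model heavily re-imagine it.
def strength_ramp(n_frames, lo=0.0, hi=0.8):
    step = (hi - lo) / (n_frames - 1)
    return [round(lo + i * step, 4) for i in range(n_frames)]

ramp = strength_ramp(9)  # one strength value per frame
```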
4/ Transitions were done using a customized @huggingface 🧨Diffusers pipeline.
This lets me “slerp” between both noise latents AND text embeddings, for each given seed & prompt respectively
(while keeping denoising strength at ~0.8)
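Spherical interpolation (slerp) is the key trick here because diffusion noise latents live on a hypersphere, so a plain lerp would shrink their norm mid-way. A self-contained NumPy version of the commonly used slerp helper for diffusion latents (not the exact pipeline code from the thread); the same function works for flattened noise latents and for text embeddings:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-7):
    """Spherical linear interpolation between two arrays for t in [0, 1].

    Interpolates along the great-circle arc between v0 and v1, which
    keeps intermediate latents at a plausible noise magnitude. Falls
    back to a plain lerp when the inputs are nearly (anti-)parallel.
    """
    v0f, v1f = v0.ravel(), v1.ravel()
    dot = np.dot(v0f, v1f) / (np.linalg.norm(v0f) * np.linalg.norm(v1f))
    dot = np.clip(dot, -1.0, 1.0)
    if 1.0 - abs(dot) < eps:          # nearly parallel: lerp is fine
        return (1.0 - t) * v0 + t * v1
    theta = np.arccos(dot)
    s = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1
```

Walking from one keyframe to the next is then just calling `slerp` for a sequence of `t` values and decoding each interpolated latent/embedding pair.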
5/ Some tricks were needed with blending and adjusting the inpainting mask to smoothly hand over the init images between the two real phones
(example generations on the right)
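A toy sketch of that hand-off, under assumptions: float images in [0, 1], a linear cross-fade of the init pixels, and a union of the two inpainting masks so the model can paper over the seam. The real masks in the thread were additionally adjusted per transition; `blend_inits` and its signature are hypothetical names for illustration.

```python
import numpy as np

def blend_inits(init_a, init_b, mask_a, mask_b, t):
    """Cross-fade two init images and merge their inpainting masks.

    As t goes 0 -> 1, the init image fades from phone A to phone B,
    while the combined mask keeps every region that either mask marks
    for inpainting, so the diffusion model repaints the seam.
    """
    init = (1.0 - t) * init_a + t * init_b   # pixel-wise cross-fade
    mask = np.maximum(mask_a, mask_b)        # union of inpainting regions
    return init, mask
```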
6/ Not all walks through the latent space were a smooth path, but it’s easy to script it to find pairs that work well (and let your GPU replace your central heating)
Having the ability to play with these models on this level is incredible.
More creative AI experiments to come!