I used the #StableDiffusion 2 Depth Guided model to create architecture photos from dollhouse furniture.
By using a depth-map you can create images with incredible spatial consistency without using any of the original RGB image.
See 🧵
2/ This model is unique in that it was fine-tuned from the Stable Diffusion 2 base with an extra input channel for depth.
Using MiDaS (a model that predicts depth from a single image), it can generate new images whose depth matches your "init image"
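(Not my exact code, just a rough sketch of what the model "sees": estimating a MiDaS-style depth map from a single photo with a DPT checkpoint from 🤗 transformers. The checkpoint and file name here are illustrative.)

```python
import torch
from PIL import Image
from transformers import DPTForDepthEstimation, DPTImageProcessor

# Example MiDaS-style depth estimator (DPT); checkpoint choice is illustrative
processor = DPTImageProcessor.from_pretrained("Intel/dpt-hybrid-midas")
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-hybrid-midas")

image = Image.open("dollhouse_furniture.jpg")  # hypothetical input photo
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    depth = model(**inputs).predicted_depth  # shape: (1, H', W')

# Resize the predicted depth back to the photo's resolution for inspection
depth = torch.nn.functional.interpolate(
    depth.unsqueeze(1), size=image.size[::-1], mode="bicubic", align_corners=False
).squeeze()
```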
3/ I set the denoising strength to 1.0 so that none of the original RGB image was used
Even with wildly different prompts it generated objects with consistent placement and shape
Using simple, recognizable shapes such as wooden dollhouse furniture worked great for this
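(If you want to try this yourself, here's a rough sketch using the diffusers StableDiffusionDepth2ImgPipeline, which runs MiDaS under the hood when you don't pass a depth map. The file names, prompt, and negative prompt are placeholders, not my exact setup; strength is the "denoising strength" mentioned above.)

```python
import torch
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline

# Load the depth-guided Stable Diffusion 2 model
pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("dollhouse_furniture.jpg")  # hypothetical input photo

result = pipe(
    prompt="A beautiful rustic Balinese villa, architecture magazine, modern bedroom",
    negative_prompt="toy, miniature, blurry",  # illustrative negatives
    image=init_image,
    strength=1.0,  # 1.0 = ignore the original RGB, keep only the depth structure
).images[0]

result.save("balinese_villa.png")
```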
4/ Regular photos ended up with an unavoidable "dollhouse" feel (even with heavy prompt tweaking) due to the extreme perspective of close-up shots.
I found that switching to a longer focal length (3x on an iPhone) and shooting from further away fixed this.
5/ Here are a few of the prompts used:
"A beautiful rustic Balinese villa, architecture magazine, modern bedroom, infinity pool outside, design minimalism, stone surfaces"
6/ "Luxurious modern studio bedroom, trending architecture magazine photo, colorful framed art hanging over bed, design minimalism, furry white rugs, trendy, industrial, pop art, boho chic"
7/ "Retro bedroom studio, arcade, 80's style, vintage framed posters, trending architecture magazine, rugs, metal industrial pipes, murals, guitars and sound equipment, grunge, concrete floor"
8/ There is some "creativity" in how the depth map gets reconciled with the prompt.
Here are a few outtakes where the model turned the plant into antlers, toys, candles, statues, a double-necked guitar, and even a kid with Mickey ears 🤯
Follow for more creative experiments 👨‍🎨