Bilawal Sidhu
Mar 4
🧠 AI experiment comparing #ControlNet and #Gen1. Video goes in ➡ Minecraft comes out.

Results are wild, and it's only a matter of time till this tech runs at 60fps. Then it'll transform 3D and AR.

How soon until we're channel-surfing realities layered on top of the world? 🧵
First up: ControlNet. Wow, this tool makes it easy to get photorealistic results. I used the HED method for this scene and got some amazing output, with EbSynth handling smoother interpolation between ControlNet keyframes. Check out my prior posts for the end-to-end workflow.
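If you want to try the ControlNet half yourself, here's a minimal sketch of stylizing a single keyframe with the HED method using Hugging Face diffusers + controlnet_aux. The model IDs, file paths, and prompt are illustrative assumptions, not necessarily the exact settings from my runs:

```python
# Minimal sketch: stylize one video keyframe with ControlNet (HED method).
# Assumes the `diffusers` and `controlnet_aux` libraries; paths and prompt
# are placeholders.
import torch
from PIL import Image
from controlnet_aux import HEDdetector
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler

# 1. Extract HED soft edges from a keyframe pulled out of the source video
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")
keyframe = Image.open("keyframe_0001.png")          # hypothetical frame path
control_image = hed(keyframe)

# 2. Load SD 1.5 with the HED ControlNet attached
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-hed", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

# 3. Generate the stylized keyframe; a fixed seed keeps keyframes consistent
#    before EbSynth propagates them across the in-between frames
result = pipe(
    "minecraft world, blocky voxel textures, vibrant colors",  # placeholder prompt
    image=control_image,
    num_inference_steps=20,
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]
result.save("stylized_keyframe_0001.png")
```

EbSynth then takes each stylized keyframe and propagates it across the surrounding frames of the original footage, which is what smooths over the frame-to-frame flicker.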
Next up, Gen-1: impressive is the word. The star of the show is the temporal consistency. Getting photoreal results is harder than with ControlNet IMO; #Gen1 is almost its own stylized thing, so I advise leaning into that. But why does any of this matter? Can't we just type text to get video?
Text prompts are cool, but control over the details is crucial for artists. These new AI tools turn regular photos/videos into an expressive form of performance capture. Record characters/scenes with your phone and use it to guide the generation. My buddy Don shows how it's done:
Of course, the input media can also be *synthetically* generated. Go from a blocked-out 3D scene to final render in record time. Control the details you care about (e.g. blocking), and let AI help you with the rest (e.g. texturing). I cover use cases here: creativetechnologydigest.substack.com/p/depth2image-…
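As a rough sketch of that depth2image idea, Stable Diffusion 2's depth-conditioned pipeline in diffusers can reskin a block-out render while preserving its layout. The render path, prompt, and strength below are placeholder assumptions, not the exact setup from the linked post:

```python
# Minimal sketch: reskin a blocked-out 3D render with depth-conditioned
# Stable Diffusion. Assumes `diffusers`; paths and prompt are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionDepth2ImgPipeline

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth", torch_dtype=torch.float16
).to("cuda")

blockout = Image.open("blockout_render.png")  # grey-box render from the 3D engine

# The pipeline estimates depth from the input image and uses it to keep the
# scene's blocking intact while the prompt drives materials and lighting.
final = pipe(
    prompt="cozy cyberpunk apartment interior, detailed textures, cinematic lighting",
    image=blockout,
    negative_prompt="blurry, low quality",
    strength=0.7,   # how far the result is allowed to depart from the block-out shading
).images[0]
final.save("final_render.png")
```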
See where I'm going when I say generative AI is going to disrupt 3D rendering? You could run a lightweight 3D engine in a browser, slap on a generative filter, and transform it into AAA game-engine quality. No massive team required. And it's not limited to gaming either:
It's wild how fast things are moving in generative AI. Here are my video2minecraft results from just a few months ago, which already look dated next to all these new approaches for taming the chaotic diffusion process:
And that's a wrap! I like sharing my workflows openly with the AI & creator community, so if you enjoyed this thread I'd appreciate it if you:
1. RT the thread below
2. Follow @bilawalsidhu for more
3. Subscribe to get some visual umami right to your inbox: creativetechnologydigest.substack.com

More from @bilawalsidhu

Mar 1
Before/after of Corridor's latest AI video is wild. They shot video on greenscreen, made virtual sets in Unreal, then reskinned it to anime by fine-tuning Stable Diffusion. Net result? 120 VFX shots done by a team of 3 on a dime. Bravo! This is a milestone in creative technology 🧵
⚙ Corridor basically made an open source video2anime workflow to pull off this video. Key tools they used:
- Stable Diffusion model + DreamBooth fine-tuning
- Unreal Engine + asset store 3D models
- Img2Img + DeFlickering effect
- Heaps of good ol' fashioned VFX compositing
Now let's deconstruct their creation workflow:
1. Train a model to replicate a specific style
2. Train a model to know a character 🔄
3. Run the green-screen footage through img2img (sketched below)
4. Reduce flicker with Deflicker plugin
5. Add 3D elements in Unreal 5
6. Final VFX comp/edit in Resolve
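For step 3, here's a minimal sketch of what running green-screen footage through img2img frame by frame could look like with diffusers and a DreamBooth-fine-tuned checkpoint. The checkpoint path, prompt token, strength, and seed are hypothetical; Corridor's actual settings aren't spelled out in this thread:

```python
# Minimal sketch of step 3: push green-screen frames through img2img using a
# DreamBooth-fine-tuned SD checkpoint. Paths, prompt, strength, and seed are
# hypothetical placeholders.
import glob
import os
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "./dreambooth-anime-style",            # hypothetical fine-tuned checkpoint dir
    torch_dtype=torch.float16,
).to("cuda")

os.makedirs("anime_frames", exist_ok=True)

for i, path in enumerate(sorted(glob.glob("greenscreen_frames/*.png"))):
    frame = Image.open(path).convert("RGB").resize((768, 512))
    # Re-seed every frame: a fixed seed plus low strength helps reduce flicker
    generator = torch.Generator("cuda").manual_seed(1234)
    styled = pipe(
        prompt="anime style, sks character",   # DreamBooth token is a placeholder
        image=frame,
        strength=0.45,        # low strength preserves the actor's performance
        guidance_scale=7.5,
        generator=generator,
    ).images[0]
    styled.save(f"anime_frames/{i:04d}.png")
```

The remaining flicker is what the deflicker pass (step 4) and the final comp in Resolve clean up.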
Feb 25
Multi-ControlNet is a game-changer for building an open-source video2video pipeline. I spent some time hacking this NeRF2Depth2Image workflow using a combination of ControlNet methods + SD 1.5 + EbSynth.
🧵 Full breakdown of my workflow & detailed tips shared in the thread below ⬇
Here's an overview of the workflow we're going to deconstruct. At a high level:
Capture video (I used my iPhone) ➡️ Train NeRF (using Luma AI) ➡️ Animate & render RGB + depth ➡️ Multi-ControlNet (Depth + HED) ➡️ EbSynth ➡️ Blending & compositing. Now let's break it down step by step:
For the input, I wanted to see if I could exploit the crispy depth maps you can get out of a Neural Radiance Field (NeRF) 3D scan.
- Left: 3D flythrough rendered from a NeRF (iPhone video ➡️ trained w/ Luma AI)
- Right: The corresponding depth map (notice the immaculate detail!)
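Here's a minimal sketch of the Multi-ControlNet step that consumes those renders: stack a depth ControlNet and a HED ControlNet on SD 1.5 in diffusers and feed it the matching frame pair. Model IDs, conditioning weights, prompt, and file names are illustrative assumptions:

```python
# Minimal sketch: Multi-ControlNet (depth + HED) on SD 1.5 with `diffusers`.
# Model IDs, conditioning weights, and file names are placeholders.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnets = [
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-hed", torch_dtype=torch.float16),
]
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnets, torch_dtype=torch.float16
).to("cuda")

depth_map = Image.open("nerf_depth_0001.png")   # depth render from the NeRF flythrough
hed_edges = Image.open("nerf_hed_0001.png")     # HED edges extracted from the RGB render

frame = pipe(
    "reimagined as a minecraft scene, blocky voxel style",   # placeholder prompt
    image=[depth_map, hed_edges],                # one conditioning image per ControlNet
    controlnet_conditioning_scale=[1.0, 0.6],    # lean on depth, use edges as a hint
    num_inference_steps=20,
    generator=torch.Generator("cuda").manual_seed(7),
).images[0]
frame.save("controlnet_keyframe_0001.png")
```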
Feb 20
ControlNet experiment where I'm toggling through different styles of contemporary Indian décor, while keeping a consistent scene layout.

Loving how ControlNet is putting artists back in control of the AI image generation process.

🧵Thread

#ControlNet #StableDiffusion #EbSynth
For the input sequence, I used a short animation made from a photogrammetry 3D scan of my parents' living room in India that I did a few years back.

- Top: Output generated with ControlNet + EbSynth

- Bottom: Input video sequence from my 3D scan
I used the ControlNet depth method for this experiment.

- Left: MiDaS depth map generated from a screen recording of my 3D scan

- Right: Awesome results reskinning the room! Pretty majestic if I do say so myself

Next time I want to use the synthetic depth straight from my 3D scan.
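For reference, here's roughly how a MiDaS depth map like the one on the left can be pulled from a frame of that screen recording, using the public intel-isl/MiDaS torch hub models. The file names are placeholders:

```python
# Minimal sketch: estimate a MiDaS depth map from one frame of the screen
# recording, using the public intel-isl/MiDaS torch hub models.
import cv2
import numpy as np
import torch

midas = torch.hub.load("intel-isl/MiDaS", "DPT_Large")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")

img = cv2.cvtColor(cv2.imread("scan_frame_0001.png"), cv2.COLOR_BGR2RGB)
batch = transforms.dpt_transform(img)

with torch.no_grad():
    pred = midas(batch)
    # Resize the prediction back to the input resolution
    pred = torch.nn.functional.interpolate(
        pred.unsqueeze(1), size=img.shape[:2], mode="bicubic", align_corners=False
    ).squeeze()

# Normalize to 0-255 so it can be saved and fed to the ControlNet depth model
depth = pred.cpu().numpy()
depth = (255 * (depth - depth.min()) / (depth.max() - depth.min())).astype(np.uint8)
cv2.imwrite("scan_frame_0001_depth.png", depth)
```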
