Testing Multi-ControlNet on a scene with extended dialog, a range of facial expressions, and head turning. No EBSynth used. #AImusic from aiva.ai. Breakdown🧵1/15 #aicinema #controlnet #stablediffusion #aiia #aiartcommunity #aiva #machinelearning #deeplearning #ai
Overall flow: pre-process video > img seq > play with prompts > initial ControlNet settings > ControlNet batch render > upscale > clean up in post > key out background > deflicker > video post-production 2/15
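If you want a concrete version of the first and last steps (video > img seq, then back to video), ffmpeg is one option. That's an assumption on my part, and the filenames/fps here are placeholders:

```python
# Helpers that build ffmpeg argv lists for frame extraction and
# reassembly. They only construct the commands; pass the result to
# subprocess.run() to actually execute them.
def extract_frames_cmd(video="input.mp4", pattern="frames/%05d.png"):
    # input.mp4 -> frames/00001.png, frames/00002.png, ...
    return ["ffmpeg", "-i", video, pattern]

def reassemble_cmd(pattern="frames/%05d.png", fps=24, out="output.mp4"):
    # image sequence -> H.264 video at the given frame rate
    return ["ffmpeg", "-framerate", str(fps), "-i", pattern,
            "-c:v", "libx264", "-pix_fmt", "yuv420p", out]
```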
The approach here was about figuring out an initial workflow; there's definitely a lot to play with and improve on. The orig vid is low-res and a little blurry, so I started with a pre-process pass, bringing out the edges/brightness and upscaling. 3/15
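A minimal sketch of what that pre-process pass does on one (grayscale) frame. The gain, kernel, and scale values are placeholders, not the actual settings I used:

```python
import numpy as np

def preprocess(frame: np.ndarray, gain: float = 1.2, scale: int = 2) -> np.ndarray:
    """Toy pre-process pass: lift brightness, emphasize edges with a
    3x3 sharpen kernel, then upscale. Parameters are illustrative."""
    f = frame.astype(np.float32) * gain  # brightness lift
    # center-weighted sharpen kernel exaggerates edges
    k = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], np.float32)
    pad = np.pad(f, 1, mode="edge")
    out = np.zeros_like(f)
    for dy in range(3):
        for dx in range(3):
            out += k[dy, dx] * pad[dy:dy + f.shape[0], dx:dx + f.shape[1]]
    out = np.clip(out, 0, 255)
    # cheap nearest-neighbour upscale as a stand-in for a real upscaler
    return np.repeat(np.repeat(out, scale, axis=0), scale, axis=1).astype(np.uint8)
```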
Then I took the first frame and played with prompts in SD. After finding an initial prompt, I’d then test it on more extreme frames to see how it held up. I’d then tweak the prompt/settings until I got a look that worked across a range of frames. 4/15
In Multi-ControlNet, after trying lots of settings, I ended up with a mix of HED, normal, and depth. (Again, you want to test with several frames throughout the video.) I also kept the denoising low at 0.3 for better color consistency and to follow the lip sync closely. 5/15
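Roughly, the settings looked like this. Only the 0.3 denoising strength is exact; the per-unit weights below are hypothetical placeholders you'd tune per shot (in A1111 these map to the img2img "Denoising strength" slider and each ControlNet unit's weight):

```python
# Sketch of the Multi-ControlNet batch settings; weights are placeholders.
controlnet_settings = {
    "denoising_strength": 0.3,  # low: keeps color consistent, follows lip sync
    "units": [
        {"model": "hed",    "weight": 0.7},  # soft edges, lets features exaggerate
        {"model": "normal", "weight": 0.5},
        {"model": "depth",  "weight": 0.5},
    ],
}
```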
Since I was going for a more cartoonish look, I knew the softer edges of HED with lower guidance/weight would be good for following the input img but allow the features to get exaggerated. (Canny tends to be more exact in following edges.) 6/15
Normal and depth just seemed to help get the image where I wanted over a range of the more problematic frames, but that was just from tinkering. I didn’t go in knowing what precise effect the mix or weight/guidance would have. 7/15
After the batch img2img render, I upscaled the video to 4K using @topazlabs video enhance AI. Then downscaled back to 1080 HD. This brought in more contrast and sharpness. 8/15
I had issues with flickering spots, so I had to go in and manually erase them, which took forever. Also, highlights on the lips would get mistaken for teeth. I did a few things to improve the situation, but a better solution is needed. 9/15
With the increased clarity I went in on removing the background. I tried several approaches with depth maps and color keying, checking how each held up around the hair across the whole video. The depth map approach didn't work well in this case, 10/15
so I ended up using different versions of color keying and layering them together. The result isn't perfect (neither was the starting input), but at the end of the day the intent was to surface a workflow and learn from it, so I think it's good enough for that. 11/15
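The keying idea, sketched with numpy. This is simplified (real keyers work in YUV/HSV and soften the matte edge), and the key colors/tolerances are placeholders:

```python
import numpy as np

def color_key_alpha(rgb: np.ndarray, key: tuple, tol: float) -> np.ndarray:
    """One color-key pass: alpha 0 where a pixel is within `tol` of the
    key color, 1 elsewhere."""
    dist = np.linalg.norm(rgb.astype(np.float32) - np.array(key, np.float32), axis=-1)
    return (dist > tol).astype(np.float32)

def layered_matte(rgb, keys_and_tols):
    """Layer several keys: a pixel survives only if every pass keeps it,
    i.e. multiply (AND) the mattes together."""
    matte = np.ones(rgb.shape[:2], np.float32)
    for key, tol in keys_and_tols:
        matte *= color_key_alpha(rgb, key, tol)
    return matte
```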
To decrease the boiling/flickering I used several rounds of the deflicker effect in #davinciresolve (from @CorridorDigital's awesome Anime video breakdown!). I've tried deflickering approaches before, 12/15
and they don't always help, depending on which program/plugin you're using and the settings you go in with, but the deflicker in Resolve is awesome! (Only available in the paid version.) If stacking several deflickers doesn't do enough, try different settings and several repeat rounds. 13/15
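The general idea behind deflickering (not Resolve's actual algorithm, just the principle) is temporal smoothing: average each frame with its neighbours to damp per-frame jumps, repeated for several rounds like stacking the effect:

```python
import numpy as np

def deflicker(frames: np.ndarray, passes: int = 2) -> np.ndarray:
    """Toy temporal smoother over a (T, H, W) frame stack."""
    f = frames.astype(np.float32)
    for _ in range(passes):
        prev = np.roll(f, 1, axis=0)
        nxt = np.roll(f, -1, axis=0)
        prev[0], nxt[-1] = f[0], f[-1]  # clamp the endpoints
        f = (prev + f + nxt) / 3.0
    return f.astype(np.uint8)
```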
For the background, I used an image I generated in #midjourney, blurred and animated in post. I also added a little grain and sharpened everything (deflickering can soften details a bit), animated some slow zooming, and dropped in some music from aiva.ai. 14/15
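A toy version of the grain step (the amount here is arbitrary; "a little grain" is all I'd specify):

```python
import numpy as np

def add_grain(frame: np.ndarray, amount: float = 6.0, seed: int = 0) -> np.ndarray:
    """Film-grain stand-in: small per-pixel gaussian noise."""
    rng = np.random.default_rng(seed)
    noisy = frame.astype(np.float32) + rng.normal(0.0, amount, frame.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```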
Hope some of you found this thread useful! If so, please share it with others! And I'm excited to see if any of you run similar experiments and make awesome stuff! 15/15

