I could have created a similar scene in just Unreal Engine and Quixel, but I wanted to see what I could do with this landscape image I generated in #midjourney 2/8 #aiartprocess
I'm also trying to do more collaborations with other AI Artists, so I used this as an excuse to research depth maps further and see how I could push them. I generated this LeReS depth map using "Boosting Monocular Depth Estimation to High Resolution" on GitHub. 3/8 #aiartprocess
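For anyone who wants to reproduce that depth pass: the repo behind the paper (compphoto/BoostingMonocularDepth on GitHub) is run as a script. Here's a minimal sketch of invoking it from Python; the flags are recalled from the repo's README and may have changed, so treat them as assumptions and double-check before running (--depthNet 2 selects the LeReS backbone; the folder names are placeholders).

```python
import subprocess

# Run the BoostingMonocularDepth script on a folder of input images.
# Flags recalled from the repo's README -- verify against the current version.
subprocess.run([
    "python", "run.py",
    "--Final",                  # full boosting pipeline, highest-resolution result
    "--data_dir", "inputs",     # folder holding the Midjourney landscape image
    "--output_dir", "outputs",  # boosted depth maps are written here
    "--depthNet", "2",          # 2 = LeReS base network (per the README)
], check=True, cwd="BoostingMonocularDepth")
```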
I applied the depth map as a displacement map on a high-poly plane in @sidefx #houdini. I like Houdini because of the way it helps me deconstruct my process with nodes and code, and I can experiment with workflows easily. Rendered this natively in Mantra. 4/8 #aiartprocess
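Outside Houdini, the core of this step is just "treat the depth map as a height field and push a dense grid of vertices along one axis." Here's a small plain-Python sketch of that idea (not my Houdini node graph; file names and the displacement scale are placeholders):

```python
import numpy as np
from PIL import Image

# Load the depth map and downsample so the mesh stays manageable.
depth = Image.open("depth_leres.png").convert("L").resize((256, 256))
depth = np.asarray(depth, dtype=np.float32) / 255.0
h, w = depth.shape
scale = 0.25  # displacement amount in scene units, tweak to taste

# Grid of vertices displaced along Z by the depth value.
verts = [(x / (w - 1), y / (h - 1), depth[y, x] * scale)
         for y in range(h) for x in range(w)]

# Quad faces between neighboring grid points (OBJ indices are 1-based).
faces = []
for y in range(h - 1):
    for x in range(w - 1):
        i = y * w + x + 1
        faces.append((i, i + 1, i + w + 1, i + w))

# Write a simple .obj that can be imported into UE5, Blender, etc.
with open("displaced_plane.obj", "w") as f:
    for vx, vy, vz in verts:
        f.write(f"v {vx} {vy} {vz}\n")
    for a, b, c, d in faces:
        f.write(f"f {a} {b} {c} {d}\n")
```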
I then took the scene mesh and imported it as a .obj into #UE5. I wanted to apply grass with the foliage tool directly onto the mesh, but that didn't work. So I sculpted a plane in the shape of the foreground terrain using Unreal's native modelling tools. 5/8 #aiartprocess
I then used the foliage brush to add grass to the plane. There are lots of YouTube tutorials, like this one, that explain how. 6/8 #aiartprocess
I applied a Third-Person game template in #UnrealEngine5 and did a screen recording of a character I controlled running around the scene in real-time. I also animated a camera and rendered a cinematic with Movie Render Queue. (1st post.) 7/8 #aiartprocess
There are clearly some rough edges to this project, but it's basically an exploration of different ways a 2D image might be converted into 3D with depth maps. I have some other ideas I'm going to try soon. If you have a cool process involving that, please share! 8/8 #aiartprocess
Made this video with iPhone photos I took of my friend Stephanie that I used as keyframes in @LumaLabsAI! With the camera controls I can gen transitions between shots. I also built a custom web app in Next.js to help me speedramp and edit all the clips! Breakdown 🧵(1/18)
Basically, if you have some photos, throw in a start and end frame, begin your prompt with the camera move, and then add stuff like “smooth camera, steadicam”. I find minimal prompts work best. And don’t enhance the prompt (that tends to add a handheld look). (2/18)
Sometimes I’ll add ‘motion blur’, ‘drone racing’ or ‘music video’ to see how it changes the results. “Perfect face” can help reduce cross-eyes, etc. You’ll still need to experiment, but long prompts or prompts describing the scene don’t usually help with this effect. (3/18)
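Back to the speedramp tool mentioned at the top of this thread: mine is a custom Next.js app, but the underlying retiming idea can be sketched with plain ffmpeg calls driven from Python. The version below is only a stand-in illustration (not the app's code); the input path, segment boundaries, and speed factors are all made up.

```python
import subprocess

SRC = "luma_clip.mp4"  # hypothetical input clip

# (start s, duration s, speed factor) -- a crude ramp: slow in, fast out.
segments = [(0.0, 2.0, 0.5), (2.0, 2.0, 1.0), (4.0, 2.0, 2.0)]

names = []
for i, (start, dur, speed) in enumerate(segments):
    name = f"seg_{i}.mp4"
    names.append(name)
    # setpts=PTS/speed stretches or compresses timestamps; audio dropped for simplicity.
    subprocess.run([
        "ffmpeg", "-y", "-ss", str(start), "-t", str(dur), "-i", SRC,
        "-filter:v", f"setpts=PTS/{speed}", "-an", name,
    ], check=True)

# Stitch the retimed segments back together with the concat demuxer.
with open("list.txt", "w") as f:
    f.writelines(f"file '{n}'\n" for n in names)
subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0",
                "-i", "list.txt", "-c", "copy", "ramped.mp4"], check=True)
```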
Testing how LivePortrait handles lip-syncing 24fps lyrics on top of slow-motion footage. Was curious to see if it might help with music videos. Quick explanation below! 🧵
Started with a clip from an Eminem song and passed it through Adobe Podcast to get the acapella. Passed that through @hedra_labs with a Midjourney portrait for the face animation. Used that as input into LivePortrait using ComfyUI and a slowmo clip from Die Hard.
I find it helps for the Live Portrait input to have a plain background. Otherwise you might get extra warping in the background behind the head.
Made this video (🎶) with a Midjourney v6 image! Started by upscaling/refining with @Magnific_AI, pulled a Marigold depth map from that in ComfyUI, then used it as a displacement map in Blender where I animated this camera pass with some relighting and narrow depth of field.🧵1/12
Here's the base image and the before/after in @Magnific_AI. Even though MJv6 has an upscaler, Magnific gave me better eyelid and skin details for this case. (Fun fact, this image was from a v4 prompt from summer last year, when MJ had just released a new beta upscaler.) 2/12
Next step was using the new Marigold Depth Estimation node in ComfyUI to get an extremely detailed depth map. Note that I'm saving the result as an EXR file (important for adjusting levels later), and that the remap and colorizing nodes are just for visualization. 3/12
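The Blender side from the first post is basically a heavily subdivided plane with a Displace modifier driven by that EXR. Here's a minimal bpy sketch of that setup, run from Blender's scripting tab; the file path, subdivision levels, and strength are placeholders rather than my actual scene values.

```python
import bpy

# Plane with enough geometry for the displacement to bite into.
bpy.ops.mesh.primitive_plane_add(size=2)
plane = bpy.context.active_object
subsurf = plane.modifiers.new("Subdiv", type='SUBSURF')
subsurf.subdivision_type = 'SIMPLE'
subsurf.levels = 6
subsurf.render_levels = 8

# Load the EXR depth map (float values survive, so levels can still be adjusted).
img = bpy.data.images.load("//marigold_depth.exr")  # placeholder path
tex = bpy.data.textures.new("DepthTex", type='IMAGE')
tex.image = img

# Drive a Displace modifier with the depth texture.
disp = plane.modifiers.new("Displace", type='DISPLACE')
disp.texture = tex
disp.texture_coords = 'UV'
disp.strength = 0.3   # arbitrary, adjust to taste
disp.mid_level = 0.0
```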
Testing LCM LoRAs in an AnimateDiff & multi-ControlNet workflow in ComfyUI. I was able to process this entire Black Pink music video as a single .mp4 input. The LCM lets me render at 6 steps (vs 20+) on my 4090 and uses up only 10.5 GB of VRAM. Here's a breakdown 🧵[1/11]
Entire thing took 81 minutes to render 2,467 frames, so about 2 seconds per frame. This isn't including the time to extract the img sequence from video and gen the ControlNet maps. Used Zoe Depth and Canny ControlNets in SD 1.5 at 910 x 512. [2/11]
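For reference, dumping the image sequence at that working resolution is just an ffmpeg call; this is a generic sketch with placeholder paths, not my exact command.

```python
import os
import subprocess

src = "blackpink_input.mp4"     # hypothetical source video
os.makedirs("frames", exist_ok=True)

# Scale every frame to 910x512 and write a numbered PNG sequence.
subprocess.run([
    "ffmpeg", "-i", src,
    "-vf", "scale=910:512",
    "frames/%05d.png",
], check=True)
```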
Improving the output to give it a stronger style, more detail & a less rotoscope-ish feel will require adjusting individual shots. But doing the entire video in one go lays down a rough draft for you to iterate on: build on fun surprises, troubleshoot problem areas. [3/11]
Timelapse of using #photoshop’s new generative fill feature to connect two images and build a scene around them using blank prompts. Was inspired by @MatthieuGB’s post doing something similar! Notice how I’m not adding any descriptions, but letting gen fill present options for…
Here’s the final image! 2/4
And here are the original images made in #midjourneyv51 3/4
Overall flow: pre-process video > img seq > play with prompts > initial controlnet settings > control net batch render > upscale > clean up in post > key out background > deflicker > video post-production 2/15
The approach I used here was just figuring out an initial workflow; there’s definitely a lot to play with and improve on. The orig vid is low-res and a little blurry, so I started with a pre-process pass, bringing out the edges/brightness & upscaling. 3/15
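As a rough illustration of that pre-process pass (not my exact settings), here's a small OpenCV sketch that lifts brightness/contrast, applies an unsharp mask to bring out edges, and does a simple 2x upscale; folder names and numbers are placeholders, and a dedicated AI upscaler would do a better job than the plain resize here.

```python
import glob
import os
import cv2

os.makedirs("frames_out", exist_ok=True)

for path in glob.glob("frames_in/*.png"):  # hypothetical input frames
    img = cv2.imread(path)

    # Brightness/contrast lift (alpha = contrast, beta = brightness).
    img = cv2.convertScaleAbs(img, alpha=1.15, beta=10)

    # Unsharp mask: subtract a blurred copy to emphasize edges.
    blur = cv2.GaussianBlur(img, (0, 0), sigmaX=3)
    img = cv2.addWeighted(img, 1.5, blur, -0.5, 0)

    # Simple 2x upscale.
    img = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_LANCZOS4)

    cv2.imwrite(path.replace("frames_in", "frames_out"), img)
```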