I started by having #chatGPT write a few rough drafts of a scene involving a panicked character calling her friend for help from a spaceship. I was going for something that would involve heightened emotions but not be too serious. 2/8
Then I wrote a short script using some of those ideas plus my own and put the whole thing into @elevenlabsio. I generated a few takes using low Stability (1-2%) and high Clarity (90-99%). Each take usually had parts I liked, or at least gave me ideas for direction. 3/8
I stuck to one voice I liked for simplicity. Changing voices can sometimes dramatically alter the sound, almost as if different mics were used. I decided I'd just change the pitch of the voices in post to differentiate them more. 4/8
After doing a few takes of the whole script, I generated individual lines. There I'd experiment with the "prompt" to see if I could direct the acting more by adding ellipses, different punctuation, line breaks, and misspellings. Here's a sample of my history. 5/8
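The punctuation experiments above can be sketched as a tiny script that fans one line out into delivery variants before pasting them into the TTS box. This is a hypothetical helper for illustration only, not part of any ElevenLabs tooling; the variant rules are my own guesses at what each tweak does to the read:

```python
def prompt_variants(line: str) -> list[str]:
    """Fan one script line out into variants that nudge TTS delivery.

    Purely illustrative: ellipses tend to suggest hesitation, an
    exclamation raises intensity, and a line break adds a beat.
    """
    return [
        line,                            # baseline take
        line.replace(", ", "... "),      # hesitant, trailing pauses
        line.rstrip(".") + "!",          # more intensity
        line.replace(", ", ",\n"),       # line break forces a pause
    ]

takes = prompt_variants("Mayday, can you hear me, I need help.")
```

Each variant then gets generated as its own take, and the best fragments are cherry-picked in the edit.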
Then I laid everything out in #premierepro. I cut up the audio into sections with different takes and methodically edited down to my favorites, trying to choose parts that blended well together. 6/8
When parts wouldn't blend well together, I'd just rewrite the lines and generate a few more takes in @elevenlabsio. It's almost like instantaneous ADR. Then I used #adobeaudition to pitch-shift the voices and add reverb. 7/8
Last step was using the script as rolling credits, laid over an image I made in #midjourney. I added the audio waveform in After Effects. 8/8
Made this video (🎶) with a Midjourney v6 image! Started by upscaling/refining with @Magnific_AI, pulled a Marigold Depth Map from that in ComfyUI, then used it as a displacement map in Blender, where I animated this camera pass with some relighting and narrow depth of field. 🧵1/12
Here's the base image and the before/after in @Magnific_AI. Even though MJv6 has an upscaler, Magnific gave me better eyelid and skin details for this case. (Fun fact, this image was from a v4 prompt from summer last year, when MJ had just released a new beta upscaler.) 2/12
Next step was using the new Marigold Depth Estimation node in ComfyUI to get an extremely detailed depth map. Note that I'm saving the result as an EXR file (important for adjusting levels later), and that the remap and colorizing nodes are just for visualization. 3/12
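The levels adjustment that the EXR save enables can be sketched with NumPy: float depth values get remapped over a chosen near/far window without the banding an 8-bit export would bake in. This is an illustrative stand-in, not the actual ComfyUI remap node:

```python
import numpy as np

def remap_depth(depth: np.ndarray, near: float, far: float) -> np.ndarray:
    """Remap float depth values to [0, 1] over a chosen near/far window.

    Operating on full-precision data (as stored in an EXR) preserves
    smooth gradients; an 8-bit source would quantize to 256 levels
    and band when stretched like this.
    """
    out = (depth - near) / (far - near)
    return np.clip(out, 0.0, 1.0)

depth = np.array([0.05, 0.20, 0.50, 0.90])  # toy depth samples
remapped = remap_depth(depth, near=0.20, far=0.90)
```

In Blender, the remapped range then controls how much relief the displacement modifier carves out of the plane.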
Testing LCM LoRAs in an AnimateDiff & multi-ControlNet workflow in ComfyUI. I was able to process this entire Black Pink music video as a single .mp4 input. The LCM lets me render at 6 steps (vs 20+) on my 4090 and uses up only 10.5 GB of VRAM. Here's a breakdown 🧵[1/11]
Entire thing took 81 minutes to render 2,467 frames, so about 2 seconds per frame. This isn't including the time to extract the img sequence from video and gen the ControlNet maps. Used Zoe Depth and Canny ControlNets in SD 1.5 at 910 x 512. [2/11]
Improving the output to give it a stronger style, more detail, and a less rotoscope-ish feel will require adjusting individual shots. But doing the entire video in one go lays down a rough draft for you to iterate on—build on fun surprises, troubleshoot problem areas. [3/11]
Timelapse of using #photoshop’s new generative fill feature to connect two images and build a scene around them using blank prompts. Was inspired by @MatthieuGB’s post doing something similar! Notice how I’m not adding any descriptions, but letting gen fill present options for…
Here’s the final image! 2/4
And here are the original images made in #midjourneyv51 3/4
Overall flow: pre-process video > img seq > play with prompts > initial ControlNet settings > ControlNet batch render > upscale > clean up in post > key out background > deflicker > video post-production 2/15
My approach here was to figure out an initial workflow—there’s definitely a lot to play with and improve on. The original video is low-res and a little blurry, so I started with a pre-process pass, bringing out the edges/brightness & upscaling. 3/15
I wonder what the future of UX design (maybe apps in general) may be like if AI really allows us to customize our experience. Not to mention blending programs together through a 3rd-party/custom UI, if an AI can understand what's being displayed onscreen by the app's GUI. 1/5
Combined with no-code platforms of the future and advanced templates, you could probably do weird stuff like Premiere Pro x Unreal Engine x a fighting-game template x an anime you like, and custom-gamify your interface. 2/5
Or maybe you could just ask a chat AI to combine several apps/aesthetics together and have it present different connections and gamification strategies based on knowledge of UI/UX design. 3/5
Lately I've been thinking about how much of "reality" is a negotiation with useful illusions and the material world. I think it's safe to say that a portion of how we view things is through shortcuts and narratives. 1/11
To what degree we engage in fictions probably differs from person to person. Some believe the entire thing is a fiction passed on to us from evolution to navigate the food chain. Others think they concretely engage in reality the whole time. 2/11
Personally I think it's interesting how much of the world we can't “see” except through technology or layers of reasoning—radio waves, germs, the financial system, justice. These aren’t simple things we just look at and easily have collective intuition about. 3/11