5/ #Gen1 is the beginning of an exciting new future of AI-generated video, in which most online content will be, to some degree, AI-generated or -assisted.
- Wide-angle lenses (10mm to 35mm)
- Standard lenses (35mm to 85mm)
- Telephoto lenses (85mm to 300mm)
They often correspond to these shot types (rough mapping sketched after the list):
- Wide angle (wide-angle lens)
- Medium to medium close-up (standard lens)
- Close-up (telephoto lens)
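A tiny lookup makes the mapping concrete; the cutoffs are the rough full-frame ranges from the list above, not hard rules:

```python
def shot_type(focal_length_mm: float) -> str:
    """Rough shot category for a full-frame focal length, per the ranges above."""
    if focal_length_mm < 35:
        return "wide angle (wide-angle lens)"
    elif focal_length_mm <= 85:
        return "medium to medium close-up (standard lens)"
    else:
        return "close-up (telephoto lens)"

print(shot_type(24))   # wide angle (wide-angle lens)
print(shot_type(50))   # medium to medium close-up (standard lens)
print(shot_type(135))  # close-up (telephoto lens)
```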
Prompting for "close-up", getting medium close-ups ...
We wanted telephoto but probably landed a bit toward standard. Still, given the depth of field and compression, it reads closer to 80mm than to 50mm. Fair enough.
The industry is eager to save money & resources with AI efficiency, and #GPT4 is leading the charge. I expect production processes to change within a year.
The infamous #OpenAI study has shown it: creative tasks are the most exposed to AI automation.
The consequences of AI are real, and they're coming to town. Now, how can AI improve the creative process? By helping with structural groundwork, for example: medium.com/design-bootcam… #AI #storytelling
The model is simply called "text-to-video synthesis". A brief summary:
- 1.7 billion parameters
- training data includes public datasets like LAION5B (5.85 billion image-text pairs), ImageNet (14 million images) & WebVid (10 million video-caption pairs)
- open source
Text-to-video synthesis consists of three sub-networks that work together to produce short MP4 video clips (sketched after the list):
- a text feature extraction module,
- a text feature-to-video diffusion model,
- and a video-to-video diffusion model.
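Here is a hypothetical sketch of how the three stages might chain together. Every name below is illustrative, not the model's actual API, and the tensor shapes are dummies:

```python
import numpy as np

# Illustrative stand-ins for the three sub-networks described above.

def extract_text_features(prompt: str) -> np.ndarray:
    """Stage 1: map the prompt to a text embedding (e.g., a CLIP-style encoder)."""
    rng = np.random.default_rng(abs(hash(prompt)) % 2**32)
    return rng.standard_normal(768)  # dummy 768-d embedding

def text_to_video_diffusion(text_features: np.ndarray, frames: int = 16) -> np.ndarray:
    """Stage 2: denoise a latent video conditioned on the text features."""
    rng = np.random.default_rng(0)
    return rng.standard_normal((frames, 32, 32, 4))  # dummy low-res latent frames

def video_to_video_diffusion(latent_video: np.ndarray) -> np.ndarray:
    """Stage 3: refine the latent frames into the final RGB clip."""
    frames = latent_video.shape[0]
    return np.zeros((frames, 256, 256, 3), dtype=np.uint8)  # dummy RGB frames

def synthesize(prompt: str) -> np.ndarray:
    feats = extract_text_features(prompt)
    latents = text_to_video_diffusion(feats)
    return video_to_video_diffusion(latents)  # frames ready to encode as MP4

clip = synthesize("a corgi surfing a wave at sunset")
print(clip.shape)  # (16, 256, 256, 3)
```

Splitting generation into a conditioned diffusion pass plus a refinement pass is a common way to keep a 1.7-billion-parameter model tractable for short clips.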
However, 3.6:1 (and higher ratios) seems to work better if you drop the cinematic prefixes (cinematic shot, film still, etc.).
Here it's just scene & style description. 3.6:1, no letterboxing.
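For instance, a bare scene-and-style prompt might look like the sketch below. The wording is made up, and the ratio conversion is my assumption: Midjourney's --ar parameter expects whole numbers, so 3.6:1 is written as 18:5 (worth double-checking against the current Midjourney docs):

```python
# Hypothetical prompt builder; scene/style strings are illustrative.
scene = "abandoned subway platform at night, a lone figure under a flickering light"
style = "desaturated palette, heavy film grain, anamorphic flares"

# 3.6:1 expressed in whole numbers for Midjourney's --ar parameter (18:5 == 3.6:1)
prompt = f"{scene}, {style} --ar 18:5 --v 5"
print(prompt)
```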
Despite the letterboxing, exploring 4:1 is fun ... #MidjourneyV5