🔴 PERFUSION: a generative AI model from NVIDIA that fits on a floppy disk 💾
It takes up just 100KB. Yes, you heard that right: much less than any picture you take with your mobile phone! Why is this revolutionary, and how could it change everything?
I'll tell you 🧵👇
Perfusion is a really lightweight "text-to-image" model (100KB) that also trains in just 4 minutes.
It lets you portray objects and characters creatively while maintaining their identity, using a novel mechanism the authors call "Key-Locking."
Perfusion can also combine individually learned concepts into a single generated image.
Moreover, it lets you control the balance between visual fidelity and alignment with the text prompt at inference time, covering the entire Pareto front with just a single trained model.
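If "covering the Pareto front" sounds abstract, here is a minimal sketch of the idea. The scores are made up for illustration (they are not Perfusion results), and `pareto_front` is a hypothetical helper, not part of any NVIDIA code. A setting is on the front if no other setting is at least as good on both axes and strictly better on one.

```python
def pareto_front(points):
    """Return the points that no other point dominates (higher is better on both axes)."""
    front = []
    for i, (a1, b1) in enumerate(points):
        dominated = any(
            (a2 >= a1 and b2 >= b1) and (a2 > a1 or b2 > b1)
            for j, (a2, b2) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((a1, b1))
    return sorted(front)


# Hypothetical (visual_similarity, text_alignment) scores, invented for this example.
scores = [(0.9, 0.3), (0.7, 0.6), (0.5, 0.5), (0.4, 0.8), (0.3, 0.7)]
front = pareto_front(scores)
print(front)  # [(0.4, 0.8), (0.7, 0.6), (0.9, 0.3)]
```

Each point left on the front is a different trade-off between looking like the learned concept and following the prompt. The claim in the thread is that one trained model can reach all of them just by turning a knob at inference time.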
And why is this revolutionary?
For several reasons.
1️⃣ Such great optimization means that we will soon have truly powerful AI models integrated into our mobile phones, computers, etc. Much lighter, faster to train, and consuming less computing power.
2️⃣ The costs of training models will be drastically reduced in the future with optimizations like this and new techniques that allow everything to be streamlined.
3️⃣ If, in just 100KB, a new technique (key-locking) has achieved such a large increase in the coherence of objects/characters between generations, as in this example, it means that we have only SCRATCHED THE SURFACE of what future generative AI will be able to do.
In short, this is massive news, and I don't understand why it's going so unnoticed. Don't be fooled by "the low quality" of the images. The potential it has is truly MASSIVE.
If you liked this and would like me to continue writing similar threads, an RT on the first tweet of the thread will encourage me to keep doing so. Thanks! 😉👇
• No more AI plastic skins!
• Enhance EVERYTHING in your image, not only the skin!
• 3 different flavours + easy presets: improve light, level of reality, color grading, etc.
Let's dive in + tutorials + tips 🧵👇
First of all, if you can't wait, here's the link! AVAILABLE NOW on Magnific & rolling out to Freepik users today!
I’ll also randomly grant access to some of you who reply with an interesting message 😘
There's no way Hollywood won't be affected by this.
I created this whole scene in less than 2h using Veo 3 (AI video), Magnific (upscaling), Suno (music, except the first 3s 😉) and CapCut (editing).
The Cambrian Explosion of content has already started!
Full tutorial 👇
1. Idea
I've had this idea (a mood) of mixing a 7-Eleven at night and a 🐲 for over 2y now.
The concept came to me then, but it wasn't until now that I've been able to bring it to life visually.
Veo 3 feels like being back in Apr 2022, when DALL·E 2 hit my brain like a truck.
2. Video generation using Veo 3 inside Freepik (not yet available but soon)
I used ChatGPT to craft all the prompts and then did all the video generation inside Freepik using Veo 3.
Something I've learned is that Veo 3 can handle really long and complex prompts, so don't hesitate to use very detailed descriptions to express the vision you want to create.
Example:
"Close-up shot of a pair of hands reaching toward a dusty black tome resting on a low shelf inside a dimly lit 7-Eleven. The book has a worn leather cover with a flaming dragon etched in glowing, fiery lines across the front. Above the image, an unreadable title is inscribed in ancient golden runes. The hands pick up the book slowly and carefully, as if sensing its weight and age. At the edges of the frame, part of a red puffy vest is visible over a faded denim jacket and a plaid shirt sleeve, revealing just enough of the young man’s layered clothing to hint at his presence."
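The prompt above layers a shot type, a subject, a setting, and then a run of detail sentences. Here is a rough sketch of assembling prompts like that from parts. `ShotPrompt` and its field names are my own invention for illustration, not a Veo 3 or Freepik API; the output is just a string you would paste into the tool.

```python
from dataclasses import dataclass, field


@dataclass
class ShotPrompt:
    """Hypothetical helper that assembles a long, detailed video prompt from parts."""
    shot: str
    subject: str
    setting: str
    details: list[str] = field(default_factory=list)

    def render(self) -> str:
        # Opening sentence: shot type, subject, and location.
        opening = f"{self.shot} of {self.subject} inside {self.setting}."
        # Detail sentences follow in the order given.
        return " ".join([opening, *self.details])


prompt = ShotPrompt(
    shot="Close-up shot",
    subject="a pair of hands reaching toward a dusty black tome",
    setting="a dimly lit 7-Eleven",
    details=[
        "The book has a worn leather cover with a flaming dragon etched across the front.",
        "The hands pick up the book slowly and carefully.",
    ],
).render()
print(prompt)
```

Keeping the pieces separate makes it easy to reuse the same setting and wardrobe details across shots, which helps keep a scene consistent.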