🔴 PERFUSION: a generative AI model from NVIDIA that fits on a floppy disk 💾
It takes up just 100KB. Yes, you read that right: far less than any photo you take with your phone! Why is this revolutionary, and how could it change everything?
I'll tell you 🧵👇
Perfusion is a really lightweight text-to-image personalization method: each learned concept takes up just ~100KB and trains in about 4 minutes.
It lets you portray objects and characters creatively while preserving their identity, using a novel mechanism the authors call "Key-Locking."
Perfusion can also combine individually learned concepts into a single generated image.
Moreover, it lets you control the trade-off between visual fidelity and prompt alignment at inference time, covering the entire Pareto front with a single trained model.
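For the curious, here's a minimal sketch of what Key-Locking and that inference-time knob look like inside a single cross-attention layer. This is my simplified reading of the paper, not NVIDIA's code; every name, dimension, and value below is illustrative:

```python
import torch

torch.manual_seed(0)
d_text, d_attn, n_tok = 768, 320, 77          # typical Stable Diffusion dims

W_k = torch.randn(d_attn, d_text) / d_text**0.5   # frozen key projection
W_v = torch.randn(d_attn, d_text) / d_text**0.5   # frozen value projection

tokens = torch.randn(n_tok, d_text)           # prompt token embeddings
i = 5                                         # position of the new-concept token
e_super = torch.randn(d_text)                 # embedding of its supercategory, e.g. "teddy"

K = tokens @ W_k.T                            # keys,   shape (77, 320)
V = tokens @ W_v.T                            # values, shape (77, 320)

# Key-Locking: pin the concept's KEY to the supercategory's key, so attention
# routing behaves like the generic word and the identity doesn't drift.
K[i] = W_k @ e_super

# Appearance lives in the VALUE pathway: a small learned target value for the
# concept token (the paper does this as a rank-1 edit of W_v; overwriting the
# row keeps the sketch simple).
v_star = torch.randn(d_attn)                  # stands in for the learned ~100KB

# Inference-time knob: interpolating toward the original value trades identity
# fidelity against prompt adherence (the single-model Pareto-front control).
alpha = 0.7                                   # 1.0 = max identity, 0.0 = vanilla
V[i] = alpha * v_star + (1 - alpha) * V[i]
```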
And why is this revolutionary?
For several reasons.
1️⃣ Optimization at this level means we will soon have truly powerful AI models integrated into our phones, computers, etc.: much lighter, faster to train, and far less compute-hungry.
2️⃣ The cost of training models will drop drastically as optimizations like this, and new techniques that streamline the whole pipeline, keep arriving.
3️⃣ If a new technique (Key-Locking) can achieve this big an increase in object/character consistency across generations with just 100KB, as in this example, it means we have only SCRATCHED THE SURFACE of what future generative AI will be able to do.
In short: massive news, and I don't understand why it's flying so far under the radar. Don't be fooled by the "low quality" of the images; the potential here is truly MASSIVE.
If you liked this and would like me to continue writing similar threads, an RT on the first tweet of the thread will encourage me to keep doing so. Thanks! 😉👇
There's no way Hollywood won't be affected by this.
7M views in 24 hours on my Spanish-language account 🤯
The most complex AI short I've ever made: a test of how advanced generative video really is. Here's exactly what I used 👇
If you made it to the credits, they say it pretty clearly:
• Yes, Seedance 2.0 all the way. I made pretty much 99% of the scenes with Seedance. It's by far the best generative video model out there right now... although I still haven't tried the new Grok one :) The "omni reference" model is f*cking amazing and works PERFECTLY with reference images from Nano Banana.
• Freepik: I used Nano Banana Pro and Nano Banana 2 a lot through Freepik, for all the reference images used inside Seedance.
• Freepik: ElevenLabs for the voices, also through Freepik. I tested it on their site too, but the 'professional voice' failed for me, so in the end I had to use only 'fast voices'. That's easily the weakest part of the video. Honestly, I think video models will solve this themselves, because a huge part of a believable voice is the acting.
• And Magnific too, of course. I experimented with things like running single frames through Magnific and then feeding them into Seedance as references to improve output quality. I also upscaled some sequences and blended them back with the original video at around 60% opacity to preserve more of the textures (rough sketch of that blend below).
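For anyone who wants to reproduce that last trick, here's a rough per-frame sketch with OpenCV. This is my guess at the workflow, not the exact pipeline; the file names and the 60/40 split are illustrative:

```python
import cv2

# Match resolutions first (bring the upscale back down to the original size).
orig = cv2.imread("frame_original.png")       # frame from the Seedance output
up = cv2.imread("frame_magnific.png")         # same frame after Magnific upscale
up = cv2.resize(up, (orig.shape[1], orig.shape[0]), interpolation=cv2.INTER_AREA)

# ~60% upscaled detail + ~40% original: keeps Magnific's textures without
# drifting too far from the original footage.
blended = cv2.addWeighted(up, 0.6, orig, 0.4, 0.0)
cv2.imwrite("frame_blended.png", blended)
```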
Any questions, feel free to ask!
A big part of why it went so insanely viral in Spain and Latin America (7M views in 24 hours) is that it's a huge tribute to Spanish-speaking viewers' favorite YouTubers.
• No more AI plastic skins!
• Enhance EVERYTHING in your image, not only the skin!
• 3 different flavours + easy presets: improve light, level of reality, color grading, etc.
Let's dive in + tutorials + tips 🧵👇
First of all, if you can't wait, here's the link! AVAILABLE NOW on Magnific & rolling out to Freepik users today!
I'll also randomly grant access to some of you who reply with an interesting message 😘