Bilawal Sidhu
Blending reality & imagination. Mapped the world at Google. Now mapping the future of creation & computing. TED Curator. A16z Scout. https://t.co/w6Hg0z80rd
Dec 16, 2024 8 tweets 4 min read
BREAKING: Google just dropped Veo 2 and Imagen 3 -- their next gen video and image generation models.

Turns out Google's been closing the gap quietly -- not just on LLMs, but on visual creation too.

Here’s everything you need to know w/o the hype 🧵

1/ First, let's get the Veo 2 updates out of the way:

• Up to 4K resolution (woot!)
• Increased detail & realism
• Improved human movement & expressions
• Better physics modeling & temporal coherence

On Meta's Movie Gen Bench, Veo holds it down against top video models.
Dec 11, 2024 9 tweets 4 min read
BREAKING: Here are the coolest things Google announced today. I got the press briefing yesterday, and here are my favorites w/o the hype.

TL;DR Gemini 2.0 brings multimodal creation, research agents, browser control, and massive compute upgrades. Plus dope research.

🧵 Let's dive in

1/ Let's talk Gemini 2.0 Flash:
• 2x faster than 1.5 Pro while outperforming it on key benchmarks
• Native tool use (Search + custom functions)
• New Multimodal Live API for realtime audio/video streaming w/ smart interrupt detection
• Available today; more model sizes in Jan
Oct 17, 2024 14 tweets 6 min read
Heads up! Mosaic dropped a pretty wild dataset of 1.26 million 360° images of Prague 🤯

If you're a researcher, creator or developer into 3D/AI/Geo, I think you're gonna wanna play with this

Here's the scoop on this 15 TERAPIXEL dataset & the crazy things you can do with it 🧵

The specs are nuts:

• 210,469 panos in 13K
• 1,262,814 source images (6 x 12MP)
• 1 image every meter
• 2cm pose accuracy

Not quite Google level, but the pano density is WAY higher. An image every meter means it's perfect for all sorts of spatial 3D stuff.
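The headline numbers hang together, and a quick sanity check shows how. This is a minimal sketch that assumes "12MP" means exactly 12,000,000 pixels per source image and that a terapixel is 10^12 pixels; the variable and function names are made up for illustration and are not from Mosaic.

```python
# Figures quoted in the thread.
PANOS = 210_469
IMAGES_PER_PANO = 6

# Assumption: "12MP" is treated as exactly 12,000,000 pixels per source image.
PIXELS_PER_IMAGE = 12_000_000


def dataset_totals(panos: int, images_per_pano: int, pixels_per_image: int):
    """Return (source image count, total terapixels) for a pano dataset."""
    source_images = panos * images_per_pano
    total_pixels = source_images * pixels_per_image
    # Assumption: 1 terapixel = 10**12 pixels.
    return source_images, total_pixels / 10**12


source_images, terapixels = dataset_totals(PANOS, IMAGES_PER_PANO, PIXELS_PER_IMAGE)
print(f"{source_images:,} source images, about {terapixels:.2f} terapixels")
```

So six 12MP frames per pano accounts for the 1,262,814 source images, and the raw pixel count lands right around the 15 terapixel figure.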
Feb 16, 2024 9 tweets 3 min read
OpenAI just dropped their Sora research paper.

As expected, the video-to-video results are flipping spectacular 🪄

A few other gems:

Another superpower unlocked is the ability to seamlessly blend individual videos together.

Note how the drone transforms into a butterfly as we gradually find ourselves underwater.
Dec 30, 2023 9 tweets 4 min read
Top Gun Maverick. For a movie with no CGI, it sure has a lot of it.

A whopping 2,400 (!!) visual effects shots in fact.

But wait, wasn't everything filmed practically? 😉

Sure was. Yet almost every jet you see on-screen is CGI.

Let's dive into this "invisible" movie magic 👇

For starters, the level of practical filming in Top Gun is cool.

Much of the principal photography was filmed "for real" - ensuring the action always felt anchored in reality.

But make no mistake - there's a ton of invisible CGI involved that you probably didn't notice. 👇
Oct 8, 2023 5 tweets 2 min read
With Gaussian Splatting you get 3D editing support! So you can select, move, and delete stuff; apply shader fx. This type of editing has been tedious to do with NeRFs and their implicit black box representations.

Case in point (1/3) by @hybridherbst:
Case in point (2/3): repurpose your point cloud shaders to make something unreal like @Ruben_Fro
Jun 1, 2023 8 tweets 4 min read
AI just took 3D modeling to a whole new level 🤯

Introducing Neuralangelo, a new AI model by NVIDIA that reconstructs mind-blowingly detailed 3D surfaces directly from 2D videos — like photogrammetry on steroids. 🧙🏻‍♂️

Keep reading to see this crazy magic for yourself 🧵

So, what the heck is this "photogrammetry" thing NVIDIA is supercharging with AI?

TL;DR photogrammetry is the art & science of measuring stuff in the real world using images and other sensors (e.g. LiDAR).

Here's a 60 second primer:
May 30, 2023 5 tweets 2 min read
🌐 Minecraft2Reality 🌍

Ever look at the blocky world of Minecraft and think, "Yeah, but what if it was real?" No? Just me then. 😌

Here's what happens when you feed Minecraft screen captures to an AI with an appetite for reality. 👇

🌍 🎮 Welcome to reality, Minecraft-style 🎮 🌍

I crammed a Minecraft screen capture into a fancy AI blender – namely ControlNet, EbSynth, and Stable Diffusion.

The result? Pure visual umami.

Imagine giving all your favorite video games an instant upgrade.
May 29, 2023 19 tweets 8 min read
Video-to-video AI models are like Snapchat filters on steroids 🔥

Capture a video once and transform it infinitely in post.

See below: Original vs. photoreal vs. cartoon-style.

Tons of stylistic range, yet plenty of room for improvement

Here's how to level up your AI videos 🧵

Watch this classic Office Space clip.

Two main areas of improvement:

1. Stylistic Consistency: characters & environment transform abruptly between keyframes

2. Temporal Consistency: facial & body performance is often lost

Let's unpack each problem and discuss solutions 👇
May 29, 2023 4 tweets 2 min read
3D games + AI agents = win 🔥

Such a wild demo by NVIDIA.

Really makes me want to upgrade this ChatGPT-powered Tech CEO Debate Simulator to work in Omniverse 😁

Topic: "Can we regulate AI successfully?" 👇

Here's Varun, who already made such a 3D simulator inside Unreal Engine, with multi-GPT agents that have personality and memory and hold topic-based convos.

This is AI Seinfeld on steroids:
May 27, 2023 12 tweets 5 min read
I guess Trump decided to take a trip to India, and it was pretty lit 😁

Midjourney (AI) rendition of celebs continues to impress 🧵

"Better than the Chicken Dance at Mar-a-Lago, folks!"
May 10, 2023 11 tweets 7 min read
🚀 Big news today with Google + Adobe joining forces!

We're talking about 3D content anchored to the real world at insane scale🌐 And of course, AI had a role to play.

I've got early access, and let's just say the physical & digital worlds are blurring 😎 Let's get into it!🧵

📽️ Remember when Times Square and Piccadilly Circus were transformed into a live @gorillaz concert with AR?

Imagine that kind of immersive experience, but created by ANYONE 🤯

That's the level of game-change we're talking about! ⬇
May 10, 2023 7 tweets 3 min read
🌳🎮 The physical and digital worlds are converging. I used AI to transform the historic Lodhi Garden in India into a Minecraft landscape 🕌🌳

🧩🍃 I created a 3D NeRF of this serene garden using GoPro video, then transformed it into the blocky Minecraft aesthetic using… twitter.com/i/web/status/1…

It's crazy how fast things move.

Here are results from 6 months ago -- a jittery mess.

Just imagine where we'll be in 6 more months.
May 9, 2023 8 tweets 8 min read
Speaking at TED was incredible. Grateful for the opportunity and the experience.

Look out for the full talk and panel on @TEDTalks in the coming weeks. Or check it out on TED Live today.

In the meantime, enjoy some photos and takeaways from an unforgettable week in Vancouver:

But first -- a 3D scan at the TED venue reskinned by AI. Because why the heck not!

Unsurprisingly, my TED Talk was titled "Blending Reality & Imagination with Artificial Intelligence" 😁
May 7, 2023 4 tweets 2 min read
🚘🌌 AI-Powered Joyride: Cyberpunk San Francisco 🌉✨

🏙️ The world is changing quickly. Brace yourself as reality and fantasy intertwine, with AI turning into lenses through which we'll see the world. 🌐🌆

⚙ Brought to life by Kaiber Video2Video (featuring ControlNet, Stable… twitter.com/i/web/status/1…

Actually, on second thought -- it's more like Solarpunk San Francisco
😎🌳🏡🌆 🌉
Apr 30, 2023 5 tweets 2 min read
🏞️ An Otherworldly Waterfall 😍
🏡 Solar-punk inspired AI video
🔮 NeRFs + ControlNet + EbSynth = Reality Bending Magic! 🪄

2/ Statue of Liberty 🗽 materializing and dematerializing ✨
Apr 5, 2023 10 tweets 6 min read
🤯 Wondering why creators like @SirWrender are losing their minds over @WonderDynamics?

Short answer: it’s a middle ground between 3D, VFX and editorial tools ⚔️

So what took 3 days across many tools — takes 3 minutes in just one tool!

🧵 Thread (0/8):

1/8 Historically, digital creation tools have lived in specialized silos, all chained together in the classical waterfall fashion.

This works quite well for producing long-form content in multi-year productions, with teams of specialized artists who do one thing very well.
Apr 3, 2023 5 tweets 3 min read
Creators rejoice — because NeRFs are finally coming to Unreal! 🎉

Easily digitize a space or place with 2D images alone, and conjure it up later with photorealistic rendition.

Combined with the real-time nature of Unreal — sky is the limit for VFX 🔥
These are full blown volumetric NeRFs. So I no longer need to composite VFX elements into source imagery *before NeRFing* to pull off effects like these glorious muzzle flashes:
Apr 1, 2023 22 tweets 10 min read
What a week for AI! Not yet scary, but a feeling is in the air. Things are heating up and people are conflicted.

Why are the brightest minds in AI asking for a 6 month pause, while others say it doesn't go far enough? 🤯

Here's why this debate deserves our attention.

🧵 Thread

A modicum of relief was bestowed upon us this past week, after a two-week period riddled with launch-after-launch of the most advanced AI capabilities the world has ever seen.

The outcome? Unprecedented AI power to the people 👇
Mar 25, 2023 14 tweets 9 min read
Midjourney v5 has pushed into photorealism, a goal which has eluded the computer graphics industry for decades (!) 🤯

Insane progression, and all that by 11 people with a shared dream.

🧵 Let's explore what these breakthroughs in Generative AI mean for 3D & VFX as we know it...

First off, Midjourney v5 is far more photorealistic out-of-the-box, whereas its predecessor has a more painterly, stylized bent.

Here's a thorough comparison of v5 vs v4 in case you want to go deeper. But let's keep going...
Mar 23, 2023 7 tweets 4 min read
If you thought reskinning 2D videos was fun, how about reskinning 3D captures of the world?

That's exactly what you get when you combine NeRFs with InstructPix2Pix in this new paper by @ayaanzhaque et al.

Mini-thread 🧵

InstructPix2Pix is applied, in an iterative fashion, to the input 2D views used for training the NeRF.

Notice below that the edits are gradually becoming more consistent over time.

I'm impressed with how well it works!