Professor for Visual Computing & Artificial Intelligence @TU_Muenchen
Co-Founder @synthesiaIO
May 30 • 6 tweets • 2 min read
(1/6)
NeRF vs 3D Gaussians vs 3D Meshes
What's better? It's actually simple: the recent success in photo-realistic 3D reconstruction relies on efficient differentiable volumetric rendering. Rays cast from the aligned (posed) images intersect in the scene, and we solve for the surface via ray integrals.
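For reference (my addition, not part of the thread), this is the standard volume rendering integral such methods optimize: the color along a camera ray r(t) = o + t·d accumulates density σ and color c, weighted by the transmittance T:

```latex
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t),\mathbf{d})\,dt,
\qquad
T(t) = \exp\!\Big(-\!\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\Big)
```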
(2/6)
Why does it work? Volumetric rendering provides gradients for the entire volume - even when we don't yet know where the surface is, which is exactly the situation at the start of the reconstruction.
Only during optimization does the 3D representation converge towards the actual surface.
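A minimal sketch of the discretized ray integral in code (my own PyTorch illustration, assuming densities and colors have already been sampled along a single ray; not code from the thread):

```python
import torch

def render_ray(sigma, rgb, deltas):
    """Discretized volume rendering (alpha compositing) along one ray.

    sigma:  (N,)   non-negative densities at N samples along the ray
    rgb:    (N, 3) colors at those samples
    deltas: (N,)   distances between adjacent samples

    Every sample receives a gradient through `weights`, which is why the
    whole volume gets optimized even before the surface location is known.
    """
    alpha = 1.0 - torch.exp(-sigma * deltas)           # per-sample opacity
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=0)  # accumulated transparency
    trans = torch.cat([torch.ones(1), trans[:-1]])     # T_i = prod_{j<i} (1 - alpha_j)
    weights = alpha * trans                            # contribution of each sample
    return (weights[:, None] * rgb).sum(dim=0)         # composited ray color
```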
Feb 23 • 11 tweets • 4 min read
(1/n)
𝐒𝐨𝐫𝐚 generates stunning videos and is a game changer!
- But how does it work technically?
- What’s different from existing video diffusion?
- How did we get there?
Lots of speculation - here's my take from a technical perspective! 🧵
(2/n)
First, we need to look at image diffusion models.
Starting from an image of Gaussian noise, the model gradually denoises the image in an iterative process.
The model can be conditioned on text, and training is performed on datasets such as LAION with several billion text-image pairs.
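To make the iterative denoising concrete, here is a toy DDPM-style sampling loop (my own sketch; the model signature model(x, t, text_emb) -> predicted noise is an assumption, and real samplers add classifier-free guidance, better schedules, etc.):

```python
import torch

@torch.no_grad()
def ddpm_sample(model, text_emb, T=1000, shape=(1, 3, 64, 64)):
    """Ancestral DDPM sampling: start from pure Gaussian noise and
    iteratively denoise, conditioned on a text embedding."""
    betas = torch.linspace(1e-4, 0.02, T)        # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(shape)                       # step T: pure Gaussian noise
    for t in reversed(range(T)):
        eps = model(x, torch.full((shape[0],), t), text_emb)   # predict the noise
        x = (x - (1 - alphas[t]) / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:                                # re-inject noise except at the final step
            x = x + betas[t].sqrt() * torch.randn_like(x)
    return x                                     # denoised image
```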
Nov 1, 2022 • 6 tweets • 2 min read
(1/n)
Key to successful projects in Deep Learning are fast turnaround times of experiments.
For large models, training often takes several days or even weeks, and it can take countless runs to find hyperparameters that yield good results.
How to get things fast? A thread🧵
(2/n)
First, check timings of a single iteration!
Is it reasonably fast given the model complexity? Are we compute bound (backprop) or is the limitation in the data loader?
Important: understand how timings work on the GPU - CUDA calls are asynchronous and need device syncs!
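A small timing sketch along these lines (my own illustration, assuming PyTorch and a step_fn that runs one forward/backward iteration):

```python
import time
import torch

def time_step(step_fn, iters=50, warmup=10):
    """Measure the average time of one training step on the GPU.
    CUDA kernels launch asynchronously, so synchronize before reading the clock."""
    for _ in range(warmup):          # warm-up: allocator, cuDNN autotuning, etc.
        step_fn()
    torch.cuda.synchronize()         # flush all queued kernels
    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    torch.cuda.synchronize()         # make sure all GPU work has finished
    return (time.perf_counter() - start) / iters
```

If this number barely changes when you shrink the model, the bottleneck is likely the data loader rather than compute.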
May 18, 2022 • 9 tweets • 2 min read
(1/n)
We often hear that AI & machine learning produce great results but we don't understanding why. Specifically, many consider neural networks to be black boxes that no one understands.
However, I don't think that's true; in fact, research has quite some insights. A thread 🧵
(2/n)
First, what is machine learning? Modern ML fits parametric models (e.g., neural nets) to a data distribution. The model has to be large enough to capture the distribution's variety, but it also has to be regularized to avoid simply memorizing training samples - only then will the model generalize.
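As a toy illustration of "fitting a parametric model to a data distribution" (my own example, not from the thread): a small MLP fit to noisy samples of sin(x), with weight decay as a simple regularizer.

```python
import torch
import torch.nn as nn

x = torch.linspace(-3, 3, 256).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)      # noisy samples of the target distribution

# Small parametric model; weight_decay adds L2 regularization against memorization
model = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2, weight_decay=1e-4)

for _ in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)    # fit the model to the data
    loss.backward()
    opt.step()
```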
Sep 23, 2021 • 15 tweets • 4 min read
(1/n)
How to start a deep learning project?
We use a remarkably streamlined step-by-step process to set up deep learning projects. At the same time, people who are new to deep learning tend to make the same (avoidable) mistakes.
Check out the thread below! 🧵
(2/n)
General advice: start simple -> use a small architecture (less than 1M params). In vision, ENet or a crippled ResNet-18 (only the first blocks) is a good choice. Common mistake: training a model with 100M+ params for 3 weeks, only to notice that the data loader is broken.
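One way to build such a "crippled" ResNet-18 (a sketch of the idea using torchvision; keeping only the stem and the first residual stage gives roughly 0.2M params - the 10-class head is just a placeholder):

```python
import torch.nn as nn
from torchvision.models import resnet18

backbone = resnet18(weights=None)                  # untrained ResNet-18
model = nn.Sequential(
    backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool,
    backbone.layer1,                               # keep only the first residual stage
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 10),                             # layer1 outputs 64 channels
)
print(sum(p.numel() for p in model.parameters()))  # sanity-check the parameter count
```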
Mar 2, 2021 • 10 tweets • 2 min read
(1/n)
In the past 2.5 years, I have received about 1,000 PhD applications. I wanted to share some thoughts that might help you get into the right program. My experience is from a European perspective but should apply elsewhere.
Here are the lessons learned: 👇
(2/n)
Template applications gain little attention; e.g. "Dear respect Professor <𝒄𝒐𝒑𝒚 𝒑𝒂𝒔𝒕𝒆 𝒏𝒂𝒎𝒆 𝒘𝒊𝒕𝒉 𝒅𝒊𝒇𝒇𝒆𝒓𝒆𝒏𝒕 𝒇𝒐𝒏𝒕>" is not a great start. Pro tip: ctrl+shift+V pastes text without formatting. Also avoid generic phrasing like “I want to do AI”...