Nick Sharp
3D geometry researcher: graphics, vision, 3D ML, etc | Senior Research Scientist @NVIDIA | running, hockey, baking, & cheesy sci fi | opinions my own | he/him

Feb 12, 2022, 16 tweets

📢Hot off the presses: we present **DiffusionNet** for simple and scalable deep learning on surfaces.

The networks generalize by construction across different samplings, resolutions, and even representations. Spatial support is automatically optimized as a parameter! 🧵👇 (1/N)

I've been excited about this one for a long time; now we can finally share widely! This is work with the excellent Souhaib Attaiki, @keenanisalive, and Maks Ovsjanikov. Will appear in ACM Transactions on Graphics and at SIGGRAPH 2022.

Here's a thread of the key ideas. (2/N)

The big idea is to build networks which propagate information not by passing messages between neighboring vertices/faces/points, but by smoothly diffusing features along the surface. (3/N)

We generally expect simulation and geometry algorithms to produce roughly the same result on different meshes of the same surface, yet existing graph-like neural nets eagerly overfit to mesh connectivity. DiffusionNet addresses this with physical PDE-based diffusion. (4/N)

Diffusion is like smooth mean pooling: diffusing for a short time yields local communication, while diffusing for a long time gives the global mean. Rather than manually tuning spatial support, we can directly optimize the diffusion time as a continuous parameter. (5/N)
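
To see the "smooth mean pooling" behavior concretely, here is a minimal NumPy sketch. It is not the DiffusionNet code. It assumes a cycle graph as a stand-in for a surface, a plain graph Laplacian, and a single backward-Euler step as the diffusion scheme. The names `cycle_laplacian` and `diffuse_implicit` are illustrative only.

```python
import numpy as np

def cycle_laplacian(n):
    """Positive semidefinite graph Laplacian of an n-vertex cycle.
    Assumption: used here as a toy stand-in for a surface Laplacian."""
    L = 2.0 * np.eye(n)
    idx = np.arange(n)
    L[idx, (idx + 1) % n] = -1.0
    L[idx, (idx - 1) % n] = -1.0
    return L

def diffuse_implicit(L, u, t):
    """One backward-Euler heat step: solve (I + t L) x = u."""
    return np.linalg.solve(np.eye(L.shape[0]) + t * L, u)

n = 64
L = cycle_laplacian(n)
u = np.zeros(n)
u[0] = 1.0  # a single spike of "feature" at vertex 0

short = diffuse_implicit(L, u, 0.5)   # short time: stays near vertex 0
long_ = diffuse_implicit(L, u, 1e6)   # long time: approaches the global mean

print("short time, vertices 0..3:", short[:4])
print("short time, far vertex 32:", short[32])
print("long time, min/max:", long_.min(), long_.max(), "mean of input:", u.mean())
```

The time `t` is an ordinary continuous number, which is what makes it possible to treat it as a trainable parameter in an autodiff framework instead of hand-picking a neighborhood size.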

There are many numerical schemes for diffusion; we mainly use a precomputed spectral decomposition, which makes diffusion easy to evaluate and autodiff via dense GPU arithmetic. This scales to >100k vertex inputs---surface learning is moving beyond small ~5k vertex meshes! (6/N)
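
A rough sketch of what spectral diffusion can look like, in NumPy rather than the released PyTorch code. Assumptions: a toy cycle-graph Laplacian, an identity mass matrix (on a real mesh one would typically also weight by vertex areas), and one diffusion time per feature channel. The helper names and the choice of `k` are hypothetical.

```python
import numpy as np

def cycle_laplacian(n):
    """Toy positive semidefinite Laplacian (assumption: stand-in for a mesh Laplacian)."""
    L = 2.0 * np.eye(n)
    idx = np.arange(n)
    L[idx, (idx + 1) % n] = -1.0
    L[idx, (idx - 1) % n] = -1.0
    return L

def precompute_spectrum(L, k):
    """Done once per shape: the k smallest eigenpairs of the Laplacian."""
    evals, evecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    return evals[:k], evecs[:, :k]

def spectral_diffuse(evals, evecs, feats, times):
    """Diffuse each channel for its own time using only dense arithmetic.
    feats: (V, C) features, times: (C,) nonnegative diffusion times."""
    coeffs = evecs.T @ feats                            # project onto the basis, (k, C)
    decay = np.exp(-evals[:, None] * times[None, :])    # heat decay per mode and channel
    return evecs @ (decay * coeffs)                     # back to per-vertex values, (V, C)

n = 64
L = cycle_laplacian(n)
# k = n keeps the full basis so the result is exact; a smaller k acts as a low-pass approximation.
evals, evecs = precompute_spectrum(L, k=n)

rng = np.random.default_rng(0)
feats = rng.normal(size=(n, 2))
times = np.array([0.0, 1e4])  # channel 0: no diffusion, channel 1: very long diffusion

out = spectral_diffuse(evals, evecs, feats, times)
print("channel 0 unchanged:", np.allclose(out[:, 0], feats[:, 0]))
print("channel 1 spread:", out[:, 1].max() - out[:, 1].min())
```

After the one-time eigendecomposition, each diffusion is two matrix products and an elementwise exponential, all differentiable with respect to `times`.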

That's mainly it for our networks! We alternate applying a pointwise MLP, diffusing feature channels, and constructing some handy gradient features (see below). Avoiding potentially-tricky operations like pooling hierarchies helps keep the method simple, fast, and robust. (7/N)

I was surprised that we can do away with traditional convolutions and use diffusion instead. Turns out, it's no coincidence---we prove that the function space of diffusion + an MLP contains all radially-symmetric convolutions (in the smooth setting, and in the wide limit). (8/N)

But, like much recent work, we want to incorporate directional info beyond radially symmetric filters. Our approach is to take the spatial gradient of each feature channel at each point, then use (learned) inner products of those gradients as additional scalar features. (9/N)

These spatial gradient features capture directional context, and are invariant to the choice of tangent basis, as expected. Interestingly, they also capture the _orientation_, allowing us to disambiguate bilateral symmetry even in a purely intrinsic formulation. (10/N)
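
One way to sketch these gradient features, under stated assumptions: each tangent-plane gradient is stored as a complex number in some local basis, and the learned inner products come from a complex matrix `A` followed by taking the real part and a `tanh`. The function name, shapes, and random "learned" weights below are illustrative, not taken from the released code.

```python
import numpy as np

def gradient_features(grads, A):
    """grads: (V, C) complex, one 2D tangent gradient per point and channel.
    A: (C, C) complex matrix standing in for learned weights.
    Returns (V, C) real scalar features."""
    return np.tanh(np.real(np.conj(grads) * (grads @ A)))

rng = np.random.default_rng(0)
V, C = 5, 3
grads = rng.normal(size=(V, C)) + 1j * rng.normal(size=(V, C))
A_real = rng.normal(size=(C, C)).astype(complex)
A_cplx = A_real + 1j * rng.normal(size=(C, C))

# Changing the tangent basis at a point rotates all of its gradients by one angle.
theta = rng.uniform(0.0, 2.0 * np.pi, size=(V, 1))
rotated = grads * np.exp(1j * theta)

# Flipping the surface orientation mirrors each tangent plane (complex conjugation).
mirrored = np.conj(grads)

base = gradient_features(grads, A_cplx)
print("basis rotation changes features:",
      not np.allclose(gradient_features(rotated, A_cplx), base))
print("mirroring changes features (complex A):",
      not np.allclose(gradient_features(mirrored, A_cplx), base))
print("mirroring changes features (real A):",
      not np.allclose(gradient_features(mirrored, A_real),
                      gradient_features(grads, A_real)))
```

In this sketch the per-point rotation cancels between the conjugated and unconjugated factors, so the features do not depend on the basis. The imaginary part of `A` is what makes them sensitive to orientation: with a purely real `A`, mirrored inputs give identical features.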

DiffusionNet leverages existing work on robust geometry/PDEs, rather than reinventing it. E.g., diffusion can be evaluated via the intrinsic Delaunay Laplacian, which is accurate even on poor-quality meshes. But diffusion is so stable that this usually isn't necessary. (11/N)

Results-wise, it's "state-of-the-art" for classification, segmentation, and correspondence.

But what I think is more important is that DiffusionNet works robustly & efficiently out-of-the-box on a wide range of inputs, a key property for reusable computational tools. (12/N)

One more great property: since these networks are defined via Laplacians and gradients, we can apply them to any representation where these ops are defined, and even use the same network weights.

Train on a mesh and test on a point cloud, or mix them in a dataset! (13/N)

Limitations? Diffusion along surfaces is sensitive to topology: there is no communication at all between disconnected components. One fix would be to combine spatial diffusion w/ some nonlocal operator at the later layers of the network. Give it a try! (14/N)
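
A toy illustration of this limitation, assuming two disjoint cycle graphs as the "disconnected components" and a heat kernel built from a full eigendecomposition. All names are hypothetical.

```python
import numpy as np

def cycle_laplacian(n):
    """Toy positive semidefinite Laplacian of an n-vertex cycle."""
    L = 2.0 * np.eye(n)
    idx = np.arange(n)
    L[idx, (idx + 1) % n] = -1.0
    L[idx, (idx - 1) % n] = -1.0
    return L

def heat_diffuse(L, u, t):
    """Apply exp(-t L) to u via a full eigendecomposition."""
    evals, evecs = np.linalg.eigh(L)
    return evecs @ (np.exp(-evals * t) * (evecs.T @ u))

n = 16
Lc = cycle_laplacian(n)
Z = np.zeros((n, n))
L = np.block([[Lc, Z], [Z, Lc]])  # two components with no edges between them

u = np.zeros(2 * n)
u[0] = 1.0  # the signal lives only on the first component

out = heat_diffuse(L, u, t=1e3)  # diffuse for a very long time
print("first component: ", out[:n].min(), out[:n].max())
print("second component:", np.abs(out[n:]).max())
```

Even after a very long diffusion, the first component has averaged out internally while the second has received nothing. A global average over all points would be one hypothetical example of the kind of nonlocal operator that could bridge the gap.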

At a high level, I like this work because it's an example of how deep learning can benefit by building atop the foundational tools from geometry and visual computing, rather than reinventing them. What else can we all unlock with such thinking? (15/N)

Webpage: nmwsharp.com/research/diffu…
PDF (7MB): nmwsharp.com/media/papers/d…
Code (pytorch), data, & pretrained models: github.com/nmwsharp/diffu…
(N/N)
