Huge appreciation for all reviewers, especially R5, for making our work better.
My goal in this 🧵: explain our work in the simplest terms I can. Don't worry if you get lost, it's admittedly dense :)
Disentanglement in your generative model means each dimension in its latent space changes a corresponding feature in its data space, e.g. adjusting just 1️⃣ dim can make the output "sunnier" ☁️→🌥→⛅️→🌤→☀️ Contrast w/ this entangled mess ☁️→🌥→🌩→🌪→☀️
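Here's a toy sketch of that idea in code (purely illustrative; the decoder below is a stand-in, not a real model): sweep one latent dimension, freeze the rest, and watch a single feature change.

```python
import numpy as np

def decode(z):
    """Stand-in for a trained generator/decoder (illustrative only)."""
    return np.tanh(z)  # pretend this returns an image

z = np.zeros(10)       # a fixed base latent code with 10 dimensions
sunny_dim = 3          # the dimension we hope controls "sunniness"

# Sweep only one latent dimension while holding every other one fixed.
# In a disentangled model, only the corresponding feature (the weather) changes.
for value in np.linspace(-2.0, 2.0, 5):   # ☁️ → 🌥 → ⛅️ → 🌤 → ☀️
    z_step = z.copy()
    z_step[sunny_dim] = value
    image = decode(z_step)
```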
Disentanglement is a great thing for your generative model b/c it means your model has learned a very compact representation of the data. And that's arguably the best representation it can get — nothing extra. 👍
How do you know your model is disentangled? Well, evaluating it isn't straightforward. It typically depends on what type of generative model you have: GAN, VAE, etc. An evaluation metric that tells you whether your GAN is better than your VAE would be useful though... 👇🏿
Well, hello there! 👋🏻 Our method does just that. Whereas prior methods rely on an external model (an encoder or a classifier) or a specific dataset, ours looks at an intrinsic property of your GAN, VAE, or what have you. That intrinsic property is the *manifold topology* 👇🏽
Manifold? Yeah, so your generative model learns a data manifold (think: IKEA lamp) that tries to approximate the real data manifold. The topology of the manifold is important and refers to its global structure, like how many holes it has, vs. geometry, which describes local structure
One topological invariant is homology, which roughly counts the holes in the manifold (seriously, these IKEA lamps are starting to make sense). An easy & tractable way to estimate homology is to start by building simplicial complexes: think graph-like structures 👇🏾
These graph-like simplicial complexes start with generated data samples as nodes and no edges. How far apart the samples are matters: imagine balls around each node with a steadily increasing radius. At each radius, if two balls intersect, you draw an edge btw those nodes.
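A toy sketch of that edge rule (just the idea, not our actual pipeline): connect two nodes whenever their balls overlap at the current radius.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
samples = rng.normal(size=(50, 2))      # stand-ins for generated data points
dists = squareform(pdist(samples))      # pairwise distances between all nodes

def edges_at_radius(dists, radius):
    """Connect nodes i and j whenever their balls of the given radius
    overlap, i.e. dist(i, j) <= 2 * radius."""
    n = dists.shape[0]
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if dists[i, j] <= 2 * radius]

# As the radius grows, more edges (and higher-order simplices) appear;
# tracking when holes appear and disappear across radii is persistent homology.
for r in (0.1, 0.3, 0.6):
    print(r, len(edges_at_radius(dists, r)))
```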
See Geometry Score by Valentin Khrulkov & @oseledetsivan. This paper was a huge inspo for us and intro'd persistent homology for evaluating generative models. We brought it to disentanglement, speaking of which 👇🏼
So it turns out persistent homology can get the gist of how holey (holes, not 😇) the data manifold of your model is. For disentanglement, you want to know how each latent dimension impacts the data manifold, i.e. to what extent ☁️→🌥→⛅️→🌤→☀️ happens.
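For intuition, here's a tiny illustration, assuming the ripser package (not necessarily what we use in the paper): a noisy circle has exactly one hole, and its persistence diagram shows one long-lived loop.

```python
import numpy as np
from ripser import ripser   # assumed installed: pip install ripser

rng = np.random.default_rng(0)
# A noisy circle: a data manifold with exactly one 1-dimensional hole.
theta = rng.uniform(0, 2 * np.pi, size=300)
circle = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(300, 2))

# Persistence diagrams up to H1; dgms[1] holds (birth, death) radii of loops.
dgms = ripser(circle, maxdim=1)['dgms']
lifetimes = dgms[1][:, 1] - dgms[1][:, 0]
# One lifetime dwarfs the rest: that long-lived loop is the circle's single hole.
print(np.sort(lifetimes)[-3:])
```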
If your model is truly disentangled, and you hold a latent dimension constant ("condition it") at a certain value and then also at another value, you'd expect the 2️⃣ resulting data manifolds w/r/t all the other dimensions to look the same. We call these conditional submanifolds.
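A rough sketch of how you'd sample such conditional submanifolds (the decoder here is a placeholder and the names are made up): pin one latent dimension at a value, sample the rest, and decode.

```python
import numpy as np

def decode(z):
    """Placeholder for your GAN/VAE decoder (illustrative only)."""
    return np.tanh(z @ np.ones((z.shape[-1], 2)))   # fake 2-D "data"

rng = np.random.default_rng(0)
latent_dim, n_samples, cond_dim = 10, 200, 3

def conditional_samples(cond_value):
    """Sample the latent prior but pin dimension `cond_dim` to `cond_value`;
    the decoded points approximate one conditional submanifold."""
    z = rng.normal(size=(n_samples, latent_dim))
    z[:, cond_dim] = cond_value
    return decode(z)

cloud_a = conditional_samples(-1.5)   # submanifold conditioned at one value
cloud_b = conditional_samples(+1.5)   # ... and at another value
# If the model is disentangled along cond_dim, these two point clouds
# should have (near-)identical topology.
```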
We use persistent homology on these conditional submanifolds to see how similar they are. This similarity is used to evaluate disentanglement. That's the main thrust of our paper. What's 🆒 is that this method is unsupervised! We also include a supervised variant.
Check out the paper for the additional steps we take with persistent homology. There, we also bring Wasserstein distance/barycenters into the calculation.
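To make the comparison step concrete, here's one hedged way to do it with off-the-shelf tools (assuming the ripser and persim packages; our paper's exact computation may differ): compute persistence diagrams for two point clouds and take the Wasserstein distance between them.

```python
import numpy as np
from ripser import ripser          # assumed: pip install ripser persim
from persim import wasserstein     # Wasserstein distance between diagrams

rng = np.random.default_rng(0)
# Two toy point clouds standing in for two conditional submanifolds.
theta = rng.uniform(0, 2 * np.pi, size=(2, 200))
cloud_a = np.c_[np.cos(theta[0]), np.sin(theta[0])] + 0.05 * rng.normal(size=(200, 2))
cloud_b = np.c_[np.cos(theta[1]), np.sin(theta[1])] + 0.05 * rng.normal(size=(200, 2))

# Persistence diagrams of the loops (H1) in each cloud.
dgm_a = ripser(cloud_a, maxdim=1)['dgms'][1]
dgm_b = ripser(cloud_b, maxdim=1)['dgms'][1]

# Small distance ≈ similar topology, which is what a disentangled
# latent dimension should give you across its conditioned values.
print(wasserstein(dgm_a, dgm_b))
```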
We include a sizable limitations section b/c while our metric might be better than prior ones, it sure isn't perfect 🙈
Experiments on 9 models—5 VAEs and 4 GANs including StyleGAN—and 3 datasets for both unsupervised and supervised variants of our evaluation metric. Our results generally agreed with other disentanglement metrics (MIG, PPL, etc.) on experiments that the others were designed for.
We are actively 🧹 code here: github.com/stanfordmlgrou…. In case you want to dive one layer deeper, our appendix is quite large.
Hope this helps advance progress in generative models, by measuring how we're doing re: disentanglement across {models, datasets}. ❤️
I'd also like to acknowledge @_smileyball for helpful & fun early discussions. And for his fantastic smiley ball drawings that just keep you smiling...
Lasting collaborations can come from transient places.
One of my collaborations came out of a train ride 🚊 where I traded notes with a mathematician, en route to Stockholm. Followed by making these corgis happen.
Story 🧵👇🏿
On the 4 hr 🚊 ride, I talked about generative models and neural networks. He talked about fractals and Mobius transformations — and even how all this ties into making better compression socks 🧦.
Hours 1-2: Just a pen and a few loose pages.
Mobius transformations generalize affine transformations (see cute dogs). They are found naturally in biology. Maybe... we could use these for data augmentation, without much tuning across a ton of different augmentations.
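For a flavor of what that augmentation could look like (a minimal sketch with arbitrary coefficients and nearest-neighbor resampling, not the implementation from our work): treat pixel coordinates as complex numbers and push them through f(z) = (az + b)/(cz + d). With c = 0 it reduces to an affine map, which is the "generalizes affine" part.

```python
import numpy as np

def mobius_augment(img, a=1 + 0.1j, b=0.1, c=0.05j, d=1):
    """Warp an image with the Mobius map f(z) = (a*z + b) / (c*z + d).
    Coefficients here are arbitrary examples; with c = 0 this reduces to
    an affine map z -> (a/d) z + b/d, hence 'generalizes affine'."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Treat normalized pixel coordinates as complex numbers.
    z = (xs / w - 0.5) + 1j * (ys / h - 0.5)
    # Inverse map: for each output pixel, find the source pixel it came from.
    z_src = (d * z - b) / (-c * z + a)
    src_x = np.clip(((z_src.real + 0.5) * w).astype(int), 0, w - 1)
    src_y = np.clip(((z_src.imag + 0.5) * h).astype(int), 0, h - 1)
    return img[src_y, src_x]           # nearest-neighbor resampling

corgi = np.random.rand(64, 64, 3)      # stand-in for a corgi photo
augmented = mobius_augment(corgi)
```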