David Pfau

Excited to share our latest work: the Geometric Manifold Component Estimator, or GEOMANCER, a nonparametric algorithm for symmetry-based disentangling! 1/n

Paper: arxiv.org/abs/2006.12982
Code: github.com/deepmind/deepm…

"Disentangling" is a somewhat nebulous term in ML, but it is broadly about building models that can separate out different latent factors of variation - for instance, in vision, separating translation, rotation, and changes in lighting or color that leave objects invariant. 2/n

There are many definitions of disentangling - this paper is focused on the "symmetry-based" definition, which formalizes different possible invariances in the world as a product of continuous transformations, also known as Lie groups 3/n

We formalized symmetry-based disentangling in a paper a few years ago - in short, a representation is disentangled if it matches the product structure of the group transformations that act on objects in the world: arxiv.org/abs/1812.02230 4/n

While this helped clarify terms used in the field, it did not provide any recipe for how to *learn* this product structure for Lie groups. That's where GEOMANCER comes in. 5/n

GEOMANCER was inspired by an observation about analogical reasoning. When working with vector representations, you can make analogies just by adding vectors together. For instance, hand + leg - arm = foot. This got popular in NLP in the mid-'10s with word2vec and GloVe. 6/n
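To make the vector-analogy idea concrete, here is a minimal sketch. The 2D embeddings are hypothetical toy values chosen by hand (they are not taken from word2vec or GloVe), and `complete_analogy` is an illustrative helper, not part of any library.

```python
import numpy as np

# Hypothetical toy embeddings (hand-picked for illustration):
# axis 0 ~ "lower limb", axis 1 ~ "extremity".
embeddings = {
    "arm": np.array([0.0, 0.0]),
    "leg": np.array([1.0, 0.0]),
    "hand": np.array([0.0, 1.0]),
    "foot": np.array([1.0, 1.0]),
    "head": np.array([-1.0, 0.5]),
}

def complete_analogy(a, b, c):
    """Return the word d with a : b :: c : d, using d ~ b + c - a."""
    target = embeddings[b] + embeddings[c] - embeddings[a]
    # Exclude the query words, then pick the nearest remaining embedding.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return min(candidates, key=lambda w: np.linalg.norm(embeddings[w] - target))

# arm : hand :: leg : ?   (hand + leg - arm)
print(complete_analogy("arm", "hand", "leg"))
```

Because vector addition commutes, it does not matter whether you apply the "arm to hand" offset or the "arm to leg" offset first; both routes land on the same point.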

But this model of analogies breaks down when you move from vector representations to Lie groups. Suddenly things don't commute any more! This is especially a problem when dealing with 3D rotations, which are ubiquitous in computer vision. 7/n
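A quick numerical check of the non-commutativity, as a sketch: the helper names `rot_x` and `rot_z` and the 90-degree angle are just choices for illustration.

```python
import numpy as np

def rot_x(theta):
    """Rotation matrix about the x-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_z(theta):
    """Rotation matrix about the z-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

Rx, Rz = rot_x(np.pi / 2), rot_z(np.pi / 2)

# Translations (vector offsets) commute: order of application is irrelevant.
a, b = np.array([1.0, 0.0, 0.0]), np.array([0.0, 2.0, 0.0])
print(np.allclose(a + b, b + a))

# 3D rotations generally do not: the two orders give different matrices.
print(np.allclose(Rx @ Rz, Rz @ Rx))
```

The first check holds and the second fails, which is why "add the offset" analogies have no unique answer once the transformations are rotations.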

Some image analogies we can complete without a problem. For instance, models like StyleGAN and StarGAN can transfer attributes from one type of face to another quite well. (Pictured: StarGAN) 8/n

Other analogical reasoning problems are a little more...ambiguous. 9/n

The idea behind GEOMANCER is to *use this ambiguity as a learning signal itself.* Directions that are disentangled from one another will be those such that analogies made in those directions can be completed unambiguously, even over long distances. 10/n

Formalizing this idea mathematically leads to a branch of differential geometry known as holonomy theory, which deals with how much vectors deviate from their behavior in flat spaces when moved around a curved manifold. 11/n
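A standard textbook illustration of holonomy (not the GEOMANCER algorithm itself) is to parallel transport a tangent vector around a closed triangle on the unit sphere. The sketch below assumes transport along great-circle arcs, where parallel transport is the rotation that carries one endpoint to the other; `rotation_about` and `transport` are illustrative helper names.

```python
import numpy as np

def rotation_about(axis, angle):
    """Rotation matrix about `axis` by `angle`, via Rodrigues' formula."""
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

def transport(v, p, q):
    """Parallel transport tangent vector v from p to q along the great circle."""
    axis = np.cross(p, q)
    angle = np.arccos(np.clip(np.dot(p, q), -1.0, 1.0))
    return rotation_about(axis, angle) @ v

x, y, z = np.eye(3)  # three points on the unit sphere, one per axis

v0 = y.copy()        # tangent vector at x, pointing toward y
v = transport(v0, x, y)
v = transport(v, y, z)
v = transport(v, z, x)  # back at the starting point x

# Angle between the starting vector and the transported one.
holonomy_angle = np.arccos(np.clip(np.dot(v0, v), -1.0, 1.0))
print(holonomy_angle, np.pi / 2)
```

On a flat plane the vector would come back unchanged. Here it comes back rotated by a quarter turn, which matches the area of the enclosed octant of the sphere (pi/2).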

Working through the math, we arrive at an algorithm based around the idea of subspaces undergoing random walk diffusion on a data manifold. We have an explanation of what this means in the paper, which the margin of this tweet is too narrow to contain ;) 12/n

On synthetic manifolds, we are able to automatically discover the correct number of submanifolds, their dimension, and (up to sampling noise) learn the disentangled directions almost exactly. This works on the product of as many as 5 manifolds, far more than other methods. 13/n

But, our method assumes that the data is already in a space where distances are correct and disentangled directions are at right angles. That usually isn't the case for raw data - so the problem of symmetry-based disentangling is only half-solved! 14/n

Because we start from the symmetry-based definition and work backwards from first principles, we believe that GEOMANCER is a promising first step in a research direction that will lead to more general and robust disentangling algorithms. 15/n

Huge thanks to my coauthors Irina Higgins, Alex Botev and Seb Racaniere (none of whom are on Twitter?)! And thanks to Chris Burgess and the team behind MONet (arxiv.org/abs/1901.11390) for sharing the gif in slide 2 of this thread. /fin

OK, you can find Seb at @sracaniere. And Chris is @cpburgess_. Anyone else I'm missing?
