It’s hard to overstate how happy I am to finally see this come together, after years of careful progress towards our aim -- demonstrating that AI can be the mathematician’s 'pocket calculator of the 21st century'.
I hope you’ll enjoy reading it as much as I enjoyed working on it!
I led the GNN modelling on the representation theory side, working towards settling the combinatorial invariance conjecture, a long-standing open problem in the area.
My work earned me the co-credit of 'discovering math results', an honour I never expected to receive.
We showed that combining algorithmically-inspired GNNs and rudimentary explainability techniques enables mathematicians to make meaningful progress on such challenging tasks.
AI is well-positioned to help here, as meaningful patterns tend to emerge only once the inputs get too unwieldy in size to inspect by hand.
To be clear: AI did _not_ discover the specific theorems and conjectures we expose here.
Mathematician experts did, _in synergy_ with predictions and interpretations derived from carefully-posed machine learning tasks. And this synergy is what makes the method so exciting, imho.
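To make "rudimentary explainability" a little more concrete, here is a minimal sketch of gradient saliency in PyTorch, one of the simpler techniques in this spirit. This is not the exact pipeline from our paper; `model`, `inputs` and the choice of features are placeholders of mine.

```python
import torch

def feature_saliency(model, inputs):
    """Mean |d output / d input| per input feature, over a batch of examples."""
    inputs = inputs.clone().requires_grad_(True)   # (num_examples, num_features)
    outputs = model(inputs).sum()                  # reduce to a scalar to differentiate
    grads, = torch.autograd.grad(outputs, inputs)  # same shape as inputs
    return grads.abs().mean(dim=0)                 # (num_features,)
```

Features with consistently high saliency become candidates for the mathematician to inspect, relate, and eventually turn into a conjecture.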
I’d like to deeply thank our collaborators: András Juhász, Marc Lackenby and Geordie Williamson. They successfully derived these results, put faith in our effort when we had no clear way to justify its promise, and remained a constant source of inspiration and encouragement.
"Let's say I have two spare days and want to really understand GNNs. What should I do?"
My answers led me to revisit my old 'hints for GNN resources' in light of the new material I've (co)produced. See the thread for a summary!
I'd say it is good to start with something a bit more theoretical, before diving into code. Specifically, I've been recommending my @Cambridge_CL talk on Theoretical GNN Foundations:
Why do I recommend this talk, specifically?
It is good to (a) have a rule of thumb for categorising the architectures you encounter, as GNNs evolve at an outrageous pace; (b) have a feel for the connections across the different fields that propose GNNs, as each field (e.g. signal processing, NLP...) tends to use its own notation.
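As a concrete anchor for point (a): the talk splits GNN layers into convolutional, attentional and message-passing flavours. Below is a minimal numpy sketch of one layer of the most general (message-passing) flavour; the tiny `mlp` and the dense loops are placeholders of mine, chosen for readability over efficiency.

```python
import numpy as np

def mlp(x, W, b):
    """Tiny placeholder for a learnable function: one linear layer + ReLU."""
    return np.maximum(x @ W + b, 0.0)

def message_passing_layer(X, A, params):
    """One 'message-passing flavour' GNN layer on dense inputs.
    X: (n, d) node features; A: (n, n) adjacency (A[u, v] = 1 iff v sends to u)."""
    n, d = X.shape
    H = np.zeros((n, d))
    for u in range(n):
        # compute a vector message from each neighbour, using both endpoints
        msgs = [mlp(np.concatenate([X[u], X[v]]), *params['msg'])
                for v in range(n) if A[u, v]]
        agg = np.sum(msgs, axis=0) if msgs else np.zeros(d)
        # update the receiver from its own features and the aggregated messages
        H[u] = mlp(np.concatenate([X[u], agg]), *params['upd'])
    return H
```

The convolutional and attentional flavours then arise as special cases, where the message reduces to a fixed or a learned scalar coefficient times the sender's features.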
For large-scale transductive node classification (MAG240M), we found it beneficial to treat subsampled patches bidirectionally, and go deeper than their diameter. Further, self-supervised learning becomes important at this scale. BGRL allowed training 10x longer w/o overfitting.
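For context on the "bidirectional" point: MAG240M's citation edges are directed, so on a sampled patch we also let messages flow against edge direction. A hypothetical numpy sketch of what that means in practice; the function and variable names are mine, not from our pipeline.

```python
import numpy as np

def make_bidirectional(senders, receivers):
    """Add reverse edges to a sampled patch's directed edge list, so messages
    can flow both along and against the original (citation) direction."""
    s = np.concatenate([senders, receivers])
    r = np.concatenate([receivers, senders])
    # de-duplicate, in case some reverse edges were already present
    edges = np.unique(np.stack([s, r], axis=1), axis=0)
    return edges[:, 0], edges[:, 1]
```

With both directions present, stacking more message-passing steps than the patch's diameter lets every node's representation depend on the whole patch.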
For large-scale quantum chemical computations (PCQM4M), going deeper (32-50 GNN layers) yields monotonic and consistent gains in performance. To recover such gains, careful regularisation is required (we used Noisy Nodes). RDKit conformers provided a slight but significant boost.
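Heavily simplified, the idea behind Noisy Nodes is to corrupt the input nodes and add a per-node denoising target, giving very deep GNNs a dense auxiliary training signal. A hedged sketch of the idea, with continuous features and names of my choosing rather than the exact recipe from our entry:

```python
import numpy as np

def noisy_nodes_inputs(node_feats, corrupt_prob=0.1, noise_std=0.02, rng=None):
    """Corrupt a fraction of the nodes; the clean features then serve as
    per-node reconstruction targets for an auxiliary denoising head."""
    rng = rng or np.random.default_rng()
    mask = rng.random(node_feats.shape[0]) < corrupt_prob   # which nodes to corrupt
    noisy = node_feats.astype(float)
    noisy[mask] += rng.normal(0.0, noise_std, size=noisy[mask].shape)
    return noisy, node_feats, mask  # network input, denoise targets, corrupted nodes

# Training loss (sketch): main graph-level regression loss
#                         + lambda * denoising loss on the corrupted nodes.
```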
We study a very common representation learning setting where we know *something* about our task's generative process. e.g. agents must obey some laws of physics, or a video game console manipulates certain RAM slots. However...
...explicitly making use of this information is often quite tricky, every step of the way! Depending on the circumstances, it may require hard disentanglement of the generative factors, force a punishing bottleneck through the algorithm, or necessitate a differentiable renderer!
I firmly believe in giving back to the community I came from, as well as paying forward and making (geometric) deep learning more inclusive to underrepresented communities in general.
Accordingly, this summer you can (virtually) find me at several summer schools! A thread (1/9)
At @EEMLcommunity 2021, I will give a lecture on graph neural networks from the ground up, followed by a GNN lab session led by @ni_jovanovic. I will also host a mentorship session with several aspiring mentees!
Based on the 2020 edition, I anticipate a recording will be available! (2/9)
Proud to share our 150-page "proto-book" with @mmbronstein, @joanbruna and @TacoCohen on geometric DL! Through the lens of symmetries and invariances, we attempt to distill "all you need to build the architectures that are all you need".
We have investigated the essence of popular deep learning architectures (CNNs, GNNs, Transformers, LSTMs) and realised that, assuming an appropriate set of symmetries we would like our representations to respect, they can all be expressed using a common geometric blueprint.
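In symbols (my shorthand, not verbatim from the proto-book), the blueprint asks that each layer respects a chosen symmetry group acting on the domain:

```latex
% A layer f is equivariant to a group G, acting on inputs via \rho and on
% outputs via \rho', if
\[
  f\big(\rho(g)\,x\big) \;=\; \rho'(g)\,f(x) \qquad \forall\, g \in G,
\]
% and invariant when \rho' is trivial. For GNNs, G is the group of node
% permutations: with node features X and adjacency A,
\[
  F\big(PX,\, PAP^\top\big) = P\,F(X, A),
  \qquad
  f\big(PX,\, PAP^\top\big) = f(X, A)
\]
% for every permutation matrix P (equivariant layers and invariant readouts).
% CNNs instantiate the same template with translations, Transformers with
% permutations over tokens, and so on.
```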
But there's more!
Going further, we apply our blueprint to less standard domains (such as homogeneous groups and manifolds), showing that it neatly expresses recent advances in those areas, such as Spherical CNNs, SO(3)-Transformers, and Gauge-Equivariant Mesh CNNs.