For large-scale transductive node classification (MAG240M), we found it beneficial to treat subsampled patches bidirectionally, and go deeper than their diameter. Further, self-supervised learning becomes important at this scale. BGRL allowed training 10x longer w/o overfitting.
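To make "treating subsampled patches bidirectionally" concrete, here is a minimal sketch, assuming a sampled patch is stored as a `(2, E)` array of directed edges. The `make_bidirectional` helper and the toy edge list are hypothetical illustrations, not the code used for MAG240M.

```python
import numpy as np

def make_bidirectional(edge_index: np.ndarray) -> np.ndarray:
    """Add the reverse of every directed edge and drop duplicates.

    edge_index has shape (2, E); row 0 holds sources, row 1 holds targets.
    """
    reversed_edges = edge_index[::-1]  # swap the source and target rows
    both = np.concatenate([edge_index, reversed_edges], axis=1)
    return np.unique(both, axis=1)     # unique columns, i.e. unique edges

# Toy directed patch: 0 -> 1, 1 -> 2, 2 -> 0 (hypothetical data).
edges = np.array([[0, 1, 2],
                  [1, 2, 0]])
sym = make_bidirectional(edges)
print(sym.shape[1], "edges after symmetrisation")
```

Messages can then flow both ways along every sampled edge, regardless of the direction in which the sampler discovered it.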
For large-scale quantum chemical computations (PCQM4M), going deeper (32-50 GNN layers) yields monotonic and consistent gains in performance. To recover such gains, careful regularisation is required (we used Noisy Nodes). RDKit conformers provided a slight but significant boost.
We study a very common representation learning setting where we know *something* about our task's generative process: e.g. agents must obey some laws of physics, or a video game console manipulates certain RAM slots. However...
...explicitly making use of this information is often quite tricky, every step of the way! Depending on the circumstances, it may require hard disentanglement of generative factors, a punishing bottleneck through the algorithm, or a differentiable renderer!
I firmly believe in giving back to the community I came from, as well as paying forward and making (geometric) deep learning more inclusive to underrepresented communities in general.
Accordingly, this summer you can (virtually) find me at several summer schools! A thread (1/9)
At @EEMLcommunity 2021, I will give a lecture on graph neural networks from the ground up, followed by a GNN lab session led by @ni_jovanovic. I will also host a mentorship session with several aspiring mentees!
Based on 2020, I anticipate a recording will be available! (2/9)
Proud to share our 150-page "proto-book" with @mmbronstein @joanbruna @TacoCohen on geometric DL! Through the lens of symmetries and invariances, we attempt to distill "all you need to build the architectures that are all you need".
We have investigated the essence of popular deep learning architectures (CNNs, GNNs, Transformers, LSTMs) and realised that, assuming a proper set of symmetries we would like to stay resistant to, they can all be expressed using a common geometric blueprint.
But there's more!
Going further, we use our blueprint on less standard domains (such as homogeneous groups and manifolds), showing that the blueprint allows for nicely expressing recent advances in those areas, such as Spherical CNNs, SO(3)-Transformers, and Gauge-Equivariant Mesh CNNs.
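As a small illustration of what such a symmetry looks like for GNNs, here is a sketch that numerically checks permutation equivariance of a simple sum-aggregation layer. The layer, its weight names and the random inputs are assumptions chosen for illustration; they are not taken from the proto-book.

```python
import numpy as np

def gnn_layer(X, A, W_self, W_neigh):
    """Sum-aggregation layer: h_i = relu(x_i W_self + sum_j A_ij x_j W_neigh)."""
    return np.maximum(X @ W_self + A @ X @ W_neigh, 0.0)

rng = np.random.default_rng(0)
n, d_in, d_out = 5, 3, 4
X = rng.normal(size=(n, d_in))                 # node features
A = (rng.random((n, n)) < 0.4).astype(float)   # random adjacency matrix
W_self = rng.normal(size=(d_in, d_out))
W_neigh = rng.normal(size=(d_in, d_out))

# Random relabelling of the nodes, written as a permutation matrix.
P = np.eye(n)[rng.permutation(n)]

# Equivariance: permuting the input graph permutes the output rows the same way.
out_then_perm = P @ gnn_layer(X, A, W_self, W_neigh)
perm_then_out = gnn_layer(P @ X, P @ A @ P.T, W_self, W_neigh)
equivariant = np.allclose(out_then_perm, perm_then_out)
print("permutation equivariant:", equivariant)
```

The check works because the elementwise nonlinearity commutes with row permutations and `P.T @ P` is the identity.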
During the early stages of my PhD, one problem would often arise: I would come up with ideas that simply weren't the right kind of idea for the kind of hardware/software/expertise setup I had in my department. 2/15
This would lead me on 'witch hunts' that took months (sometimes forcing me to spend my own salary on compute!). The game-changer for me was corresponding w/ researchers who are influential to the work I'd like to do: first learning from their perspectives, eventually doing internships. 3/15
Over the past weeks, several people have reached out to me for comment on "Combining Label Propagation and Simple Models Out-performs Graph Neural Networks" -- a very cool LabelProp-based baseline for graph representation learning. Here's a thread 👇 1/14
Firstly, I'd like to note that, in my opinion, this is a very strong and important work for representation learning on graphs. It provides us with so many lightweight baselines that often perform amazingly well -- on that, I strongly congratulate the authors! 2/14
I think most of the discussion comes from the title -- most people reaching out to me ask "Does this mean we don't need GNNs at all?", "Have GNNs been buried?", etc.
In reality, this work reinforces something we've known in graph representation learning for quite some time. 3/14
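For readers unfamiliar with label propagation, here is a minimal sketch of the classical iteration on a toy graph. This is plain label propagation with symmetric normalisation, not the full pipeline from the paper under discussion; the function name, `alpha`, `num_iters` and the graph are assumptions made for illustration.

```python
import numpy as np

def label_propagation(A, Y0, train_mask, alpha=0.8, num_iters=100):
    """Iterate Y <- alpha * S @ Y + (1 - alpha) * Y_init, with S = D^-1/2 A D^-1/2."""
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    S = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    # Only labelled (training) nodes contribute label information.
    Y_init = np.where(train_mask[:, None], Y0, 0.0)
    Y = Y_init.copy()
    for _ in range(num_iters):
        Y = alpha * (S @ Y) + (1 - alpha) * Y_init
    return Y

# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

Y0 = np.zeros((6, 2))
Y0[0, 0] = 1.0   # node 0 is labelled as class 0
Y0[5, 1] = 1.0   # node 5 is labelled as class 1
train_mask = np.array([True, False, False, False, False, True])

pred = label_propagation(A, Y0, train_mask).argmax(axis=1)
print(pred)
```

No features and no learned parameters are involved: the two labels simply diffuse along the edges, so each triangle inherits the class of its labelled node.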