"Let's say I have two spare days and want to really understand GNNs. What should I do?"
My answers led me to revisit my old 'hints for GNN resources' in light of the new material I've (co)produced. See the thread for a summary!
I'd say it is good to start with something a bit more theoretical, before diving into code. Specifically, I've been recommending my @Cambridge_CL talk on Theoretical GNN Foundations:
Why do I recommend this talk, specifically?
It is good to (a) have a rule-of-thumb to categorise the architectures you encounter, as GNNs evolve at an outrageous pace; (b) have a feel for the connections across different fields that propose GNNs, as each field (e.g. signal processing, NLP...) tends to use its own notation.
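One such rule of thumb (the one my talk uses, following the Geometric DL proto-book) sorts GNN layers into three flavours by how a node weighs its neighbours; as a rough notation sketch (symbols mine, not from the thread):

```latex
% Convolutional: fixed coefficients c_{ij} (e.g. GCN, ChebNet)
\mathbf{h}_i = \phi\Big(\mathbf{x}_i,\ \bigoplus_{j \in \mathcal{N}_i} c_{ij}\,\psi(\mathbf{x}_j)\Big)
% Attentional: learned scalar weights (e.g. GAT)
\mathbf{h}_i = \phi\Big(\mathbf{x}_i,\ \bigoplus_{j \in \mathcal{N}_i} a(\mathbf{x}_i, \mathbf{x}_j)\,\psi(\mathbf{x}_j)\Big)
% Message-passing: learned vector-valued messages (e.g. MPNN)
\mathbf{h}_i = \phi\Big(\mathbf{x}_i,\ \bigoplus_{j \in \mathcal{N}_i} \psi(\mathbf{x}_i, \mathbf{x}_j)\Big)
```

Each flavour strictly generalises the previous one, which is what makes the taxonomy handy for quickly placing a new architecture.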
Armed with solid foundations, we can explore further.
Within my talk, I list pointers to many other useful resources, but one I’d particularly recommend, especially for gaining a good intuitive coding angle, is @gordic_aleksa's pytorch-GAT repository:
Aleksa wrote some amazing notebooks within this repo, which visually take you through a GNN's operations step by step.
The visualisations teach the principles well enough that I'd recommend this to everyone, even though it covers just one model (GAT) in one framework (PyTorch).
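To give a flavour of what those notebooks walk through, here is a hypothetical minimal sketch of a single GAT attention head in plain numpy (function name, shapes and the toy call are my own, not from the repo):

```python
import numpy as np

def gat_layer(x, adj, W, a_src, a_dst):
    """One GAT-style attention head (sum of softmax-weighted neighbours).
    x: (N, F_in) node features; adj: (N, N) binary adjacency incl. self-loops;
    W: (F_in, F_out) shared linear map; a_src, a_dst: (F_out,) attention vectors."""
    h = x @ W                                          # transformed features, (N, F_out)
    # Attention logits e_ij = LeakyReLU(a_src . h_i + a_dst . h_j)
    e = (h @ a_src)[:, None] + (h @ a_dst)[None, :]
    e = np.where(e > 0, e, 0.2 * e)                    # LeakyReLU, slope 0.2
    e = np.where(adj > 0, e, -np.inf)                  # mask out non-edges
    # Softmax over each node's neighbourhood
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ h                                   # attention-weighted sum

# Toy check: zero attention vectors give uniform attention, so each node's
# output is simply the mean of its neighbours' transformed features.
out = gat_layer(np.eye(3), np.ones((3, 3)), np.eye(3), np.zeros(3), np.zeros(3))
```

The real GAT adds multi-head concatenation, dropout and a nonlinearity, but the attention mechanism itself is just these few lines.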
Conveniently, Aleksa also provides three different common implementation strategies for GNNs, so you can also weigh the pros and cons of each.
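As a hedged illustration of two of those strategies (the toy graph and variable names are my own, not from the repo): a dense adjacency matrix multiply and an edge-list scatter-add compute the same sum-over-neighbours aggregation, trading O(N^2) memory against O(E):

```python
import numpy as np

x = np.arange(12, dtype=float).reshape(4, 3)         # 4 nodes, 3 features each
edges = np.array([[0, 1], [1, 0], [1, 2], [2, 3]])   # directed (src, dst) pairs

# Strategy 1: dense adjacency matmul -- simple, but O(N^2) memory
adj = np.zeros((4, 4))
adj[edges[:, 1], edges[:, 0]] = 1.0                  # adj[dst, src] = 1
dense_out = adj @ x

# Strategy 2: edge-list gather + scatter-add -- O(E) memory
scatter_out = np.zeros_like(x)
np.add.at(scatter_out, edges[:, 1], x[edges[:, 0]])  # accumulate messages at dst

assert np.allclose(dense_out, scatter_out)
```

(The third common option, sparse-matrix products, is essentially the dense route with a sparse adjacency type; libraries like PyTorch Geometric build on the scatter strategy.)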
Once you have an understanding of the primitive operations, migrating to libraries like PyTorch Geometric or DGL will feel more natural.
Lastly, if all this takes you less than 2 days and you want to broaden the perspective a bit further:
@mmbronstein, @joanbruna, @TacoCohen and I taught a full 12h lecture course about our Geometric DL proto-book, which covers GNNs as a special case:
Following these materials, you can then consult: (a) parts of the proto-book; (b) the lectures, especially those on graphs & sets and on applications; (c) Colab-based GNN tutorials with recordings.
Happy learning! :)
For large-scale transductive node classification (MAG240M), we found it beneficial to treat subsampled patches bidirectionally, and go deeper than their diameter. Further, self-supervised learning becomes important at this scale. BGRL allowed training 10x longer w/o overfitting.
For large-scale quantum chemical computations (PCQM4M), going deeper (32-50 GNN layers) yields monotonic and consistent gains in performance. To recover such gains, careful regularisation is required (we used Noisy Nodes). RDKit conformers provided a slight but significant boost.
We study a very common representation learning setting where we know *something* about our task's generative process. e.g. agents must obey some laws of physics, or a video game console manipulates certain RAM slots. However...
...explicitly making use of this information is often quite tricky, every step of the way! Depending on the circumstances, it may require hard disentanglement of generative factors, impose a punishing bottleneck on the algorithm, or necessitate a differentiable renderer!
I firmly believe in giving back to the community I came from, as well as paying forward and making (geometric) deep learning more inclusive to underrepresented communities in general.
Accordingly, this summer you can (virtually) find me on several summer schools! A thread (1/9)
At @EEMLcommunity 2021, I will give a lecture on graph neural networks from the ground up, followed by a GNN lab session led by @ni_jovanovic. I will also host a mentorship session with several aspiring mentees!
Based on 2020, I anticipate a recording will be available! (2/9)
Proud to share our 150-page "proto-book" with @mmbronstein, @joanbruna and @TacoCohen on geometric DL! Through the lens of symmetries and invariances, we attempt to distill "all you need to build the architectures that are all you need".
We have investigated the essence of popular deep learning architectures (CNNs, GNNs, Transformers, LSTMs) and realised that, assuming an appropriate set of symmetries we would like to remain invariant (or equivariant) to, they can all be expressed using a common geometric blueprint.
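At a sketch level, the blueprint asks that every layer respect the domain's symmetry group; for graphs, that group is the node permutations (notation mine, not from the thread):

```latex
% G-equivariance of a layer f, for a symmetry group G acting on the inputs:
f(g \cdot x) = g \cdot f(x) \quad \forall g \in G
% Specialised to GNNs: permutation matrices P acting on features X and adjacency A:
f(\mathbf{P}\mathbf{X},\ \mathbf{P}\mathbf{A}\mathbf{P}^\top) = \mathbf{P}\,f(\mathbf{X}, \mathbf{A})
```

Swapping in translations for G recovers CNNs; swapping in rotations recovers Spherical CNNs, and so on.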
But there's more!
Going further, we use our blueprint on less standard domains (such as homogeneous groups and manifolds), showing that the blueprint allows for nicely expressing recent advances in those areas, such as Spherical CNNs, SO(3)-Transformers, and Gauge-Equivariant Mesh CNNs.
The crowd has spoken! 🙃 A thread with early-stage machine learning research advice follows below. 👇🧵
Important disclaimer before proceeding: these are my personal views only, and likely strongly biased by my experiences and temperament. Hopefully useful nonetheless! 1/15
During the early stages of my PhD, one problem would often arise: I would come up with ideas that simply weren't the right kind of idea for the kind of hardware/software/expertise setup I had in my department. 2/15
This would lead me on wild goose chases that took months (sometimes forcing me to spend my own salary on compute!). The game-changer for me was corresponding w/ researchers whose work was influential to what I wanted to do: first learning from their perspectives, and eventually interning with them. 3/15