Over the past weeks, several people have reached out to me for comment on "Combining Label Propagation and Simple Models Out-performs Graph Neural Networks" -- a very cool LabelProp-based baseline for graph representation learning. Here's a thread 👇 1/14
Firstly, I'd like to note that, in my opinion, this is a very strong and important work for representation learning on graphs. It provides us with so many lightweight baselines that often perform amazingly well -- on that, I strongly congratulate the authors! 2/14
I think most of the discussion comes from the title -- most people reaching out to me ask "Does this mean we don't need GNNs at all?", "Have GNNs been buried?", etc.

In reality, this work reinforces something we've known in graph representation learning for quite some time. 3/14
Imagine a node classification dataset in which an edge (x, y) means that x and y are likely to share the downstream class. What this means is that, by simply averaging my features across neighbourhoods before applying a node-wise classifier, I get pretty strong features! 4/14
This model is basically a GNN that learns node features without any learnable parameters in the graph propagation, making it super-scalable!

This is the SGC model, detailed here: arxiv.org/abs/1902.07153

and it achieves very strong results on _many_ standard benchmarks. 5/14
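
To make this concrete, here's a rough numpy/scipy sketch of the SGC-style pipeline -- my own illustration, not the authors' code; the propagation depth k and the downstream classifier are arbitrary choices:

```python
import numpy as np
import scipy.sparse as sp

def sgc_features(adj, feats, k=2):
    """Smooth node features k times over the symmetrically normalised
    adjacency with self-loops -- parameter-free propagation, as in SGC."""
    n = adj.shape[0]
    adj = adj + sp.eye(n)                        # add self-loops
    deg = np.asarray(adj.sum(axis=1)).flatten()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
    adj_norm = d_inv_sqrt @ adj @ d_inv_sqrt     # D^-1/2 (A + I) D^-1/2
    for _ in range(k):
        feats = adj_norm @ feats                 # repeated neighbourhood averaging
    return feats

# The propagated features can then feed any node-wise classifier, e.g.:
#   from sklearn.linear_model import LogisticRegression
#   clf = LogisticRegression().fit(sgc_features(A, X)[train_idx], y[train_idx])
```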
Analogously, if I do have access to training labels in such a dataset, then merely propagating these labels along edges is going to be a strong predictor, by definition! The C&S paper rightfully exploits this across the board.

We usually call such datasets homophilous. 6/14
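
Here's what "merely propagating labels along edges" looks like as a minimal sketch -- a basic label-spreading loop of the kind C&S builds on; the damping factor alpha, the iteration count and the normalisation here are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np
import scipy.sparse as sp

def label_prop(adj, y, train_mask, n_classes, alpha=0.9, n_iters=50):
    """Spread one-hot training labels along (normalised) edges,
    re-clamping the known labels after every step."""
    deg = np.asarray(adj.sum(axis=1)).flatten()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(np.maximum(deg, 1)))
    adj_norm = d_inv_sqrt @ adj @ d_inv_sqrt

    seeds = np.zeros((adj.shape[0], n_classes))
    seeds[train_mask] = np.eye(n_classes)[y[train_mask]]   # one-hot seed labels
    labels = seeds.copy()
    for _ in range(n_iters):
        labels = alpha * (adj_norm @ labels) + (1 - alpha) * seeds
        labels[train_mask] = seeds[train_mask]             # clamp training nodes
    return labels.argmax(axis=1)
```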
But more generally, edges could tell you that entities are related, but not in a class-sharing kind of way (e.g. if I RT a tweet, it doesn't mean I agree with it -- in fact, I could strongly disagree). Propagating OP's label to me is unlikely to classify me correctly. 7/14
This can be taken to the extreme in the case of reasoning tasks: e.g. algorithms and simulations. Edges now merely give you a recipe for information propagation, but do not necessarily encode anything about how neighbourhoods are downstream-related. 8/14
For these kinds of datasets, homophily-favouring baselines tend to fall significantly behind generic message-passing mechanisms (such as MPNNs or GraphNets). See, for example: arxiv.org/abs/1910.10593, arxiv.org/abs/2002.09405.

Very bluntly put: GNNs are *not* dead.

9/14
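
For contrast, here's the rough shape of one generic message-passing step -- a toy sketch of mine, where msg_fn and upd_fn are placeholders for the learnable networks a real MPNN would use:

```python
import numpy as np

def mpnn_layer(edges, h, msg_fn, upd_fn):
    """One generic message-passing step: every node sums the messages
    computed from its neighbours' states, then updates its own state."""
    agg = np.zeros_like(h)
    for src, dst in edges:                  # messages flow src -> dst
        agg[dst] += msg_fn(h[src], h[dst])  # sum aggregation
    return upd_fn(h, agg)

# Toy usage, with fixed functions standing in for learnable ones:
#   h_next = mpnn_layer(edge_list, h,
#                       msg_fn=lambda hs, hd: hs,           # relay neighbour state
#                       upd_fn=lambda h, m: np.tanh(h + m))
```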
Back to C&S: what do its very strong results tell us?

Basically, homophilous datasets exploitable by LabelProp are all around us, even in strong benchmark suites such as OGB! Besides SGC, this eye-opening work from @shchur_ is further evidence: arxiv.org/abs/1811.05868. 10/14
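
For reference, here's the C&S post-processing idea as I read the paper -- two label-prop passes on top of a simple base predictor. The scaling details and hyper-parameters below are simplified placeholders, so treat this as a sketch rather than the authors' exact procedure:

```python
import numpy as np

def correct_and_smooth(adj_norm, base_preds, y_onehot, train_mask,
                       alpha1=0.9, alpha2=0.8, n_iters=50):
    """(1) 'Correct': diffuse the residual errors known on training nodes.
    (2) 'Smooth': diffuse the corrected predictions, with training nodes
    clamped to their true labels."""
    # Step 1: correct -- propagate the base model's training residuals
    err = np.zeros_like(base_preds)
    err[train_mask] = y_onehot[train_mask] - base_preds[train_mask]
    err0 = err.copy()
    for _ in range(n_iters):
        err = alpha1 * (adj_norm @ err) + (1 - alpha1) * err0
    corrected = base_preds + err   # (the paper also rescales the errors; omitted)

    # Step 2: smooth -- propagate the corrected predictions
    z = corrected.copy()
    z[train_mask] = y_onehot[train_mask]
    z0 = z.copy()
    for _ in range(n_iters):
        z = alpha2 * (adj_norm @ z) + (1 - alpha2) * z0
    return z.argmax(axis=1)
```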
A very important point to take from this paper: we tend to go head-first into graph tasks with strong GNN models, without first checking whether there are useful gains to be had simply from inspecting and exploiting the graph structure -- often just by smoothing across edges. 11/14
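
One cheap sanity check along these lines: measure the edge homophily of your dataset before reaching for a big model. A tiny sketch (edge_homophily is my own hypothetical helper, not a library function):

```python
import numpy as np

def edge_homophily(edges, labels):
    """Fraction of edges whose endpoints share a class -- a quick probe
    of whether LabelProp-style smoothing is likely to pay off."""
    edges = np.asarray(edges)
    return (labels[edges[:, 0]] == labels[edges[:, 1]]).mean()

# Values close to 1.0 suggest a homophilous graph; values near chance
# level suggest label propagation alone is unlikely to help.
```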
As @austinbenson put it very well, there's decades' worth of network science research that should help us a lot here, and it should not be ignored.

This can also be helpful to graph representation learning more generally, as a source of inspiration for designing stronger GNNs. 12/14
In summary: always try a simple model (SGC / C&S / ...) before committing to the latest MPNN.

Not only are they far easier to scale up (see e.g. PPRGo: arxiv.org/abs/2007.01570 and SIGN: arxiv.org/abs/2004.11198), often they will be _exactly_ what you need. 13/14
I'm very happy to take comments and/or discuss further!

Congrats again to the authors (@qhwang3, @cHHillee, @austinbenson et al.) for such an important paper. If I've missed or mis-attributed any claims above, please feel free to chime in. :) 14/14
