Petar Veličković
Jul 20, 2021 · 6 tweets · 4 min read
Delighted to share our work on reasoning-modulated representations! Contributed talk at @icmlconf SSL Workshop 🎉

arxiv.org/abs/2107.08881

Algo reasoning can help representation learning! See thread👇🧵

w/ Matko @thomaskipf @AlexLerchner @RaiaHadsell @rpascanu @BlundellCharles
We study a very common representation learning setting where we know *something* about our task's generative process, e.g. agents must obey some laws of physics, or a video game console manipulates certain RAM slots. However...
...explicitly making use of this information is often quite tricky, every step of the way! Depending on the circumstances, it may require hard disentanglement of generative factors, a punishing bottleneck through the algorithm, or a differentiable renderer!
Algorithmic reasoning blueprint to the rescue!

In RMR, we show that we can encapsulate the x_bar -> y_bar path using a high-dimensional GNN, pre-trained on large quantities of data (which we can usually pre-generate, even synthetically).

This alleviates all of the above issues.
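
A minimal sketch of how the pieces above might be wired together, assuming the usual encode-process-decode layout of the algorithmic reasoning blueprint. Everything here is illustrative and not taken from the paper's code: the names (`encode_process_decode`, `processor`), the dimensions, the choice of 4 abstract features per ball, and the toy fully connected message passing are all assumptions, and random weights stand in for learned ones. The only point is that the same high-dimensional processor sits in both the abstract x_bar -> y_bar path and the natural-input path.

```python
import random


def linear(weights, vec):
    """Apply a dense layer (no bias) given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng):
    """Random weights standing in for learned parameters (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def processor(latents, w_msg):
    """One round of sum-aggregated message passing on a fully connected graph."""
    out = []
    for i, h_i in enumerate(latents):
        agg = [0.0] * len(h_i)
        for j, h_j in enumerate(latents):
            if i != j:
                msg = linear(w_msg, h_j)
                agg = [a + m for a, m in zip(agg, msg)]
        # Residual update: keep the node's own latent and add incoming messages.
        out.append([h + a for h, a in zip(h_i, agg)])
    return out


def encode_process_decode(nodes, enc, proc, dec):
    """Encoder -> shared processor -> decoder, applied per node."""
    latents = [linear(enc, x) for x in nodes]
    latents = processor(latents, proc)
    return [linear(dec, h) for h in latents]


rng = random.Random(0)
LATENT, ABSTRACT, NATURAL, NUM_NODES = 8, 4, 16, 3  # hypothetical sizes

# Processor weights: pre-trained on abstract data, then reused (assumed frozen).
proc_w = random_matrix(LATENT, LATENT, rng)

# Stage 1 (pre-training path): abstract inputs x_bar -> abstract outputs y_bar.
abs_enc = random_matrix(LATENT, ABSTRACT, rng)
abs_dec = random_matrix(ABSTRACT, LATENT, rng)
x_bar = [[rng.random() for _ in range(ABSTRACT)] for _ in range(NUM_NODES)]
y_bar = encode_process_decode(x_bar, abs_enc, proc_w, abs_dec)

# Stage 2 (downstream path): new encoder/decoder around the same processor.
nat_enc = random_matrix(LATENT, NATURAL, rng)
nat_dec = random_matrix(NATURAL, LATENT, rng)
x = [[rng.random() for _ in range(NATURAL)] for _ in range(NUM_NODES)]
y = encode_process_decode(x, nat_enc, proc_w, nat_dec)
```

Because the processor works in a latent space of its own (8 dimensions in this toy), the natural-input encoder is not forced to squeeze everything through the low-dimensional abstract state, which is one reading of how the bottleneck issue is sidestepped.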
We recover significant improvements over a baseline without pre-training. N.B. RMR still needs to learn how to meaningfully use representations from a completely different task!

Our evaluation spans bouncing balls data (elastic collisions) & Atari trajectories (RAM transitions).
Lastly, alongside other recent works like XLVIN, we believe this is only one of many exciting uses of algorithmic reasoning to come in the near future! Watch this space 🎆

Any thoughts, comments and feedback are highly welcome! :)

More from @PetarV_93

Jun 7
Transformers need glasses! 👓

Read on to see how we expose fundamental weaknesses of decoder-only Transformers on important tasks (e.g. copying & counting) + simple ways to make things a bit easier on the Transformer :)

Work led by @fedzbar for his @GoogleDeepMind placement!
We start by asking a frontier LLM a simple query: copy the first & last token of bitstrings.

Not only does it fail past a certain length, it also fails in a very specific way: it fails when there's repetition (111...10), and it fails to copy the _last_ token, never the first.
This leads to our first result -- representational collapse.

We prove there must exist pairs of different inputs for which their last token representations cannot be distinguished.

To prove this, we use bitstrings of the form 11...10, where repetitions exacerbate the problem.
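
A toy illustration of the intuition, not the proof from the paper: assume the last position attends uniformly over all tokens, so its representation is just the mean of scalar token embeddings (both the uniform attention and the 0.0/1.0 embeddings are simplifying assumptions, and the function names are hypothetical). Comparing 1^n 0 against 1^(n+1) 0, the two means differ by 1/((n+1)(n+2)), which shrinks quickly as the run of 1s grows.

```python
def last_token_summary(bits):
    """Mean of token embeddings, i.e. uniform attention at the last position."""
    embed = {"0": 0.0, "1": 1.0}  # hypothetical scalar embeddings
    return sum(embed[b] for b in bits) / len(bits)


def gap(n):
    """Distance between the summaries of 1^n 0 and 1^(n+1) 0."""
    a = "1" * n + "0"
    b = "1" * (n + 1) + "0"
    return abs(last_token_summary(a) - last_token_summary(b))


# The gap keeps shrinking as the repeated prefix gets longer.
gaps = [gap(n) for n in (10, 100, 1000, 10000)]
```

Under finite precision, a small enough gap means the two different inputs can no longer be told apart from the last token's representation alone.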
Read 8 tweets
Dec 12, 2022
If you are at @LogConference, come to the virtual Poster Session in ~20 minutes -- we have _four_ posters on algorithmic alignment, reasoning and over-squashing in GNNs! 🕸️🍾🌐 Several of them are award-winning!

You're welcome to stop by for a chat. 😊
See the 🧵for details... 🔢
🌐 In "Reasoning-Modulated Representations", Matko Bošnjak, @thomaskipf, @AlexLerchner, @RaiaHadsell, Razvan Pascanu, @BlundellCharles and I demonstrate how to leverage arbitrary algorithmic priors for self-supervised learning. It even transfers _across_ different Atari games!
🤖 In "Continuous Neural Algorithmic Planners", @heyu0208, @pl219_Cambridge, @andreeadeac22 and I show how the ideas from the XLVIN paper can generalise to continuous-action-space environments (such as MuJoCo!). CNAP won the Best Paper Runner-up Award at GroundedML @ ICLR'22!
Read 5 tweets
Jul 27, 2022
📢 New & improved material to dive into geometric deep learning! 💠🕸️

We (@mmbronstein @joanbruna @TacoCohen) delivered our Master's course on GDL @AIMS_Next once again & we make all materials publicly available!

geometricdeeplearning.com/lectures/

See thread 🧵 for gems 💎 & dragons 🐉!
What to expect in the 2022 iteration?

We made careful modifications to our content, making it more streamlined & accessible!

Featuring a revamped introductory lecture, clearer discussion of Transformers & a new lecture going beyond groups, into the realm of category theory! 🐲
Beyond this, we offer a completely revamped set of exciting guest seminars, with @Francesco_dgv @ffabffrasca @crisbodnar @Russb09 & Geordie Williamson...

...and Colab tutorials on GDL from @crisbodnar @DutaIulia @paulmorio @_gabrielecesa_ @charlieharris01 @chaitjo & Ramon Viñas!
Read 5 tweets
Jun 2, 2022
Proud to share our CLRS benchmark: probing GNNs to execute 30 diverse algorithms! ⚡️

github.com/deepmind/clrs
arxiv.org/abs/2205.15659 (@icmlconf'22)

Find out all about our 2-year effort below! 🧵

w/ Adrià @davidmbudden @rpascanu @AndreaBanino Misha @RaiaHadsell @BlundellCharles
Why an algorithmic benchmark?

Algorithmic reasoning has emerged as a very important area of representation learning! Many key works (feat. @KeyuluXu @jingling_li @StefanieJegelka @beabevi_ @brunofmr) explored important theoretical and empirical aspects of algorithmic alignment.
Critically, each one of these works (incl. mine!) operates over its own datasets, often making it hard to directly compare insights across papers.

Further, generating adequate datasets requires knowledge of theoretical computer science, raising the barrier of entry to the field.
Read 10 tweets
Jun 1, 2022
Two years ago, I embarked on an 'engineering' project.

From my perspective (research scientist with 'decent' coding skill), it seemed simple enough. It turned out anything but.

In advance of celebrating our @icmlconf acceptance, an appreciation thread for AI engineering! 1/11
Why did I class the project as simple at first?

It required no (apparent) novel research (though it could enable lots of new research!), I had the theoretical skills to understand everything that needs to be implemented, and it amounted to standard supervised learning! 2/11
So I started implementing by myself. What could possibly go wrong? Turns out, pretty much everything. :)

Indeed, I understood all I needed to write generators of the data. But this didn't mean I knew how to most efficiently extract it, organise it, and make it accessible! 3/11
Read 11 tweets
Mar 9, 2022
This is a very cool paper!

However, if I understood it correctly, it doesn't invalidate the GNN-DP alignment result of @KeyuluXu et al. [33].

Rather, it shows a very interesting DP unsolvability result over arbitrarily-initialised features. See thread -- happy to discuss. 1/4
GNN _computations_ align with DP. If you initialise the node features _properly_ (e.g. identifying the source vertex):

r[s] = 1, r[u] = 0 (for u =/= s)
d[s] = 0, d[u] = -1

GNNs are then perfectly capable of finding shortest paths. The proof in the paper seems more subtle... 2/4
Namely, that GNNs are hopeless in solving some DP problems (e.g. path-finding) under _arbitrary, fixed_ (e.g. constant / randomised) initialisations. But that's, in my opinion, making a different statement to "GNNs don't align with DP"! 3/4
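
To make the initialisation from 2/4 concrete, here is a small sketch of Bellman-Ford written as synchronous rounds of min-aggregated message passing. It is a plain-Python illustration, not a GNN and not code from either paper; the function name and the example graph are hypothetical. `reach` and `dist` play the roles of r and d above, with -1 as the placeholder distance for nodes not yet reached.

```python
def bellman_ford_mp(num_nodes, edges, source):
    """Shortest paths via rounds of min-aggregated 'messages' along edges.

    edges: list of (u, v, w) meaning a directed edge u -> v with weight w.
    """
    # Initialisation as in the thread: r[s] = 1, r[u] = 0; d[s] = 0, d[u] = -1.
    reach = [1 if u == source else 0 for u in range(num_nodes)]
    dist = [0 if u == source else -1 for u in range(num_nodes)]

    for _ in range(num_nodes - 1):
        # Synchronous update: read the old state, write into a copy.
        new_reach, new_dist = reach[:], dist[:]
        for u, v, w in edges:
            if reach[u]:  # only reached nodes send meaningful messages
                cand = dist[u] + w
                if not new_reach[v] or cand < new_dist[v]:
                    new_reach[v], new_dist[v] = 1, cand
        reach, dist = new_reach, new_dist
    return reach, dist


# Hypothetical example graph; node 4 has no incoming edges.
example_edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5)]
reach, dist = bellman_ford_mp(5, example_edges, source=0)
```

The reachability flag is what makes the -1 placeholder safe: a node's distance is only ever read once its flag is set. With an arbitrary fixed initialisation there is no such flag identifying the source, which is the setting the unsolvability result is about.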
Read 4 tweets
