Simone Scardapane
Mar 10, 2022 · 13 tweets
*Generative Flow Networks*

A new method to sample structured objects (e.g., graphs, sets), with a formulation inspired by the state spaces of reinforcement learning.

I have collected a few key ideas and pointers below if you are interested. 👀

1/n

👇
*Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation*
#NeurIPS paper by @folinoid @JainMoksh et al. introducing the method.

The task is learning to sample objects that can be built 1 piece at a time ("lego-style").

2/n

arxiv.org/abs/2106.04399
For example: a complex molecule can be built by adding one atom at a time; an image by colouring one pixel per iteration; etc.

If you formalize this process, you get a state space where you move from an "empty" object to a complete object by traversing a graph.

3/n
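To make the state-space picture concrete, here is a toy sketch (my own example, not from the paper): building a binary string of length 3 one bit at a time. Partial strings are states, the empty string is the initial "empty" object, and complete strings are the terminal objects.

```python
# Toy "lego-style" state space: build a binary string of length 3
# one bit at a time. (Illustrative example, not the paper's setup.)

def children(state, max_len=3):
    """Actions: append a '0' or a '1' until the object is complete."""
    if len(state) == max_len:
        return []          # terminal state: object is complete
    return [state + "0", state + "1"]

def enumerate_states(max_len=3):
    """Enumerate the graph of states reachable from the empty object."""
    states, frontier = [], [""]
    while frontier:
        s = frontier.pop()
        states.append(s)
        frontier.extend(children(s, max_len))
    return states

states = enumerate_states()
terminals = [s for s in states if not children(s)]
print(len(states), len(terminals))   # 15 states, 8 terminal objects
```

For real objects (molecules, graphs) the same picture holds, but different action sequences can reach the same state, so the state space is a DAG rather than a tree.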
The only thing you are given is a reward function describing how good each object (e.g., a protein) is.

GFlowNets interpret this reward as a flow of water running through the graph: the flow arriving at each terminal node is the reward of the corresponding object.

4/n
Under this interpretation, you train a neural network to predict how the flow moves through the graph, by imposing that the incoming and outgoing flows are conserved at each node.

With this, you get one consistency equation per node that you can enforce with a loss function.

5/n
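As a sketch of what such a consistency loss can look like (a hand-rolled toy, not the paper's code; in a real GFlowNet the edge flows would be the outputs of the neural network):

```python
import math

# edge_flow maps (parent, child) -> predicted flow F(s -> s').
# At each interior node, inflow must equal outflow; at a terminal
# node, inflow must equal the reward.

def flow_matching_loss(edge_flow, reward, eps=1e-8):
    """Sum of squared log-ratios between inflow and outflow/reward."""
    nodes = {s for edge in edge_flow for s in edge}
    loss = 0.0
    for s in nodes:
        inflow = sum(f for (p, c), f in edge_flow.items() if c == s)
        outflow = sum(f for (p, c), f in edge_flow.items() if p == s)
        if s in reward:          # terminal node: "outflow" is the reward
            outflow = reward[s]
        if inflow == 0:          # initial state: no inflow constraint
            continue
        loss += (math.log(inflow + eps) - math.log(outflow + eps)) ** 2
    return loss

# Two edges from "root" to terminals "a" and "b" with rewards 1 and 3:
edge_flow = {("root", "a"): 1.0, ("root", "b"): 3.0}
reward = {"a": 1.0, "b": 3.0}
print(flow_matching_loss(edge_flow, reward))  # ~0: flows are consistent
```

Perturbing any edge flow away from these values makes the loss strictly positive, which is what the training signal exploits.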
The network trained in this way (GFlowNet) is enough to solve your original problem: by traversing the graph with probabilities proportional to the flow, you sample objects proportionally to their reward!

6/n
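A minimal sketch of this sampling step, assuming the trained edge flows are given as a dictionary (illustrative names, not the official API):

```python
import random

# At each state, move to a child with probability proportional to
# the flow on that edge; terminal objects are then reached with
# probability proportional to their reward.

def sample_object(edge_flow, state="root"):
    while True:
        out = [(c, f) for (p, c), f in edge_flow.items() if p == state]
        if not out:                       # no outgoing edges: terminal
            return state
        r = random.random() * sum(f for _, f in out)
        for child, f in out:
            r -= f
            if r <= 0:
                state = child
                break

edge_flow = {("root", "a"): 1.0, ("root", "b"): 3.0}
counts = {"a": 0, "b": 0}
for _ in range(10_000):
    counts[sample_object(edge_flow)] += 1
print(counts)   # roughly 1:3, matching the rewards 1 and 3
```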
*GFlowNet Foundations*

Now you can move on to this mammoth paper by @TristanDeleu @edwardjhu @mo_tiwari @folinoid et al.

They show GFlowNets can be extended in many ways, notably to sample conditional paths or to compute entropies and other quantities.

7/n

arxiv.org/abs/2111.09266
*Trajectory Balance: Improved Credit Assignment in GFlowNets*

Building on this, @JainMoksh @folinoid @ChenSun92 et al. derive a much better training criterion that works on entire sampled trajectories, making training significantly faster.

8/n

arxiv.org/abs/2201.13259
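In log space, the trajectory-balance condition matches log Z plus the forward log-probabilities along a trajectory against the log-reward plus the backward log-probabilities, and the loss is the squared mismatch. A toy sketch (the numbers are made up):

```python
import math

# Trajectory-balance loss for one trajectory tau = (s_0, ..., s_n):
# (log Z + sum_t log P_F(s_{t+1}|s_t)
#        - log R(x) - sum_t log P_B(s_t|s_{t+1}))^2

def trajectory_balance_loss(log_Z, log_pf, log_pb, log_reward):
    """log_pf / log_pb: lists of log forward/backward probs along tau."""
    return (log_Z + sum(log_pf) - log_reward - sum(log_pb)) ** 2

# A perfectly balanced toy trajectory: Z = 4, P_F = 1/4 at the single
# branching step, R(x) = 1, and P_B = 1 (tree-structured state space).
print(trajectory_balance_loss(math.log(4.0), [math.log(0.25)], [0.0], 0.0))
```

Because one gradient step touches the whole trajectory at once, credit assignment is much less myopic than with the per-node flow-matching constraints.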
*Bayesian Structure Learning with Generative Flow Networks*
by @TristanDeleu @AntGois @ChrisEmezue @SimonLacosteJ

Moving to applications, here they leverage GFlowNets to get state-of-the-art results in learning the structure of Bayesian networks.

9/n

arxiv.org/abs/2202.13903
*Biological Sequence Design with GFlowNets*
@bonadossou @JainMoksh @alexhdezgcia @folinoid @Mila_Quebec

Another cool application: the design of biological sequences with specific characteristics (I admit I am a little bit out of my depth here).

10/n

arxiv.org/abs/2203.04115
*GFlowNets for Discrete Probabilistic Modeling*
@alex_volokhova

The basic GFlowNet assumes your reward function is given, but you can also train it jointly using ideas from energy-based modelling. In this work, they use it to generate images.

11/n

arxiv.org/abs/2202.01361
Yoshua Bengio wrote about GFlowNets: "I have rarely been as enthusiastic about a new research direction", that "creative juices are boiling", and about "bridging the gap between SOTA AI and human intelligence".

More hype on AI, yay! 🤷‍♂️

12/n

yoshuabengio.org/2022/03/05/gen…
A few final pointers:

- Blog post on the original paper: folinoid.com/w/gflownet/
- A tutorial on Notion: milayb.notion.site/GFlowNet-Tutor…
- The original code: github.com/GFNOrg/gflownet

Let me know if I missed anything interesting!

More from @s_scardapane

Sep 20, 2022
Gather round, Twitter folks, it's time for our beloved
**Alice's adventures in a differentiable wonderland**, our magical tour of autodiff and backpropagation. 🔥

Slides below 1/n 👇
It all started from her belief that "very few things indeed were really impossible". Could AI truly be around the corner? Could differentiability be the only ingredient needed?

2/n
Wondering where to start, Alice discovered a paper by pioneer @ylecun promising "a path towards autonomous intelligent agents".

Intelligence, it was argued, would arise from several interacting modules, where everything was assumed to be *differentiable*.

3/n
Dec 28, 2021
*Neural networks for data science* lecture 8 is out!

And it's already the last lecture! 🙀

What lies beyond classical supervised learning? It turns out, _way_ too many subfields!

/n
Here is my overview of everything that can happen when we have more than one "task": fine-tuning, pre-training, meta-learning, continual learning...

The slides have my personal selection of material. 😎

/n
The slides are here: sscardapane.it/assets/files/n…

All the material, as always, is here: sscardapane.it/teaching/nnds-…

/n
Nov 3, 2021
*Neural networks for data science* lecture 4 is out! 👇

aka "here I am talking about convolutional neural networks while everyone asks me about transformers"

/n
CNNs are a great way to show how considerations about the data can guide the design of the model.

For example, assuming only locality (and not translation invariance), we get locally-connected networks.

/n
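A quick 1D sketch of that distinction (my own toy example, not from the slides): a locally connected layer learns a separate filter at every output position (locality only), while a convolution reuses a single filter everywhere (locality plus translation invariance).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(10)            # a 1D input signal
k = 3                                  # receptive-field size
n_out = len(x) - k + 1                 # "valid" output positions

W_local = rng.standard_normal((n_out, k))   # one filter per position
w_conv = rng.standard_normal(k)             # one shared filter

# Both layers look at the same local windows of the input...
local_out = np.array([W_local[i] @ x[i:i + k] for i in range(n_out)])
conv_out = np.array([w_conv @ x[i:i + k] for i in range(n_out)])

# ...but the locally connected layer has k * n_out parameters,
# the convolution only k.
print(W_local.size, w_conv.size)   # 24 vs 3
```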
Everything else is a pretty standard derivation of CNN ideas (stride, global pooling, receptive field, ...).

/n
Aug 2, 2021
*Reproducible deep learning*: Time for exams!

To a practical course, a practical exam: I asked each student to include a new branch in the repository showcasing additional tools and libraries.

The result? *Everyone* loves some hyper-parameter optimization. 😄

/n
Thanks to their work, you'll find practical examples of fine-tuning parameters using @OptunaAutoML, AX (from @facebookai), and @raydistributed Tune, with Auto-PyTorch and Talos coming soon.

So many ideas for next year! 😛

github.com/sscardapane/re…

/n
You will also find additional exercises on:

- Serving the model with TorchServe;
- Managing experiments with @DVCorg 2.0;
- Setting up cron jobs for re-training.

BTW, if you'd like to add something, feel free to contact me or open a pull request. 🙂

github.com/sscardapane/re…

/n
Jun 16, 2021
*Score-based diffusion models*

An emerging approach in generative modelling that is gathering more and more attention.

If you are interested, I collected some introductory material and thoughts in a small thread. 👇

Feel free to weigh in with additional material!

/n
An amazing property of diffusion models is their simplicity.

You define a probabilistic chain that gradually "noises" the input image until only white noise remains.

Generation is then done by learning to reverse this chain. In many cases, the two directions have a similar form.

/n
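A minimal numpy sketch of the forward chain in its variance-preserving form (the noise schedule here is arbitrary, and a real model would learn the reverse process):

```python
import numpy as np

# Forward "noising" chain: each step mixes the previous image with
# fresh Gaussian noise. After enough steps only white noise remains.

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(8, 8))        # a stand-in "image"
betas = np.linspace(1e-4, 0.5, 200)        # noise schedule (illustrative)

for beta in betas:
    eps = rng.standard_normal(x.shape)
    x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps

# The mixing coefficients keep the variance bounded, so x ends up
# approximately unit-variance white noise.
print(float(x.std()))
```

The "similar form" remark refers to the fact that, for small noise steps, the reverse transitions are also Gaussian, which is what makes learning the reverse chain tractable.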
The starting point for diffusion models is probably "Deep Unsupervised Learning using Nonequilibrium Thermodynamics" by @jaschasd Weiss @niru_m @SuryaGanguli

Classic paper, definitely worth reading: arxiv.org/abs/1503.03585

/n
Jun 14, 2021
*LocoProp: Enhancing BackProp via Local Loss Optimization*
by @esiamid @_arohan_ & Warmuth

Interesting approach to bridge the gap between first-order, second-order, and "local" optimization approaches. 👇

/n
The key idea is to use a single GD step to define auxiliary local targets for each layer, either at the level of pre- or post-activations.

Then, optimization is done by solving local "matching" problems wrt these new variables.

/n
What is intriguing is that the framework interpolates between multiple scenarios: the first solution step recovers the original GD update, while the closed-form solution (in one case) is similar to preconditioned GD. Optimization is "local" in the sense that it decouples across layers.

/n
