I fall in love with a new #machinelearning topic every month |
Researcher @SapienzaRoma | Author: Alice in a diff wonderland https://t.co/A2rr19d3Nl
Sep 20, 2022 • 17 tweets • 8 min read
Gather round, Twitter folks, it's time for our beloved
**Alice's adventures in a differentiable wonderland**, our magical tour of autodiff and backpropagation. 🔥
Slides below. 1/n
It all started from her belief that "very few things indeed were really impossible". Could AI truly be around the corner? Could differentiability be the only ingredient that was needed?
2/n
Mar 10, 2022 • 13 tweets • 11 min read
*Generative Flow Networks*
A new method to sample structured objects (e.g., graphs, sets), with a formulation inspired by the state space of reinforcement learning.
I have collected a few key ideas and pointers below if you are interested.
1/n
*Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation* #NeurIPS paper by @folinoid, @JainMoksh et al. introducing the method.
The task is learning to sample objects that can be built one piece at a time ("lego-style").
*Neural networks for data science* lecture 8 is out!
And it's already the last lecture!
What lies beyond classical supervised learning? It turns out, _way_ too many subfields!
/n
Here is my overview of everything that can happen when we have > 1 "task": fine-tuning, pre-training, meta learning, continual learning...
The slides have my personal selection of material.
/n
Nov 3, 2021 • 7 tweets • 4 min read
*Neural networks for data science* lecture 4 is out!
aka "here I am talking about convolutional neural networks while everyone asks me about transformers"
/n
CNNs are a great way to show how considerations about the data can guide the design of the model.
For example, by assuming only locality (and not translation invariance) we get locally-connected networks.
/n
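To make the locality point concrete, here is a minimal NumPy sketch in 1D (my own toy illustration, not taken from the lecture slides). The helper names `conv1d` and `locally_connected1d` are hypothetical. Both layers look at the same local windows; the only difference is whether the kernel is shared across positions.

```python
import numpy as np

def conv1d(x, kernel):
    """Locality + weight sharing: one kernel is reused at every position."""
    k = len(kernel)
    return np.array([x[i:i + k] @ kernel for i in range(len(x) - k + 1)])

def locally_connected1d(x, kernels):
    """Locality only: position i has its own kernel, kernels[i]."""
    n_out, k = kernels.shape
    return np.array([x[i:i + k] @ kernels[i] for i in range(n_out)])

rng = np.random.default_rng(0)
x = rng.normal(size=10)                  # toy 1D input
shared = rng.normal(size=3)              # 3 parameters in total
per_position = rng.normal(size=(8, 3))   # 8 * 3 = 24 parameters

y_conv = conv1d(x, shared)
y_local = locally_connected1d(x, per_position)

# Tying every per-position kernel to the same values recovers the convolution.
y_tied = locally_connected1d(x, np.tile(shared, (8, 1)))
```

The locally-connected layer has one kernel per output position, so its parameter count grows with the input size, while the convolution's does not.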
Aug 2, 2021 • 4 tweets • 2 min read
*Reproducible deep learning*: Time for exams!
To a practical course, a practical exam: I asked each student to include a new branch in the repository showcasing additional tools and libraries.
The result? *Everyone* loves some hyper-parameter optimization.
/n
Thanks to their work, you'll find practical examples of tuning hyper-parameters using @OptunaAutoML, Ax (from @facebookai), and @raydistributed Tune, with Auto-PyTorch and Talos coming soon.
An emerging approach in generative modelling that is gathering more and more attention.
If you are interested, I collected some introductory material and thoughts in a small thread.
Feel free to weigh in with additional material!
/n
An amazing property of diffusion models is simplicity.
You define a probabilistic chain that gradually "noises" the input image until only white noise remains.
Then, generation is done by learning to reverse this chain. In many cases, the two directions have similar form.
/n
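As a rough illustration of the forward ("noising") chain, here is a NumPy sketch. It assumes a Gaussian chain with a linear variance schedule; the schedule values, the number of steps `T`, and the helper `forward_chain` are my own choices for the example, not something stated in the thread. The learned reverse direction is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))   # toy "image"

T = 1000                                   # number of noising steps (assumed)
betas = np.linspace(1e-4, 0.02, T)         # assumed linear variance schedule

def forward_chain(x0, betas, rng):
    """Apply x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps, step by step."""
    x = x0.copy()
    for beta in betas:
        eps = rng.normal(size=x.shape)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps
    return x

xT = forward_chain(x0, betas, rng)

# Fraction of the original signal left after T steps: sqrt(prod_t (1 - beta_t)).
signal_scale = np.sqrt(np.cumprod(1.0 - betas)[-1])
```

With these assumed values, `signal_scale` is well below 1%, so `xT` is essentially white noise. Generation then amounts to learning a model that undoes one such step at a time.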
Jun 14, 2021 • 4 tweets • 2 min read
*LocoProp: Enhancing BackProp via Local Loss Optimization*
by @esiamid, @_arohan_ & Warmuth
Interesting approach to bridge the gap between first-order, second-order, and "local" optimization approaches.
/n
The key idea is to use a single GD step to define auxiliary local targets for each layer, either at the level of pre- or post-activations.
Then, optimization is done by solving local "matching" problems wrt these new variables.
/n
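The two steps can be sketched on a single linear layer as follows. This is a heavily simplified illustration of the idea as described above, not the paper's actual algorithm: the squared local loss, the step sizes `gamma` and `lr`, and the number of local iterations are all assumptions, and the paper's formulation may differ (for instance in its choice of local loss and regularization).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))         # inputs to the layer (batch, features)
T_out = rng.normal(size=(32, 3))     # targets of the global squared loss
W = rng.normal(size=(5, 3)) * 0.1    # weights of one linear layer

def global_loss(W):
    return 0.5 * np.mean(np.sum((X @ W - T_out) ** 2, axis=1))

# Step 1: a single gradient step on the pre-activations defines the local target.
A = X @ W                            # pre-activations
grad_A = A - T_out                   # per-sample gradient of the squared loss
gamma = 0.5                          # assumed step size for the target
A_target = A - gamma * grad_A

# Step 2: solve the local "matching" problem min_W ||X W - A_target||^2
# with a few gradient steps on the weights.
W_new = W.copy()
lr = 0.05                            # assumed local learning rate
for _ in range(20):
    residual = X @ W_new - A_target
    W_new -= lr * X.T @ residual / len(X)

loss_before = global_loss(W)
loss_after = global_loss(W_new)
```

In a deep network, each layer would get its own target and its own local problem, so the per-layer updates can use several cheap iterations.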
May 11, 2021 • 8 tweets • 5 min read
*Reproducible Deep Learning*
The first two exercises are out!
We start quick and easy, with some simple manipulation of Git branches, scripting, audio classification, and configuration with @Hydra_Framework.
Small thread with all information /n
Reproducibility is associated with production environments and MLOps, but today it is also a major concern in the research community.
Graph networks are limited to pairwise interactions. How to include higher-order components?
Read more below /n
The paper considers simplicial complexes, nice mathematical objects where having a certain component (e.g., a 3-way interaction in the graph) means also having all the lower level interactions (e.g., all pairwise interactions between the 3 objects). /n
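The closure property is easy to check in code. Below is a small pure-Python sketch (the helper names `downward_closure` and `is_simplicial_complex` are my own, not from the paper), where each simplex is a set of node labels.

```python
from itertools import combinations

def downward_closure(simplices):
    """Return the given simplices plus all of their non-empty subsets."""
    closed = set()
    for s in simplices:
        s = frozenset(s)
        for k in range(1, len(s) + 1):
            closed.update(frozenset(c) for c in combinations(sorted(s), k))
    return closed

def is_simplicial_complex(simplices):
    """True if every non-empty subset of every simplex is also present."""
    given = {frozenset(s) for s in simplices}
    return downward_closure(given) == given

# A 3-way interaction alone is not a simplicial complex...
triangle_only = [{"a", "b", "c"}]
# ...but it becomes one once the 3 edges and 3 nodes are added.
full_complex = downward_closure(triangle_only)
```

Here `full_complex` contains 7 simplices: the triangle, its 3 pairwise edges, and its 3 nodes.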
May 8, 2021 • 7 tweets • 5 min read
*MLP-Mixer: An all-MLP Architecture for Vision*
It's all over Twitter!
A new, cool architecture that mixes several ideas from MLPs, CNNs, and ViTs, while trying to keep everything as simple as possible.
Small thread below. /n
The idea is strikingly simple:
(i) transform an image into a sequence of patches;
(ii) alternately apply an MLP to each patch, and to each feature wrt all patches.
Mathematically, it is equivalent to applying an MLP on rows and columns of the matrix of patches. /n
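The rows-and-columns view fits in a few lines of NumPy. This is a simplified sketch, not the paper's exact block: it uses ReLU in the MLPs, skips the per-patch linear embedding and the normalization layers, and the sizes and helper names (`to_patches`, `mlp`, `mixer_block`) are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_patches(image, p):
    """Split an (H, W, C) image into non-overlapping p x p patches, one per row."""
    H, W, C = image.shape
    patches = image.reshape(H // p, p, W // p, p, C).transpose(0, 2, 1, 3, 4)
    return patches.reshape(-1, p * p * C)

def mlp(x, W1, W2):
    """Two-layer MLP applied independently to every row of x."""
    return np.maximum(x @ W1, 0.0) @ W2

image = rng.normal(size=(32, 32, 3))
X = to_patches(image, p=8)          # matrix of patches: (16 patches, 192 features)
S, D = X.shape
hidden = 64                         # assumed hidden width

Wt1, Wt2 = rng.normal(size=(S, hidden)) * 0.1, rng.normal(size=(hidden, S)) * 0.1
Wc1, Wc2 = rng.normal(size=(D, hidden)) * 0.1, rng.normal(size=(hidden, D)) * 0.1

def mixer_block(X):
    # MLP on the columns: one feature across all patches.
    X = X + mlp(X.T, Wt1, Wt2).T
    # MLP on the rows: one patch across all its features.
    X = X + mlp(X, Wc1, Wc2)
    return X

Y = mixer_block(X)
```

The transpose is the whole trick: the same kind of MLP is reused, once along each axis of the patch matrix, and the output keeps the input's shape so blocks can be stacked.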