I also have a brand new lab session on multi-task audio classification, using #TensorFlow Hub, @huggingface Datasets, the pre-trained Wav2Vec port by @7vasudevgupta, and a language identification dataset:
A practical course deserves a practical exam: I asked each student to add a new branch to the repository showcasing additional tools and libraries.
The result? *Everyone* loves some hyper-parameter optimization.
/n
Thanks to their work, you'll find practical examples of hyper-parameter tuning using @OptunaAutoML, Ax (from @facebookai), and @raydistributed Tune, with Auto-PyTorch and Talos coming soon.
Diffusion models: an emerging approach in generative modelling that is gathering more and more attention.
If you are interested, I collected some introductory material and thoughts in a small thread.
Feel free to weigh in with additional material!
/n
An amazing property of diffusion models is simplicity.
You define a probabilistic chain that gradually "noises" the input image until only white noise remains.
Then, generation is done by learning to reverse this chain. In many cases, the two directions have a similar form.
/n
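To make the noising chain concrete, here is a minimal sketch in plain Python. It assumes Gaussian noise and a linear variance schedule, which is a common choice but not something this thread specifies; the function names, the schedule values, and the toy "image" are all illustrative.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T (assumed linear schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_chain(x0, betas, rng):
    """Run the noising chain: x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps."""
    x = list(x0)
    for beta in betas:
        keep, noise = math.sqrt(1.0 - beta), math.sqrt(beta)
        x = [keep * v + noise * rng.gauss(0.0, 1.0) for v in x]
    return x


def signal_fraction(betas):
    """Product of (1 - beta_t): how much of the original signal survives the chain."""
    frac = 1.0
    for beta in betas:
        frac *= 1.0 - beta
    return frac


betas = linear_beta_schedule(1000)
rng = random.Random(0)
x0 = [1.0] * 500                      # toy "image": 500 pixels, all equal to 1
xT = forward_chain(x0, betas, rng)

mean = sum(xT) / len(xT)
var = sum((v - mean) ** 2 for v in xT) / len(xT)
print(f"surviving signal: {signal_fraction(betas):.2e}, mean: {mean:.3f}, var: {var:.3f}")
```

After the full chain the sample should look like unit-variance white noise with almost no trace of the input. A generative model would then be trained to undo one step at a time, which is not shown here.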
The starting point for diffusion models is probably "Deep Unsupervised Learning using Nonequilibrium Thermodynamics" by @jaschasd, Weiss, @niru_m & @SuryaGanguli
*LocoProp: Enhancing BackProp via Local Loss Optimization*
by @esiamid, @_arohan_ & Warmuth
Interesting approach to bridge the gap between first-order, second-order, and "local" optimization approaches.
/n
The key idea is to use a single GD step to define auxiliary local targets for each layer, either at the level of pre- or post-activations.
Then, optimization is done by solving local "matching" problems wrt these new variables.
/n
What is intriguing is that the framework interpolates between multiple scenarios: the first step of the local solver recovers the original GD update, while the closed-form solution (in one case) is similar to a pre-conditioned GD update. Optimization is "local" in the sense that it decouples across layers.
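A toy sketch of the idea for a single linear layer, in plain Python. This is not the authors' implementation: the squared loss, the target on pre-activations, the step sizes, and all variable names are assumptions chosen to keep the example small.

```python
def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def outer(u, v):
    return [[a * b for b in v] for a in u]


# Toy layer and data (illustrative values only).
W = [[0.5, -0.2, 0.3], [0.3, 0.8, -0.5]]
x = [1.0, 2.0, -1.0]
y = [0.0, 1.0]
gamma = 0.1                                   # step size used to build the target

a = matvec(W, x)                              # pre-activations
grad_a = [ai - yi for ai, yi in zip(a, y)]    # dL/da for L = 0.5 * ||a - y||^2
backprop_grad = outer(grad_a, x)              # dL/dW, the usual backprop gradient

# Local target: one GD step taken on the pre-activations.
target = [ai - gamma * gi for ai, gi in zip(a, grad_a)]


def local_loss(W_cur):
    """Local matching problem: 0.5 * ||W_cur x - target||^2."""
    return 0.5 * sum((ai - ti) ** 2 for ai, ti in zip(matvec(W_cur, x), target))


def local_grad(W_cur):
    residual = [ai - ti for ai, ti in zip(matvec(W_cur, x), target)]
    return outer(residual, x)


# At the starting weights, the local gradient is gamma times the backprop gradient.
first_step = local_grad(W)

# Taking more local steps moves the layer closer to its target.
lr = 0.05
W_new = [row[:] for row in W]
for _ in range(20):
    g = local_grad(W_new)
    W_new = [[w - lr * gij for w, gij in zip(row, grow)] for row, grow in zip(W_new, g)]

print(local_loss(W), local_loss(W_new))
```

With several layers, each one would get its own target and its own matching problem, so the inner loops can run independently per layer.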
Graph networks are limited to pairwise interactions. How can we include higher-order components?
Read more below /n
The paper considers simplicial complexes, nice mathematical objects where having a certain component (e.g., a 3-way interaction in the graph) means also having all the lower level interactions (e.g., all pairwise interactions between the 3 objects). /n
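The closure property can be checked in a few lines of Python. This is only an illustrative sketch: simplices are represented as frozensets of vertex ids, and the helper names `closure` and `is_simplicial_complex` are made up for this example.

```python
from itertools import combinations


def closure(top_simplices):
    """Add every non-empty subset of each given simplex."""
    complex_ = set()
    for simplex in top_simplices:
        vertices = tuple(sorted(simplex))
        for size in range(1, len(vertices) + 1):
            complex_.update(frozenset(c) for c in combinations(vertices, size))
    return complex_


def is_simplicial_complex(simplices):
    """True if every non-empty subset of every simplex is also present."""
    simplices = set(simplices)
    return closure(simplices) == simplices


triangle_only = {frozenset({0, 1, 2})}        # a 3-way interaction on its own
K = closure(triangle_only)                    # plus its 3 edges and 3 vertices

print(is_simplicial_complex(triangle_only))   # the bare triangle is not closed
print(sorted(sorted(s) for s in K))
```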
Simplicial complexes have several notions of "adjacency" (four in total), considering both lower and upper interactions.
They first propose an extension of the Weisfeiler-Lehman test that includes all four of them, showing it is slightly more powerful than standard WL. /n
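A small Python sketch of the four adjacencies, assuming they are the usual ones from the simplicial complex literature: boundary, co-boundary, lower adjacency (sharing a face), and upper adjacency (sharing a co-face). The naming and the helper functions are assumptions for illustration, not taken from the paper's code.

```python
from itertools import combinations


def closure(top_simplices):
    """Build a simplicial complex by adding all non-empty subsets."""
    complex_ = set()
    for simplex in top_simplices:
        vertices = tuple(sorted(simplex))
        for size in range(1, len(vertices) + 1):
            complex_.update(frozenset(c) for c in combinations(vertices, size))
    return complex_


def boundary(s, K):
    """Faces of s: simplices one dimension lower that are contained in s."""
    return {t for t in K if len(t) == len(s) - 1 and t < s}


def coboundary(s, K):
    """Co-faces of s: simplices one dimension higher that contain s."""
    return {t for t in K if len(t) == len(s) + 1 and s < t}


def lower_adjacent(s, K):
    """Same-dimension simplices sharing a face with s."""
    return {t for t in K if t != s and len(t) == len(s) and boundary(s, K) & boundary(t, K)}


def upper_adjacent(s, K):
    """Same-dimension simplices sharing a co-face with s."""
    return {t for t in K if t != s and len(t) == len(s) and coboundary(s, K) & coboundary(t, K)}


# A filled triangle {0, 1, 2} with a dangling edge {2, 3}.
K = closure([{0, 1, 2}, {2, 3}])
edge = frozenset({0, 1})
tail = frozenset({2, 3})

print(sorted(sorted(t) for t in boundary(edge, K)))
print(sorted(sorted(t) for t in coboundary(edge, K)))
print(sorted(sorted(t) for t in lower_adjacent(tail, K)))
print(sorted(sorted(t) for t in upper_adjacent(tail, K)))
```

On plain graphs only the upper adjacency between vertices is non-trivial, which is the usual neighbourhood that standard WL uses.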