Discover and read the best of Twitter Threads about #ICLR2021

*Score-based diffusion models*

An emerging approach in generative modelling that is gathering more and more attention.

If you are interested, I collected some introductory material and thoughts in a small thread. 👇

Feel free to weigh in with additional material!

/n
An amazing property of diffusion models is simplicity.

You define a probabilistic chain that gradually "noises" the input image until only white noise remains.

Then, generation is done by learning to reverse this chain. In many cases, the two directions have similar form.

/n
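For concreteness, here is a minimal numpy sketch of the forward (noising) direction, assuming a simple linear variance schedule; the generative model is trained to reverse this chain.

```python
import numpy as np

def forward_diffuse(x0, betas, seed=0):
    """Run the forward (noising) chain: one Gaussian corruption per step.

    x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps,  eps ~ N(0, I).
    After enough steps, x_T is indistinguishable from white noise.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for beta in betas:
        eps = rng.standard_normal(x.shape)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps
    return x

# Toy usage: a constant 28x28 "image" diffused over 1000 steps.
betas = np.linspace(1e-4, 0.02, 1000)   # assumed linear schedule
xT = forward_diffuse(np.ones((28, 28)), betas)
```
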
The starting point for diffusion models is probably "Deep Unsupervised Learning using Nonequilibrium Thermodynamics" by @jaschasd Weiss @niru_m @SuryaGanguli

Classic paper, definitely worth reading: arxiv.org/abs/1503.03585

/n
*Weisfeiler and Lehman Go Topological*

Fantastic #ICLR2021 paper by @CristianBodnar @ffabffrasca @wangyg85 @kneppkatt Montúfar @pl219_Cambridge @mmbronstein

Graph networks are limited to pairwise interactions. How to include higher-order components?

Read more below 👇 /n
The paper considers simplicial complexes, nice mathematical objects where having a certain component (e.g., a 3-way interaction in the graph) means also having all the lower level interactions (e.g., all pairwise interactions between the 3 objects). /n
Simplicial complexes have several notions of "adjacency" (four in total), covering lower- and upper-level interactions.

They first propose an extension of the Weisfeiler-Lehman test that includes all four of them, showing it is slightly more powerful than standard WL. /n
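For reference, here is a minimal sketch of the standard 1-WL colour-refinement test that the paper generalizes (their simplicial version aggregates over all four adjacencies instead of just graph neighbours):

```python
from collections import Counter

def wl_colors(adj, rounds=3):
    """1-dimensional Weisfeiler-Lehman colour refinement.

    adj: dict mapping node -> list of neighbours.
    Each round, a node's new colour is its old colour plus the multiset
    of its neighbours' colours, compressed back to small integers.
    """
    colors = {v: 0 for v in adj}  # start with a uniform colouring
    for _ in range(rounds):
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {v: palette[sigs[v]] for v in adj}
    return Counter(colors.values())  # colour histogram

# Different histograms certify non-isomorphism (equal ones prove nothing):
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
assert wl_colors(triangle) != wl_colors(path)
```
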
📢SPICEs: Survey Papers as Interactive Cheat-sheet Embeddings at #ICLR2021 @rethinkmlpapers workshop

Portal: bit.ly/3vLSdJx
Github: bit.ly/3utDHFT
Video:

Credits: @omarsar0 for the inspiration.
Co-authors: @joewhaley @MatthewMcAteer0
(2/n) Examples: Want to know more about what machine learners mean when they say "X is all you need"?
We surveyed all 80-odd of them here.
Dataset: github.com/vinayprabhu/X-…
(3/3) X-former architecture survey you ask?
PDF: github.com/vinayprabhu/SP…
The #ICLR2021 Workshop on Enormous Language Models (WELM) is tomorrow, May 7th!

Full info: welmworkshop.github.io
Livestream: welmworkshop.github.io/livestream/
gathertown info for ICLR registrants: iclr.cc/virtual/2021/w…

Thread summarizing the talks & panels ⬇️ (1/14)
Our first talk will be by Thomas Margoni, who will provide some legal perspective on the use of web data for training large language models. He'll touch on topics like copyright law, rights, and licenses, as they pertain to training data for LMs. (2/14)
Then, @JesseDodge will give a talk on how to document datasets and improve reproducibility of research. He'll discuss the NLP reproducibility checklist, a recent study on documenting C4, and a framework for modeling bias in data. (3/14)
🚨 Our #ICLR2021 paper shows that KG-augmented models are surprisingly robust to KG perturbation! 🧐

arXiv: arxiv.org/abs/2010.12872
Code: github.com/INK-USC/deceiv…

To learn more, come find us at Poster Session 9 (May 5, 5-7PM PDT): iclr.cc/virtual/2021/p….

🧵[1/n]
KGs have helped neural models perform better on knowledge-intensive tasks and even “explain” their predictions, but are KG-augmented models really using KGs in a way that makes sense to humans?

[2/n]
We primarily investigate this question by measuring how the performance of KG-augmented models changes when the KG’s semantics and/or structure are perturbed, such that the KG becomes less human-comprehensible.

[3/n]
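To make the setup concrete, here is one toy perturbation in this spirit (an illustrative sketch, not the authors' exact procedure): shuffle relations among random triples, so the graph keeps its structure but loses its semantics.

```python
import random

def swap_relations(triples, frac=0.5, seed=0):
    """Perturb a KG's semantics while preserving its graph structure.

    triples: list of (head, relation, tail). For a random subset,
    relations are shuffled among the selected triples, so each node
    keeps its neighbours but the edge labels stop making sense.
    """
    rng = random.Random(seed)
    triples = list(triples)
    idx = rng.sample(range(len(triples)), int(frac * len(triples)))
    rels = [triples[i][1] for i in idx]
    rng.shuffle(rels)
    for i, r in zip(idx, rels):
        h, _, t = triples[i]
        triples[i] = (h, r, t)
    return triples

kg = [("aspirin", "treats", "headache"), ("aspirin", "is_a", "drug"),
      ("headache", "symptom_of", "flu"), ("flu", "is_a", "disease")]
perturbed = swap_relations(kg, frac=1.0)
```
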
Fantastic talk and Q&A by @timnitGebru at #ICLR2021

Among other things, I really appreciate how Timnit is unerasing the contributions of our retracted co-authors and how key their perspectives were to the Stochastic Parrots paper.
And so much else: @timnitGebru is absolutely brilliant at drawing connections between the research milieu, research content, geopolitics, and individual, situated lived experience.
On interdisciplinarity and the hierarchy of knowledge:

“If you have all the money, you don’t have to listen to anybody” —@timnitGebru
Come to our talks and posters at #ICLR2021 to discuss our findings on understanding and improving deep learning! Talks and posters are available now! Links to the talks, posters, papers, and code are in the thread:

1/7
When Do Curricula Work? (Oral at #ICLR2021)
with @XiaoxiaWShirley and @ethansdyer

Paper: openreview.net/forum?id=tW4QE…
Code: github.com/google-researc…
Video and Poster: iclr.cc/virtual/2021/p…

2/7
Sharpness-Aware Minimization for Efficiently Improving Generalization (Spotlight at #ICLR2021)
with @Foret_p, Ariel Kleiner, and @TheGradient

Paper: openreview.net/forum?id=6Tm1m…
Code: github.com/google-researc…
Video and Poster: iclr.cc/virtual/2021/p…

3/7
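For readers new to SAM, the core update is two gradient evaluations per step. A minimal numpy sketch on a toy loss (rho, the learning rate, and the quadratic loss are assumptions, not the paper's setup):

```python
import numpy as np

rho, lr = 0.05, 0.1              # assumed hyperparameters

def loss_grad(w):
    """Toy quadratic loss and its gradient (stand-in for a network)."""
    return 0.5 * np.sum(w ** 2), w

w = np.array([2.0, -3.0])
for _ in range(100):
    _, g = loss_grad(w)
    # Step 1: move to the (approximately) worst point in a rho-ball.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Step 2: descend using the gradient taken at that perturbed point,
    # which favours flat minima over sharp ones.
    _, g_adv = loss_grad(w + eps)
    w = w - lr * g_adv
```
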
Ecstatic to see "Machine learning research communication via illustrated and interactive web articles" published at @rethinkmlpapers workshop at #ICLR2021

In it, I describe my workflow for communicating ML to millions of readers.

Paper: openreview.net/pdf?id=WUrcJoy…

1/5
I discuss five key ML communication artifacts:
1- The hero image
2- The Twitter thread
3- The illustrated article
4- The interactive article
5- Interpretability software

Here are excellent examples of 1 and 2 from @ch402, @karpathy, and @maithra_raghu.

2/5
For illustrated/animated articles, I discuss the importance of empathy towards the reader, putting intuition first, and iteratively creating a visual language to describe concepts, and I reflect on pedagogical considerations.

3/5
Model-based planning is often thought to be necessary for deep reasoning & generalization. But the space of choices in model-based deep RL is huge. Which work well and which don't? In our new paper (accepted to #ICLR2021), we investigate! arxiv.org/abs/2011.04021 1/
Spoiler: our findings really challenged some deeply-held assumptions we had about what planning is useful for and how much planning is really needed in popular MBRL benchmarks---even some "strategic" ones like Sokoban. 2/
This is joint work with @theophaneweber, Abe Friesen, @FeryalMP, Arthur Guez, @fabiointheuk, @simswitherspoon, Thomas Anthony, Lars Buesing, and @PetarV_93 . 3/
#ICLR2021 cam-ready II: "LiftPool: Bidirectional ConvNet Pooling" w/ Jiaojiao Zhao is now available: isis-data.science.uva.nl/cgmsnoek/pub/z… No more lossy down- and upsampling when pooling! 1/n
LiftPool adopts the philosophy of the classical #Lifting #Scheme from #signal #processing. LiftDownPool decomposes a feature map into various downsized sub-bands, each of which contains information with different frequencies. Because of its invertible properties, ... 2/n
by performing LiftDownPool backwards, a corresponding up-pooling layer #LiftUpPool is able to generate a refined upsampled feature map using the detail sub-bands, which is useful for #image-to-image #translation challenges. 3/n
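For intuition, here is the classical 1-D lifting step (split, predict, update) that LiftPool builds on; a Haar-style sketch of the general scheme, not the paper's learned version:

```python
import numpy as np

def lift_down(x):
    """One lifting step: split -> predict -> update.

    Returns a coarse approximation and a detail sub-band, each half
    the input length; the transform is exactly invertible.
    """
    even, odd = x[0::2], x[1::2]
    detail = odd - even            # predict odd samples from even ones
    approx = even + detail / 2     # update: preserve the running average
    return approx, detail

def lift_up(approx, detail):
    """Invert lift_down exactly, recovering the original signal."""
    even = approx - detail / 2
    odd = even + detail
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x

x = np.arange(8.0)
a, d = lift_down(x)
assert np.allclose(lift_up(a, d), x)   # lossless, unlike max pooling
```
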
Excited to share our #ICLR2021 paper w/ CS & math depts @Stanford 🎊

Evaluating the Disentanglement of Deep Generative Models through Manifold Topology!

w/ @ericzelikman Fred Lu @AndrewYNg Gunnar Carlsson @StefanoErmon. Acknowledging @torbjornlundh Samuel Bengmark.

Thread 🧵
Before I start: camera-ready 📸 & math-inclined R5 burn 🔥 are here
openreview.net/forum?id=djwS0…

Huge appreciation for all reviewers esp R5 in making our work better.

My goal in 🧵: Explain our work in my simplest terms to you. Don't worry if you get lost, it's admittedly dense :)
Disentanglement in your generative model means dimensions in its latent space can change a corresponding feature in its data space, e.g. adapting just 1️⃣ dim can make the output "sunnier" ☁️→🌥→⛅️→🌤→☀️ Contrast w/ this entangled mess ☁️→🌥→🌩→🌪→☀️
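Mechanically, the basic disentanglement probe is a latent traversal: vary one dimension, hold the rest fixed, and watch what changes. A minimal sketch (the decoder here is a hypothetical stand-in for any trained generative model):

```python
import numpy as np

def traverse(decoder, z, dim, values):
    """Vary a single latent dimension, holding the others fixed.

    If the model is disentangled, the decoded outputs should change
    along one interpretable factor (e.g. "sunniness") and nothing else.
    """
    outs = []
    for v in values:
        z_edit = z.copy()
        z_edit[dim] = v
        outs.append(decoder(z_edit))
    return outs

# Hypothetical decoder: any function mapping latents to outputs works.
decoder = lambda z: np.tanh(z.sum())        # placeholder stand-in
z = np.zeros(16)
sweep = traverse(decoder, z, dim=3, values=np.linspace(-3, 3, 7))
```
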
1/ I'm very happy to give a little thread today on our paper accepted at ICLR 2021!

🎉🎉🎉

In this paper, we show how to build ANNs that respect Dale's law and can still be trained well with gradient descent. I will expand in this thread...

openreview.net/forum?id=eU776…
2/ Dale's law states that neurons release the same neurotransmitter from all of their axonal terminals.

en.wikipedia.org/wiki/Dale%27s_…

Practically speaking, this implies that each neuron is either purely excitatory or purely inhibitory. It's not 100%, nothing is in biology, but it's roughly true.
3/ You may have wondered, "Why don't more people use ANNs that respect Dale's law?"

The rarely discussed reason is this:

When you try to train an ANN that respects Dale's law with gradient descent, it usually doesn't work as well -- worse than an ANN that ignores Dale's law.
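One common way to hard-wire the constraint (a sketch of the general idea, not necessarily the paper's exact parameterization): give each input unit a fixed sign and train only nonnegative magnitudes.

```python
import numpy as np

class DaleLinear:
    """Linear layer whose presynaptic units are all-E or all-I.

    Each input unit gets a fixed sign (+1 excitatory, -1 inhibitory);
    the trainable magnitudes stay nonnegative, so every outgoing
    weight of a unit shares its sign, as Dale's law requires.
    The assumed 80/20 E/I split is a common cortical ratio.
    """
    def __init__(self, n_in, n_out, frac_exc=0.8, seed=0):
        rng = np.random.default_rng(seed)
        self.sign = np.where(np.arange(n_in) < int(frac_exc * n_in), 1.0, -1.0)
        self.mag = rng.uniform(0, 0.1, size=(n_in, n_out))  # trainable, >= 0

    def __call__(self, x):
        w = self.sign[:, None] * self.mag     # signed effective weights
        return x @ w

    def project(self):
        """Call after each gradient step to keep magnitudes nonnegative."""
        np.clip(self.mag, 0.0, None, out=self.mag)

layer = DaleLinear(10, 4)
y = layer(np.ones((1, 10)))
```
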
Some people say that one shouldn't care about publications, only quality. However, the job market punishes those who don’t have publications in top ML venues. I empathize with students and newcomers to ML whose good papers are not getting accepted. #ICLR2021
1/
Long thread at the risk of being judged:

I just realized that in the last 6 years, 21 of my 24 papers have been accepted to top ML conf in their FIRST submission even though the majority of them were hastily-written borderline papers (not proud of this). How is this possible?
2/
At this point, I'm convinced that this cannot be explained by a combination of luck and quality of the papers. My belief is that the current system has lots of unnecessary and sometimes harmful biases, which is #unfair to newcomers and anyone who is outside of the "norm".
3/
Semi-supervised learning with consistency regularization and pseudo-labeling works great for CLASSIFICATION.

But how about STRUCTURED PREDICTION tasks? 🤔

Check out @ylzou_Zack's #ICLR2021 paper on designing pseudo-labels for semantic segmentation.
yuliang.vision/pseudo_seg/
How do we get pseudo labels from unlabeled images?

Unlike classification, directly thresholding the network outputs for dense prediction doesn't work well.

Our idea: start with weakly sup. localization (Grad-CAM) and refine it with self-attention for propagating the scores.
Using two different prediction mechanisms is great bc they make errors in different ways. With our fusion strategy, we get WELL-CALIBRATED pseudo labels (see the expected calibration errors in E below) and IMPROVED accuracy under 1/4, 1/8, 1/16 of labeled examples.
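For reference, expected calibration error is the standard bin-wise gap between confidence and accuracy; a minimal numpy sketch over flattened per-pixel confidences:

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """Standard ECE: |accuracy - confidence| averaged over bins.

    conf: per-pixel max softmax probabilities, flattened.
    correct: boolean array, whether the argmax label was right.
    Well-calibrated pseudo-labels have accuracy ~= confidence per bin.
    """
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap
    return ece

# Synthetic perfectly-calibrated predictions give ECE close to 0.
conf = np.random.default_rng(0).uniform(size=10_000)
correct = np.random.default_rng(1).uniform(size=10_000) < conf
print(expected_calibration_error(conf, correct))
```
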
So here is an analysis of #ICLR2021 decisions.

860 accepted out of 2997 -> 29% acceptance rate
53 Orals, 114 Spotlights, 693 Posters, 1756 Rejected, 381 Withdrawn.

Thread 🧵

All decisions in one table: docs.google.com/spreadsheets/d…
Distribution of decisions based on average rating.
Orals: top-6% of accepted papers, top-2% of all papers.
Average score: 7.5, Min score: 6.67

Spotlight: top-13% of accepted papers, top-4% of all papers.
Average score: 7, Min score: 6
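The headline numbers are internally consistent; a quick sanity check:

```python
accepted = 53 + 114 + 693            # orals + spotlights + posters
total = accepted + 1756 + 381        # + rejected + withdrawn
print(accepted, total, round(100 * accepted / total))  # -> 860 2997 29
```
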
