#NeurIPS2022
What are ideal representations for self-sup. learning (SSL)?
🤓We give simple optimality conditions and use them to improve/understand/derive SSL methods!
🔥outperform baselines on ImageNet
arxiv.org/abs/2209.06235
w. @tatsu_hashimoto @StefanoErmon @percyliang
🧵
Goal: ideally, representations should allow linear probes to perfectly predict any task that is invariant to the augmentations, in the most sample-efficient way
Q: Which of the following representations is optimal?
2/8
A: the last one.
More generally, we show that a representation is optimal if and only if:
1. *Predictability*: linear probes can predict the equivalence classes
2. *High dimension*: representation dimension d = (# equivalence classes) - 1
3. *Invariance*: representations of equivalent examples collapse to a single point
(toy sketch of all three below)
3/8
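A toy check of the three conditions (illustrative Python, not the paper's code): with K equivalence classes, mapping every example of class k to the k-th vertex of a centered simplex gives an invariant, (K-1)-dimensional, linearly predictable representation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy check of the three optimality conditions (illustrative only).
K = 5                              # number of equivalence classes
vertices = np.eye(K) - 1.0 / K     # centered one-hots: a rank-(K-1) simplex

rng = np.random.default_rng(0)
labels = rng.integers(0, K, size=1000)
Z = vertices[labels]               # invariance: same class -> same point

# Predictability: a linear probe predicts the classes perfectly.
probe = LogisticRegression(max_iter=1000).fit(Z, labels)
assert probe.score(Z, labels) == 1.0

# High dimension: the representation spans exactly K - 1 dimensions.
assert np.linalg.matrix_rank(Z) == K - 1
```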
Key: ideal SSL = supervised classification from a high-dimensional space to the equivalence classes, using the probing architecture
This leads to a unifying SSL framework (contrastive or not) with actionable insights, e.g., how to
- choose projection heads
- choose the dimensionality
- simplify non-contrastive SSL
4/8
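A minimal sketch of that unified view (`encoder`, `linear_probe`, and `augment` are hypothetical callables, not the paper's code); here each source image defines its own equivalence class, one common special case:

```python
import torch
import torch.nn.functional as F

# Ideal SSL as supervised classification to equivalence classes,
# predicted from the representation through the probing architecture.
def ideal_ssl_step(encoder, linear_probe, augment, images):
    labels = torch.arange(len(images))     # equivalence class = source image
    views = augment(images)                # augmented members of each class
    logits = linear_probe(encoder(views))  # probing architecture: linear head
    return F.cross_entropy(logits, labels)
```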
**Dimension**
We just showed that the representation's dimensionality should ideally be the number of equivalence classes minus one => much larger than current practice
Smartly increasing the dimensionality has a huge impact on performance without increasing parameters!!
≥ 2% acc gains on ImageNet
5/8
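For instance, here is one parameter-free way to enlarge a ResNet-50 representation (an illustrative assumption, not necessarily the paper's construction; see the repo):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

# A standard ResNet-50 average-pools its 7x7x2048 map down to 2048 dims;
# pooling to 2x2 instead keeps 4x the dims with zero extra parameters.
backbone = resnet50()
backbone.avgpool = nn.AdaptiveAvgPool2d((2, 2))  # was (1, 1)
backbone.fc = nn.Identity()                      # expose the representation

z = backbone(torch.randn(8, 3, 224, 224))
print(z.shape)  # torch.Size([8, 8192]) -- 4x the usual 2048
```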
**Projection heads**
Current SSL uses two Siamese networks with MLP projection heads
We prove that one head should be linear
Intuition: representations should be pretrained the way they will be used downstream.
Linear probing => one linear projection head
This gives ≥ 1% acc gains
6/8
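A minimal sketch of the asymmetric-head setup (sizes and names are illustrative): one branch keeps the usual MLP head, the other uses a single linear layer, mirroring downstream linear probing.

```python
import torch.nn as nn

dim, hid, out = 2048, 2048, 256  # illustrative sizes

# Standard MLP head on one branch...
mlp_head = nn.Sequential(
    nn.Linear(dim, hid), nn.BatchNorm1d(hid), nn.ReLU(), nn.Linear(hid, out)
)
# ...and a single linear head on the other, matching the linear probe.
linear_head = nn.Linear(dim, out)

def project(z1, z2):
    # The two augmented views go through different heads before the SSL loss.
    return linear_head(z1), mlp_head(z2)
```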
**Non-contrastive SSL**
We show that most prior non-contrastive objectives are approximations of optimal SSL
We provide DISSL: a much simpler objective (no stop-gradients / no EMA / no Sinkhorn) that better approximates optimal SSL
DISSL outperforms SwAV/DINO
7/8
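A hedged sketch of a DISSL-style objective, in the spirit of the description rather than the paper's exact loss: the two views teach each other's soft cluster assignments, and a marginal-entropy term prevents collapse.

```python
import torch
import torch.nn.functional as F

def dissl_style_loss(logits_a, logits_b, temp=1.0):
    # Soft cluster assignments for each augmented view.
    p_a = F.softmax(logits_a / temp, dim=-1)
    p_b = F.softmax(logits_b / temp, dim=-1)

    # Invariance: each view predicts the other's cluster assignment.
    invariance = -(p_a * torch.log(p_b + 1e-8)).sum(-1).mean() \
                 - (p_b * torch.log(p_a + 1e-8)).sum(-1).mean()

    # Anti-collapse: maximize the entropy of the batch-average assignment
    # so all clusters get used -- instead of stop-gradients/EMA/Sinkhorn.
    p_avg = 0.5 * (p_a.mean(0) + p_b.mean(0))
    marginal_entropy = -(p_avg * torch.log(p_avg + 1e-8)).sum()

    return invariance - marginal_entropy
```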
Other actionable insights in the paper, e.g.:
- how to perform SSL for non-linear probes
- how to choose augmentations
If you are at #NeurIPS2022 come to our poster Hall J #905 tomorrow 4-6pm
Code and pretrained ImageNet models: github.com/YannDubs/Invar…
8/8
Many ideas come from prior work with great collaborators
-ideal supervised repr. arxiv.org/abs/2201.00057
-ideal robust repr. arxiv.org/abs/2201.00057
-invariance&compression arxiv.org/abs/2106.10800
@douwekiela @davidjschwab @rama_vedantam @YangjunR @cjmaddison Ben @karen_ullrich
Grateful for all discussions/feedback on SSL and visualizations from:
@ananyaku @shengjia_zhao @rtaori13 @mo_tiwari @sangmichaelxie @niladrichat @ShibaniSan @baaadas @chenlin_meng @MayeeChen @AlexTamkin @YangjunR @malikrali @jhaochenz @RishiBommasani @kaylburns @manim_community