#NeurIPS2022
What are ideal representations for self-sup. learning (SSL)?
🤓We give simple optimality conditions and use them to improve/understand/derive SSL methods!
🔥outperform baselines on ImageNet
arxiv.org/abs/2209.06235
w. @tatsu_hashimoto @StefanoErmon @percyliang
🧵
Goal: ideally, representations should allow linear probes to perfectly predict any task that is invariant to the augmentations, in the most sample-efficient way
Q: Which of the following representations is optimal?
2/8
A: the last one.
More generally, we show that a representation is optimal if and only if:
1. *Predictability*: linear probes can predict the equivalence classes
2. *High dimension*: representation dimension d = (# equivalence classes) - 1
3. *Invariance*: representations of equivalent examples collapse to a single point
(toy sketch of all three below)
3/8
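A toy check of the three conditions (illustrative Python, not the paper's code): with K equivalence classes, mapping every example of class k to the k-th vertex of a centered simplex gives an invariant, (K-1)-dimensional, linearly predictable representation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy check of the three optimality conditions (illustrative only).
K = 5                              # number of equivalence classes
vertices = np.eye(K) - 1.0 / K     # centered one-hots: a rank-(K-1) simplex

rng = np.random.default_rng(0)
labels = rng.integers(0, K, size=1000)
Z = vertices[labels]               # invariance: same class -> same point

# Predictability: a linear probe predicts the classes perfectly.
probe = LogisticRegression(max_iter=1000).fit(Z, labels)
assert probe.score(Z, labels) == 1.0

# High dimension: the representation spans exactly K - 1 dimensions.
assert np.linalg.matrix_rank(Z) == K - 1
```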
Key: ideal SSL = supervised classification from a high-dimensional space to the equivalence classes, using the probing architecture
This leads to a unifying SSL framework (contrastive or not) with actionable insights, e.g., how to
- choose projection heads
- choose the dimensionality
- simplify non-contrastive SSL
4/8
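A minimal sketch of that unified view (`encoder`, `linear_probe`, and `augment` are hypothetical callables, not the paper's code); here each source image defines its own equivalence class, one common special case:

```python
import torch
import torch.nn.functional as F

# Ideal SSL as supervised classification to equivalence classes,
# predicted from the representation through the probing architecture.
def ideal_ssl_step(encoder, linear_probe, augment, images):
    labels = torch.arange(len(images))     # equivalence class = source image
    views = augment(images)                # augmented members of each class
    logits = linear_probe(encoder(views))  # probing architecture: linear head
    return F.cross_entropy(logits, labels)
```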
**Dimension**
We just showed that the representation's dimensionality should ideally be the number of equivalence classes minus one => much larger than current practice
Smartly increasing the dimensionality has a huge impact on performance without increasing parameters!!
≥ 2% acc gains on ImageNet
5/8
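For instance, here is one parameter-free way to enlarge a ResNet-50 representation (an illustrative assumption, not necessarily the paper's construction; see the repo):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

# A standard ResNet-50 average-pools its 7x7x2048 map down to 2048 dims;
# pooling to 2x2 instead keeps 4x the dims with zero extra parameters.
backbone = resnet50()
backbone.avgpool = nn.AdaptiveAvgPool2d((2, 2))  # was (1, 1)
backbone.fc = nn.Identity()                      # expose the representation

z = backbone(torch.randn(8, 3, 224, 224))
print(z.shape)  # torch.Size([8, 8192]) -- 4x the usual 2048
```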
**Projection heads**
Current SSL uses two Siamese networks with MLP projection heads
We prove that one head should be linear
Intuition: representations should be pretrained the way they will be used downstream.
Linear probing => one linear projection head
This gives ≥ 1% acc gains
6/8
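A minimal sketch of the asymmetric-head setup (sizes and names are illustrative): one branch keeps the usual MLP head, the other uses a single linear layer, mirroring downstream linear probing.

```python
import torch.nn as nn

dim, hid, out = 2048, 2048, 256  # illustrative sizes

# Standard MLP head on one branch...
mlp_head = nn.Sequential(
    nn.Linear(dim, hid), nn.BatchNorm1d(hid), nn.ReLU(), nn.Linear(hid, out)
)
# ...and a single linear head on the other, matching the linear probe.
linear_head = nn.Linear(dim, out)

def project(z1, z2):
    # The two augmented views go through different heads before the SSL loss.
    return linear_head(z1), mlp_head(z2)
```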
**Non-contrastive SSL**
We show that most prior non-contrastive objectives are approximations of optimal SSL
We provide DISSL: a much simpler objective (no stop-gradients / no EMA / no Sinkhorn) that better approximates optimal SSL
DISSL outperforms SwAV/DINO
7/8
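A hedged sketch of a DISSL-style objective, in the spirit of the description rather than the paper's exact loss: the two views teach each other's soft cluster assignments, and a marginal-entropy term prevents collapse.

```python
import torch
import torch.nn.functional as F

def dissl_style_loss(logits_a, logits_b, temp=1.0):
    # Soft cluster assignments for each augmented view.
    p_a = F.softmax(logits_a / temp, dim=-1)
    p_b = F.softmax(logits_b / temp, dim=-1)

    # Invariance: each view predicts the other's cluster assignment.
    invariance = -(p_a * torch.log(p_b + 1e-8)).sum(-1).mean() \
                 - (p_b * torch.log(p_a + 1e-8)).sum(-1).mean()

    # Anti-collapse: maximize the entropy of the batch-average assignment
    # so all clusters get used -- instead of stop-gradients/EMA/Sinkhorn.
    p_avg = 0.5 * (p_a.mean(0) + p_b.mean(0))
    marginal_entropy = -(p_avg * torch.log(p_avg + 1e-8)).sum()

    return invariance - marginal_entropy
```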
Other actionable insights in the paper, e.g.:
- how to perform SSL for non-linear probes
- how to choose augmentations
If you are at #NeurIPS2022 come to our poster Hall J #905 tomorrow 4-6pm
Code and pretrained ImageNet models: github.com/YannDubs/Invar…
8/8
Many ideas come from prior work with great collaborators
-ideal supervised repr. arxiv.org/abs/2201.00057
-ideal robust repr. arxiv.org/abs/2201.00057
-invariance&compression arxiv.org/abs/2106.10800
@douwekiela @davidjschwab @rama_vedantam @YangjunR @cjmaddison Ben @karen_ullrich
Grateful for all discussions/feedback on SSL and visualizations from:
@ananyaku @shengjia_zhao @rtaori13 @mo_tiwari @sangmichaelxie @niladrichat @ShibaniSan @baaadas @chenlin_meng @MayeeChen @AlexTamkin @YangjunR @malikrali @jhaochenz @RishiBommasani @kaylburns @manim_community