We address the following three key questions about multi-head self-attentions (MSAs) and ViTs (a minimal MSA sketch follows the questions for reference):
Q1. What properties of MSAs do we need to better optimize NNs?
Q2. Do MSAs act like Convs? If not, how are they different?
Q3. How can we harmonize MSAs with Convs?
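For concreteness, the sketch below shows a minimal multi-head self-attention block under the standard scaled dot-product formulation; it is an illustrative assumption, not the paper's implementation, and the names (`MSA`, `dim`, `num_heads`) are hypothetical. It highlights the property Q2 contrasts with Convs: each output token is a data-dependent, softmax-weighted average over all tokens, rather than a fixed local filter.

```python
# Minimal multi-head self-attention (MSA) sketch in PyTorch.
# Assumes the standard scaled dot-product attention; hyperparameters are illustrative.
import torch
import torch.nn as nn


class MSA(nn.Module):
    def __init__(self, dim: int = 64, num_heads: int = 4):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)   # joint query/key/value projection
        self.proj = nn.Linear(dim, dim)      # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, C = x.shape                    # batch, tokens, channels
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4) # each: (B, heads, N, head_dim)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        attn = attn.softmax(dim=-1)          # data-dependent weights over all tokens
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)


if __name__ == "__main__":
    x = torch.randn(2, 16, 64)               # 16 tokens of dimension 64
    print(MSA()(x).shape)                     # torch.Size([2, 16, 64])
```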