On the transformer side of #acl2020nlp, three works stood out to me as relevant if you've followed the Illustrated Transformer/BERT series on my blog:
1- SpanBERT
2- BART
3- Quantifying Attention Flow
(1/n)
SpanBERT (by @mandarjoshi_ @danqi_chen @YinhanL @dsweld @LukeZettlemoyer @omerlevy_) came out last year but was published in this year's ACL. It found that BERT-style pre-training works better when you mask contiguous spans of tokens, rather than the scattered individual tokens of BERT's 15% masking.
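A minimal sketch of the contrast (my own illustration, not the authors' code; span lengths here are drawn uniformly for simplicity, whereas the paper samples them from a clipped geometric distribution):

```python
# Illustrative sketch: scattered token masking (BERT-style) vs.
# contiguous span masking (SpanBERT-style).
import random

MASK = "[MASK]"

def bert_style_mask(tokens, mask_prob=0.15, seed=0):
    """Mask roughly 15% of positions, each chosen independently."""
    rng = random.Random(seed)
    return [MASK if rng.random() < mask_prob else t for t in tokens]

def spanbert_style_mask(tokens, mask_budget=0.15, max_span=5, seed=0):
    """Mask contiguous spans until roughly 15% of tokens are covered."""
    rng = random.Random(seed)
    tokens = list(tokens)
    target = max(1, int(len(tokens) * mask_budget))
    masked = 0
    while masked < target:
        span_len = rng.randint(1, max_span)
        start = rng.randrange(0, len(tokens) - span_len + 1)
        for i in range(start, start + span_len):
            if tokens[i] != MASK:
                tokens[i] = MASK
                masked += 1
    return tokens

sentence = "an american football game was played to determine the champion".split()
print(bert_style_mask(sentence))
print(spanbert_style_mask(sentence))
```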
BART (@ml_perception @YinhanL @gh_marjan @omerlevy_ @vesko_st @LukeZettlemoyer) presents a way to bring what we've learned from BERT (and SpanBERT) pre-training back into encoder-decoder models, which are especially important for summarization, machine translation, and chatbots. 3/n #acl2020nlp
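If you want to try a pre-trained BART checkpoint on summarization, a sketch along these lines should work, assuming the Hugging Face transformers library and the facebook/bart-large-cnn checkpoint are available (the input text and generation settings are just illustrative):

```python
# Illustrative usage sketch (not from the thread): summarizing text with a
# pre-trained BART encoder-decoder via Hugging Face transformers.
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

article = (
    "The ACL 2020 conference was held online this year. "
    "Researchers presented work on pre-training, summarization, and attention analysis."
)

inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(
    inputs["input_ids"],
    num_beams=4,        # beam search, common for summarization
    max_length=60,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```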
In "Quantifying Attention Flow", @samiraabnar and Willem Zuidema show that in higher/later transformer blocks, you shouldn't rely on raw attention weights to tell which tokens are being attended to.

aclweb.org/anthology/2020…

4/n #acl2020nlp
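Their paper proposes post-hoc alternatives to raw attention; one of them, "attention rollout", composes the per-layer attention maps (mixed with the identity to account for residual connections). A rough sketch of that idea, assuming head-averaged attention matrices (my own illustration, not the released code):

```python
# Illustrative sketch of attention rollout (Abnar & Zuidema):
# combine each layer's attention with the residual connection, then
# multiply across layers to estimate how much each input token
# contributes to each position at the final layer.
import numpy as np

def attention_rollout(attentions):
    """attentions: list of per-layer matrices, shape (seq_len, seq_len),
    rows summing to 1, already averaged over heads."""
    seq_len = attentions[0].shape[0]
    rollout = np.eye(seq_len)
    for A in attentions:
        A_res = 0.5 * A + 0.5 * np.eye(seq_len)            # account for residuals
        A_res = A_res / A_res.sum(axis=-1, keepdims=True)  # keep rows normalized
        rollout = A_res @ rollout                          # compose with earlier layers
    return rollout  # rollout[i, j]: estimated contribution of input token j to position i

# Toy example: random attention maps for a 3-layer model over 4 tokens.
rng = np.random.default_rng(0)
atts = [rng.dirichlet(np.ones(4), size=4) for _ in range(3)]
print(attention_rollout(atts))
```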
(I now strongly prefer visualizing transformers with the input entering at the top and flowing down to the bottom, as that reads more naturally as one scrolls down a page. But we're kinda stuck with language like "top layer" referring to the last layer.)
SpanBERT paper from TACL (thank you @danqi_chen!):
mitpressjournals.org/doi/pdf/10.116…

BART paper:
aclweb.org/anthology/2020…