#rep4nlp Yulia Tsvetkov talk #4

"Modeling Output spaces of NLP models" instead of the common #Bertology that focuses on Modeling input spaces only.

#ACL2019nlp
The focus of the presentation is on conditional language generation:
#MT #summarization etc.
"to be able to build diverse NLP models for 1000s of users we have to build 100ks of models for combinations of:

* Languages
* Tasks
* Domains
* User preferences"
Instead of handling each of these combinations independently, the way to tackle this is by collapsing each of these axes.
GETTING RID OF THE SOFTMAX!

"We need to get rid of the discrete softmax layer in the generation" and move towards to generation of continuous output representations
Drawbacks of softmax:

* limited vocabulary + <unk> tokens
* the slowpoke layer :)
* high memory complexity
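To make the "slowpoke" point concrete, here's an illustrative sketch (mine, not from the talk; all sizes are made up) of why the softmax head dominates parameters and compute, versus a continuous output head:

```python
import torch
import torch.nn as nn

hidden_dim, vocab_size, embed_dim = 1024, 100_000, 300  # illustrative sizes

softmax_head = nn.Linear(hidden_dim, vocab_size)    # ~102M params: one logit per word
continuous_head = nn.Linear(hidden_dim, embed_dim)  # ~0.3M params: one vector per step

h = torch.randn(32, hidden_dim)  # a batch of decoder hidden states
logits = softmax_head(h)         # (32, 100000): the slow, memory-hungry step
e_hat = continuous_head(h)       # (32, 300): cost independent of vocabulary size
```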

Several works have tried to get rid of the softmax:
* sampling-based approximations
* structure-based approximations
* subword units
A paper they published before (will add link later)

Generate into embedding spaces instead of through the softmax layer, trained using some distance loss (e.g. L2) and decoded using KNN search.
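A minimal sketch of that setup (my reconstruction, not the authors' code), assuming a frozen table E of pretrained target-language embeddings:

```python
import torch

# Assumed setup (illustrative names): E is a frozen table of pretrained
# target embeddings, shape (vocab_size, embed_dim); e_hat holds the model's
# predicted output vectors, shape (batch, embed_dim).
def l2_loss(e_hat, gold_ids, E):
    # Regress each predicted vector onto the gold word's embedding.
    return ((e_hat - E[gold_ids]) ** 2).sum(dim=-1).mean()

def knn_decode(e_hat, E):
    # At decoding time, pick the word whose embedding is nearest the prediction.
    dists = torch.cdist(e_hat, E)  # (batch, vocab_size) pairwise L2 distances
    return dists.argmin(dim=-1)    # predicted word ids
```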

However, optimizing the L2 loss is not a good approach.
Cosine loss suffers from hubness: some words become hubs (near neighbors of many other words), so KNN search doesn't decode well.

Max-margin loss works better, but brings back the same computational inefficiency as the softmax layer.
In this paper they introduced a new probabilistic loss, vMF (von Mises-Fisher), which behaves similarly to the cosine loss.
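Roughly, the model output ê parameterizes a vMF distribution over unit vectors, and the loss is the NLL of the gold unit-norm embedding e: NLLvMF(ê; e) = -log C_m(||ê||) - ê·e. A numpy/scipy sketch of my reading of the paper's loss (without its regularized variants):

```python
import numpy as np
from scipy.special import ive  # exponentially scaled modified Bessel function I_v

def log_C_m(kappa, m):
    # Normalizing constant of the vMF density:
    # C_m(k) = k^(m/2 - 1) / ((2*pi)^(m/2) * I_{m/2-1}(k)).
    # ive(v, k) = I_v(k) * exp(-k), so log I_v(k) = log(ive(v, k)) + k (stable).
    v = m / 2.0 - 1.0
    return (v * np.log(kappa)
            - (m / 2.0) * np.log(2 * np.pi)
            - (np.log(ive(v, kappa)) + kappa))

def nll_vmf(e_hat, e):
    # e_hat: unnormalized model output; e: unit-norm embedding of the gold word.
    kappa = np.linalg.norm(e_hat)
    return -log_C_m(kappa, e_hat.shape[0]) - e_hat @ e
```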
Results are close, but still lower, in BLEU score for MT compared to a baseline that uses softmax.

But it is much more time- and memory-efficient, which enables training with larger batch sizes that yield further speedups.
Drawbacks of continuous generation: synonyms and antonyms are very close in the embedding space, so models can generate semantically wrong sentences.

Benefits: they are much better at generating paraphrases!
Part 2:
Unpublished research + ongoing projects:

Phrase-based continuous generation, where the embedding space contains bigrams and phrases.

Maybe not SOTA results, but 183x speedups.
Obvious extensions:

* Transformers (possible but not yet implemented)
* doing KNN decoding efficiently (see the sketch after this list)
* other conditional language modeling tasks
* updating target embeddings during generation
* syntax-informed generation
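On the "KNN decoding efficiently" point: off-the-shelf nearest-neighbor libraries are the natural tool. A hypothetical sketch using faiss (my suggestion, not something mentioned in the talk):

```python
import numpy as np
import faiss  # nearest-neighbor search library (assumed available)

embed_dim, vocab_size = 300, 100_000  # illustrative sizes
E = np.random.randn(vocab_size, embed_dim).astype("float32")  # stand-in embedding table

index = faiss.IndexFlatL2(embed_dim)  # exact search; IVF/HNSW indexes trade accuracy for speed
index.add(E)

e_hat = np.random.randn(32, embed_dim).astype("float32")  # predicted output vectors
_, word_ids = index.search(e_hat, 1)  # nearest embedding id for each prediction
```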
Controllable language generation (brief!)

Lots of reasons and applications for controlled NLG

* Personalization
* Style transfer ("controllable paraphrasing")
* Anonymization

The approach that has been very successful has been GANs.
But GANs for text don't work. Why does she think that?

* The discriminators are not the right type for language.
* The softmax layer makes end-to-end GANs non-differentiable.

Their work (skipped during the presentation): training end-to-end language GANs via continuous language generation.
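A toy illustration (mine, not their implementation) of why continuous outputs restore differentiability: the generator's output vectors feed straight into the discriminator, with no discrete sampling step in between.

```python
import torch
import torch.nn as nn

noise_dim, embed_dim = 64, 300  # illustrative sizes

G = nn.Linear(noise_dim, embed_dim)                       # toy generator: noise -> word embedding
D = nn.Sequential(nn.Linear(embed_dim, 1), nn.Sigmoid())  # toy discriminator

z = torch.randn(32, noise_dim)
fake = G(z)                               # continuous vectors: no argmax, no sampling
loss = -torch.log(D(fake) + 1e-8).mean()
loss.backward()                           # gradients flow all the way back into G

# With a softmax head, choosing a discrete word (argmax or sampling)
# would break the gradient path between D and G.
```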
End of talk
Conclusions: see pic :)
Paper discussed in the talk:

"Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs", ICLR 2019.
arxiv.org/pdf/1812.04616…
👆👆👆👆
Summary of Yulia Tsvetkov's talk #4 #Rep4nlp #ACL2019nlp

Thanks for the great talk!!
This is a topic that interests me personally. I keep thinking every day that we need new ways of doing NLG that allow control, rather than the legacy ways of mapping inputs to outputs.