(((ل()(ل() 'yoav)))) @yoavgo
aaargh the "When recurrent models don't need to be recurrent" paper is so frustrating!

On the one hand it presents important technical results.

On the other, so many people interpret it as "yo let's replace all RNNs with FF nets". This is wrong. This is NOT the result.
The paper shows that *stable* RNNs (with "stable" being a precise technical definition) can be approximated very well by feed-forward nets that see only a k-word history. In other words, stable RNNs are Markovian.
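
To make the claim concrete, here is a minimal numpy sketch (my own illustration, not code from the paper or the thread): with a contractive recurrent matrix (spectral norm below 1, which is roughly the stability condition the paper formalizes), the hidden state after a long sequence is almost entirely determined by the last k inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32                               # hidden/input dimension (arbitrary)

# Make the recurrent matrix contractive: largest singular value < 1.
W = rng.standard_normal((d, d))
W *= 0.9 / np.linalg.norm(W, 2)
U = rng.standard_normal((d, d)) * 0.1

def run_rnn(xs, h0):
    """Simple tanh RNN: h_t = tanh(W h_{t-1} + U x_t)."""
    h = h0
    for x in xs:
        h = np.tanh(W @ h + U @ x)
    return h

T, k = 500, 30
xs = rng.standard_normal((T, d))

h_full  = run_rnn(xs,      np.zeros(d))   # reads the entire history
h_trunc = run_rnn(xs[-k:], np.zeros(d))   # reads only the last k inputs

# Small: inputs more than k steps back barely affect the final state.
print(np.linalg.norm(h_full - h_trunc))
```

The gap shrinks geometrically in k, which is the sense in which a stable RNN is Markovian: a feed-forward net that reads only the last k tokens can mimic it.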
What are the implications of this? Well, besides the cool stability-in-RNNs = Markovian connection, which is important if you care about this sort of thing, all it tells you is that *some* sequential tasks/datasets can be approximated well with only a k-item history. That's it.
If you think about it some more, it is almost a tautology: "if the phenomenon does not require more than k words of history to capture, then you can learn it with a k-word history". Well.
They also show that they can train a stable RNN that performs well on Language Modeling (well, sort of well, at least). What does this tell us? Well, that current LMs are mostly based on limited history.
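
(A minimal sketch of one way "stable" could be enforced during training; this projection step is my illustration, not necessarily the paper's exact procedure.)

```python
import numpy as np

def project_to_stable(W, max_norm=0.99):
    """Rescale the recurrent matrix so its largest singular value stays below 1."""
    s = np.linalg.norm(W, 2)          # largest singular value
    return W if s <= max_norm else W * (max_norm / s)

# Schematic use inside a training loop:
#   W -= lr * grad_W
#   W = project_to_stable(W)
```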
But we do know that some information in language requires more than a k-word history to model. And we also have evidence that it is captured by some RNNs. So maybe what we learn from this paper is just that perplexity is not such a great measure for LMs?
Also, many people seem to hold both of the following beliefs at the same time:

- ha cool we can do language models with feed-forward nets instead of RNNs!
- if we do LM well we will model all of language and achieve AGI!

It doesn't work this way. These are conflicting.
you want to replace unlimited history with size-k history because you think it is a good enough approximation? by all means, please do. but don't expect to also magically capture the long-range stuff. it doesn't work this way. it cannot work this way.