(((ل()(ل() 'yoav)))) @yoavgo
aaargh the "When recurrent models don't need to be recurrent" paper is so frustrating!

On the one hand it presents important technical results.

On the other, so many people interpret it as "yo let's replace all RNNs with FF nets". This is wrong. This is NOT the result.
The paper shows that *stable* RNNs (with "stable" being a precise technical definition) can be approximated very well by feed-forward nets that see only a k-word history. In other words, stable RNNs are Markovian.
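
To make the claim concrete, here is a minimal numpy sketch (my own illustration, not code from the paper or the thread): with a contractive recurrent matrix (spectral norm below 1, which is roughly the stability condition the paper formalizes), the hidden state after a long sequence is almost entirely determined by the last k inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32                               # hidden/input dimension (arbitrary)

# Make the recurrent matrix contractive: largest singular value < 1.
W = rng.standard_normal((d, d))
W *= 0.9 / np.linalg.norm(W, 2)
U = rng.standard_normal((d, d)) * 0.1

def run_rnn(xs, h0):
    """Simple tanh RNN: h_t = tanh(W h_{t-1} + U x_t)."""
    h = h0
    for x in xs:
        h = np.tanh(W @ h + U @ x)
    return h

T, k = 500, 30
xs = rng.standard_normal((T, d))

h_full  = run_rnn(xs,      np.zeros(d))   # reads the entire history
h_trunc = run_rnn(xs[-k:], np.zeros(d))   # reads only the last k inputs

# Small: inputs more than k steps back barely affect the final state.
print(np.linalg.norm(h_full - h_trunc))
```

The gap shrinks geometrically in k, which is the sense in which a stable RNN is Markovian: a feed-forward net that reads only the last k tokens can mimic it.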
What are the implications of this? Well, besides the cool stability-in-RNNs = Markovian connection, which is important if you care about this sort of thing, all it tells you is that *some* sequential tasks/datasets can be approximated well with only a k-item history. That's it.
If you think about it some more, it is almost a tautology: "if the phenomenon does not require more than k words of history to capture, then you can learn it with a k-word history". Well.
They also show that they can train a stable RNN that performs well on Language Modeling (well, sort of well, at least). What does this tell us? Well, that current LMs are mostly based on limited history.
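
(A minimal sketch of one way "stable" could be enforced during training; this projection step is my illustration, not necessarily the paper's exact procedure.)

```python
import numpy as np

def project_to_stable(W, max_norm=0.99):
    """Rescale the recurrent matrix so its largest singular value stays below 1."""
    s = np.linalg.norm(W, 2)          # largest singular value
    return W if s <= max_norm else W * (max_norm / s)

# Schematic use inside a training loop:
#   W -= lr * grad_W
#   W = project_to_stable(W)
```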
But we do know that some information in language requires more than a k-word history to model. And we also have evidence that it is captured by some RNNs. So maybe what we learn from this paper is just that perplexity is not such a great measure for LMs?
Also, many people seem to hold both of the following beliefs at the same time:

- ha cool we can do language models with feed-forward nets instead of RNNs!
- if we do LM well we will model all of language and achieve AGI!

It doesn't work this way. These are conflicting.
you want to replace unlimited history with size-k history because you think it is a good enough approximation? by all means, please do. but don't expect to also magically capture the long-range stuff. it doesn't work this way. it cannot work this way.