Neuroevolution of Self-Interpretable Agents

Agents with a self-attention “bottleneck” not only can solve these tasks from pixel inputs with only 4000 parameters, but they are also better at generalization!

article attentionagent.github.io
pdf arxiv.org/abs/2003.08165

Read on 👇🏼
Work by @yujin_tang @nt_duong me @googlejapan

The agent receives the full input, but we force it to see its world through the lens of a self-attention bottleneck which picks only 10 patches from the input (middle)

The controller's decision is based only on these patches (right)
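
To make the bottleneck concrete, here is a minimal NumPy sketch of the selection step: slice the frame into patches, let the patches vote for each other through self-attention, and keep only the top K. Everything beyond "pick 10 patches" is an assumption made for illustration: the 96x96 frame, the 7x7 patches with stride 4, the projection size d = 4, the random weights, and the helper names `extract_patches` and `top_k_patches`. It is not the authors' code; see the article and paper for the actual method.

```python
import numpy as np

def extract_patches(img, patch=7, stride=4):
    """Slice an (H, W, C) image into flattened patches and return their centers."""
    h, w, _ = img.shape
    rows = range(0, h - patch + 1, stride)
    cols = range(0, w - patch + 1, stride)
    patches = [img[r:r + patch, c:c + patch].ravel() for r in rows for c in cols]
    centers = [(r + patch // 2, c + patch // 2) for r in rows for c in cols]
    return np.stack(patches), np.array(centers)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def top_k_patches(img, w_q, w_k, k=10):
    """Return the centers and indices of the k patches with the most attention."""
    patches, centers = extract_patches(img)
    q = patches @ w_q                       # queries, shape (N, d)
    key = patches @ w_k                     # keys, shape (N, d)
    attn = softmax(q @ key.T / np.sqrt(w_q.shape[1]), axis=-1)  # (N, N)
    votes = attn.sum(axis=0)                # attention each patch receives
    idx = np.argsort(votes)[::-1][:k]       # indices of the k largest votes
    return centers[idx], idx

# Hypothetical setup: a random frame and random projection weights.
rng = np.random.default_rng(0)
frame = rng.random((96, 96, 3))
d = 4
w_q = rng.normal(size=(7 * 7 * 3, d))
w_k = rng.normal(size=(7 * 7 * 3, d))
centers, idx = top_k_patches(frame, w_q, w_k, k=10)
```

In this sketch a downstream controller would receive only `centers`, so anything outside the selected patches never reaches the decision. The sort is also not differentiable, which is one reason an evolutionary (gradient-free) search over the weights is a natural fit.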
The agent has better generalization abilities, simply due to its ability to “not see things” that can confuse it.

Trained in the top-left setting only, it can also perform in unseen settings with higher walls, different floor textures, or when confronted with a distracting sign.
People who learn to drive during a sunny day can (sort of) also drive at night, on a rainy day, in a different car, or when bird droppings hit the windshield

Without further training, we also test on brighter/darker scenery, or with artifacts such as side bars or a background blob.
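
A rough sketch of how such test-time perturbations could be applied to a frame before it reaches the agent. The helper names (`adjust_brightness`, `add_side_bars`), the brightness factors, the bar width, and the flat gray frame are all hypothetical choices for illustration, not the settings used in the paper.

```python
import numpy as np

def adjust_brightness(frame, factor):
    """Scale pixel intensities; factor > 1 brightens, factor < 1 darkens."""
    out = frame.astype(np.float32) * factor
    return np.clip(out, 0, 255).astype(np.uint8)

def add_side_bars(frame, width=8, color=(0, 0, 0)):
    """Paint vertical bars of the given width over the left and right edges."""
    out = frame.copy()
    out[:, :width] = color
    out[:, -width:] = color
    return out

# Hypothetical 96x96 RGB frame filled with a uniform gray value.
frame = np.full((96, 96, 3), 100, dtype=np.uint8)
brighter = adjust_brightness(frame, 1.5)
darker = adjust_brightness(frame, 0.5)
barred = add_side_bars(frame)
```

The agent's weights stay frozen; only the observations change, so any drop in score reflects how well the learned attention transfers.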
So is ‘Attention All We Need’ for Generalization?

Of course not!

If we modify the game by adding a fake lane next to the real lane, the agent prefers to look there and drive over instead—something human drivers with logical reasoning won't do, unless they're in another country!
The attention bottleneck doesn't generalize if the background changes dramatically.

Some fun failure cases in the Discussion section:

When we suddenly replace the green background with a YouTube cat video, it stops to look at the cat's fat belly, rather than focus on the road🐈
Distracting the RL agent's attention from driving with “King of Fighters” 🥊🔥
Attention loves noise.

Even if we train our agent from scratch in a noisy background setting, it still attends only to the noise and not to the road.

Surprisingly, it learns to interpret those points as obstacles, and by avoiding them, still manages to wobble through the track!
Perhaps lowering the number of patches (K) will force the agent to focus on the road.

But when we decrease K to 5, it still attends to noise rather than to the road. Not surprisingly, if we increase K to 20, it performs better.

CarRacingNoise-v0 will make a nice benchmark task.