Agents with a self-attention “bottleneck” can not only solve these tasks from pixel inputs with only 4000 parameters, they also generalize better!
article attentionagent.github.io
pdf arxiv.org/abs/2003.08165
Read on 👇🏼
The agent receives the full input, but we force it to see its world through the lens of a self-attention bottleneck that picks only 10 patches from the input (middle)
The controller's decision is based only on these patches (right)
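For the curious, here's a minimal numpy sketch of how such a top-K attention bottleneck could work. The projection sizes, patch count, and random weights below are illustrative assumptions, not the actual setup from the paper (where the attention weights are evolved, not backpropped):

```python
import numpy as np

def topk_patch_indices(patches, Wk, Wq, k=10):
    """Score flattened image patches with single-head self-attention
    and return the indices of the k most-attended patches."""
    K = patches @ Wk                          # (n, d_att) keys
    Q = patches @ Wq                          # (n, d_att) queries
    A = Q @ K.T / np.sqrt(K.shape[1])         # (n, n) attention logits
    A = np.exp(A - A.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)         # row-wise softmax
    votes = A.sum(axis=0)                     # total attention each patch receives
    return np.argsort(votes)[-k:][::-1]       # top-k patch indices, best first

# Illustrative numbers: 256 patches, each a flattened 7x7 RGB crop (147 dims)
rng = np.random.default_rng(0)
patches = rng.normal(size=(256, 147))
Wk = rng.normal(size=(147, 4))
Wq = rng.normal(size=(147, 4))
idx = topk_patch_indices(patches, Wk, Wq, k=10)
print(idx)  # positions of the 10 most-attended patches
```

The controller then sees only the (x, y) locations of those K patches, which is what makes the bottleneck so tiny in parameter count.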
Trained only in the top-left setting, it also performs in unseen settings with higher walls, different floor textures, or a distracting sign.
Without further training, we also test it on brighter/darker scenery, or with artifacts such as side bars or a background blob.
Of course not!
If we modify the game by adding a fake lane next to the real one, the agent prefers to attend to the fake lane and drives over to it instead, something human drivers with logical reasoning won't do (unless they're in another country!)
Some fun failure cases in the Discussion section:
When we suddenly replace the green background with a YouTube cat video, it stops to look at the cat's fat belly rather than focusing on the road 🐈
Even if we train our agent from scratch in a noisy background setting, it still attends only to the noise and not to the road.
Surprisingly, it learns to interpret those noise points as obstacles, and by avoiding them still manages to wobble through the track!
But when we decrease K to 5, it still attends to the noise rather than the road. Not surprisingly, increasing K to 20 improves performance.
CarRacingNoise-v0 will make a nice benchmark task.