Excited to share some new work on arXiv today: "Fast Task Inference with Variational Intrinsic Successor Features". Done with @wwdabney, Andre Barreto, Tom Van de Wiele, @dwf, and @VladMnih.

TL;DR: Unsupervised pre-training for efficient RL
arxiv.org/abs/1906.05030
Imagine that unsupervised interaction is free/cheap, but evaluating the reward function isn't. We formalize this as a two-phase training regime:

1) unlimited unsupervised interaction

2) few-shot rewarded interactions (standard RL setup)

We apply this regime to all 57 Atari games.
The successor features (SF) framework (arxiv.org/abs/1606.05312) decouples state and reward dynamics. This allows you to infer the solution to a new task by solving a linear regression problem mapping features to rewards.
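
To make the decoupling concrete, here is a minimal NumPy sketch on a toy 3-state chain. Everything in it (the transition matrix `P`, the feature matrix `Phi`, the helper `successor_features`) is a made-up illustration, not anything from the paper. It assumes rewards are exactly linear in the features, r = Phi w, and checks that values computed from the successor features match values computed directly from the rewards.

```python
import numpy as np

def successor_features(P, Phi, gamma):
    """Successor features for a fixed policy with state-transition matrix P.

    Solves Psi = Phi + gamma * P @ Psi, i.e. Psi = (I - gamma P)^{-1} Phi.
    This step depends only on the dynamics and the features, not on any reward.
    """
    n_states = P.shape[0]
    return np.linalg.solve(np.eye(n_states) - gamma * P, Phi)

# Hypothetical toy chain: 3 states, 2 features per state.
P = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.9, 0.1],
              [0.1, 0.0, 0.9]])
Phi = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])
gamma = 0.9

Psi = successor_features(P, Phi, gamma)   # computed once, reward-free

# A "new task" is just a weight vector w (assumed: reward = Phi @ w).
w = np.array([0.5, -1.0])
r = Phi @ w

V_sf = Psi @ w                                             # value via SF
V_direct = np.linalg.solve(np.eye(3) - gamma * P, r)       # value via Bellman
```

Swapping in a different `w` reuses the same `Psi`; only the final dot product changes.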

But where do the features come from?
Learning options with predictable behavior (à la VIC, arxiv.org/abs/1611.07507, and DIAYN, arxiv.org/abs/1802.06070) is an unsupervised objective that can be seen as implicitly learning controllable features.

Plug these features into SF and you're good to go!
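
One way to see the connection, as a sketch rather than the paper's exact derivation: VIC/DIAYN-style methods maximize a variational lower bound on the mutual information between an option z and the states it reaches, using a learned discriminator q. The linear form on the last line is an assumption added here for illustration (it requires the discriminator's normalizer not to depend on s); the symbols q, p(z), and phi are notation chosen for this sketch.

```latex
% Variational lower bound on mutual information between option z and state s
I(z; s) = H(z) - H(z \mid s)
        \ge H(z) + \mathbb{E}_{z \sim p(z),\, s \sim \pi_z}\!\left[ \log q(z \mid s) \right]

% Intrinsic reward used to train the option-conditioned policy \pi_z
r_z(s) = \log q(z \mid s) - \log p(z)

% Assumption: if the discriminator is log-linear in some features \phi(s),
% the intrinsic reward takes the linear form that successor features need
\log q(z \mid s) = \phi(s)^\top z + \text{const}
\quad\Longrightarrow\quad
r_z(s) = \phi(s)^\top z + \text{const}
```

Under that assumption the option vector z plays the same role as the task weights in the SF framework.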
After unsupervised learning of controllable successor features, we can now learn about a task very efficiently by solving a 5-parameter(!) linear regression problem instead of a non-linear RL problem.

Human-level performance on 14 Atari games.
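
A rough sketch of what the 5-parameter regression could look like, on synthetic data. The helpers `infer_task_vector` and `greedy_action`, the ridge term, the sample count, and the random features are all assumptions for illustration; in practice the features and successor features would come from the pre-trained network.

```python
import numpy as np

def infer_task_vector(features, rewards, ridge=1e-3):
    """Ridge regression from features to rewards.

    Solves (F^T F + ridge * I) w = F^T r for the task vector w.
    """
    d = features.shape[1]
    A = features.T @ features + ridge * np.eye(d)
    return np.linalg.solve(A, features.T @ rewards)

def greedy_action(psi_sa, w):
    """Pick the action maximizing Q(s, a) = psi(s, a) . w.

    psi_sa has shape (num_actions, d): one successor-feature row per action.
    """
    return int(np.argmax(psi_sa @ w))

rng = np.random.default_rng(0)
d = 5                                   # 5-dimensional task vector

# Synthetic stand-in for the rewarded phase: features observed with rewards.
true_w = rng.normal(size=d)
true_w /= np.linalg.norm(true_w)
features = rng.normal(size=(200, d))    # hypothetical phi(s) samples
rewards = features @ true_w + 0.01 * rng.normal(size=200)

w_hat = infer_task_vector(features, rewards)

# Hypothetical successor features for 4 actions in the current state.
psi_sa = rng.normal(size=(4, d))
action = greedy_action(psi_sa, w_hat)
```

The only thing fit in the rewarded phase is `w_hat`; everything upstream of it is frozen from the unsupervised phase.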

Thread by Steven Hansen.