Latest Twitter Threads by @Zergylord on Thread Reader App

Jun 13, 2019 • 5 tweets • 3 min read

Excited to share some new work on arXiv today: "Fast Task Inference with Variational Intrinsic Successor Features". Done with @wwdabney, Andre Barreto, Tom Van de Wiele, @dwf, and @VladMnih

TL;DR: Unsupervised pre-training for efficient RL
arxiv.org/abs/1906.05030

Imagine that unsupervised interaction is free/cheap, but evaluating the reward function isn't. We formalize this as a two-phase training regime:

1) unlimited unsupervised interaction

2) few-shot rewarded interactions (standard RL setup)

We apply this regime to all 57 Atari games
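
To make the two-phase regime above concrete, here is a minimal sketch of the budget accounting only. It is not the method from the paper: the toy chain environment, the random policy, and all names (`ToyEnv`, `run_two_phase`) are hypothetical, chosen just to show that phase 1 never touches the reward function while phase 2 queries it a small, fixed number of times.

```python
import random


class ToyEnv:
    """Hypothetical 1-D chain; the reward is only seen if reward() is called."""

    def __init__(self, size=5, goal=4):
        self.size, self.goal, self.pos = size, goal, 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action is -1 (left) or +1 (right); position is clipped to the chain
        self.pos = max(0, min(self.size - 1, self.pos + action))
        return self.pos

    def reward(self):
        # Assumed to be the expensive call that phase 2 has to ration
        return 1.0 if self.pos == self.goal else 0.0


def run_two_phase(env, unsupervised_steps, rewarded_steps, seed=0):
    rng = random.Random(seed)

    # Phase 1: reward-free interaction (cheap, so the budget can be large)
    visits = [0] * env.size
    env.reset()
    for _ in range(unsupervised_steps):
        state = env.step(rng.choice((-1, 1)))
        visits[state] += 1

    # Phase 2: a small number of interactions where the reward is observed
    reward_estimates = {}
    reward_queries = 0
    env.reset()
    for _ in range(rewarded_steps):
        state = env.step(rng.choice((-1, 1)))
        reward_estimates[state] = env.reward()
        reward_queries += 1

    return visits, reward_estimates, reward_queries


visits, reward_estimates, reward_queries = run_two_phase(
    ToyEnv(), unsupervised_steps=1000, rewarded_steps=20
)
print(visits, reward_estimates, reward_queries)
```

In a real agent, phase 1 would learn something reusable (representations or skills) instead of counting visits, and phase 2 would use the few rewards to pick or adapt a policy; the sketch only fixes the interface between the two phases.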
