Senior Research Scientist at DeepMind. Intrinsically motivated to research intrinsic motivation. All opinions my own.
Jun 13, 2019 • 5 tweets • 3 min read
Excited to share some new work on arXiv today: "Fast Task Inference with Variational Intrinsic Successor Features". Done with @wwdabney, Andre Barreto, Tom Van de Wiele, @dwf, and @VladMnih.
TL;DR: Unsupervised pre-training for efficient RL. arxiv.org/abs/1906.05030
Imagine that unsupervised interaction with the environment is free or cheap, but evaluating the reward function isn't. We formalize this as a two-phase training regime: