What should a meta-learner optimize? What if we make it chase its own future outputs?

Turns out, it can improve meta-optimization, set new SOTAs, and lead to new types of meta-learning. arxiv.org/pdf/2109.04504…

w. Y. Schroecker, @tomzhavy, @hado, D. Silver, S. Singh.
🧵👇
The meta-learner's goal is to match a target. That gives us a lot of flexibility: the matching function can ease optimization and we can get open-ended meta-learning by generating targets from the meta-learner itself: the better the meta-learner gets, the better the targets 🚀
Of course, not all targets are good. But if the target is in a direction of performance improvement, target matching improves performance. In fact, by picking good targets, we can get larger performance improvements than if we directly meta-optimize the objective! 💥
By bootstrapping, we can have targets that are further along the learning curve without having to backpropagate through the updates. On Atari, we extend the effective meta-learning horizon by 5x, pay a 25% increase in wall-clock time, and get 2x performance.
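A minimal JAX sketch of the bootstrapping idea, under toy assumptions: the learner is plain SGD on a quadratic loss, the meta-parameter is the log learning rate, and the matching function is squared Euclidean distance. All names here (`loss`, `sgd_step`, `unroll`, `meta_loss`) are illustrative and not taken from the paper's code. The target comes from running L extra steps from the K-step parameters and is held fixed with `stop_gradient`, so backpropagation only covers the first K steps.

```python
import jax
import jax.numpy as jnp

def loss(x):
    # Toy learner objective (assumption for illustration): a quadratic bowl.
    return 0.5 * jnp.sum(x ** 2)

def sgd_step(x, log_lr):
    # One learner update; the meta-parameter is the log learning rate.
    return x - jnp.exp(log_lr) * jax.grad(loss)(x)

def unroll(x, log_lr, steps):
    # Apply the learner update `steps` times.
    for _ in range(steps):
        x = sgd_step(x, log_lr)
    return x

def meta_loss(log_lr, x0, K=1, L=5):
    # K learner steps that the meta-gradient flows through.
    x_K = unroll(x0, log_lr, K)
    # Bootstrap: L further steps from x_K, then freeze the result as the target.
    target = jax.lax.stop_gradient(unroll(x_K, log_lr, L))
    # Matching function: squared Euclidean distance (one possible choice).
    return jnp.sum((x_K - target) ** 2)

x0 = jnp.array([1.0, -2.0])
log_lr = jnp.log(0.01)

# Meta-gradient w.r.t. the log learning rate, then one meta-update step.
meta_grad = jax.grad(meta_loss)(log_lr, x0)
new_log_lr = log_lr - 0.1 * meta_grad
```

With a learning rate this small, the target sits further down the loss curve than the K-step parameters, so the meta-gradient is negative and the meta-update raises the learning rate.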
The flexibility of target matching opens up new forms of meta-learning, such as meta-learning parameters that aren’t part of the learner’s update. We tried it out for size by meta-learning epsilon in a Q-learning agent to great effect.
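A rough JAX sketch of how matching in policy space can give a gradient for epsilon, even though epsilon never appears in the Q-update. This is not the paper's exact setup; the assumptions are loud ones: a single state with three actions, a KL matching function, a target policy that is epsilon-greedy with a small fixed `target_eps` over bootstrapped Q-values, and hypothetical names throughout (`eps_greedy_probs`, `matching_loss`, `q_now`, `q_agree`, `q_differ`).

```python
import jax
import jax.numpy as jnp

def eps_greedy_probs(q, eps):
    # Action probabilities of an epsilon-greedy policy over Q-values q.
    n = q.shape[0]
    greedy = jax.nn.one_hot(jnp.argmax(q), n)
    return (1.0 - eps) * greedy + eps / n

def matching_loss(eps_logit, q_now, q_target, target_eps=0.05):
    # Epsilon is parameterized through a sigmoid so it stays in (0, 1).
    eps = jax.nn.sigmoid(eps_logit)
    pi = eps_greedy_probs(q_now, eps)
    # Target policy from bootstrapped Q-values (assumed form), held fixed.
    pi_target = jax.lax.stop_gradient(eps_greedy_probs(q_target, target_eps))
    # KL(target || current policy) as the matching function.
    return jnp.sum(pi_target * (jnp.log(pi_target) - jnp.log(pi)))

q_now = jnp.array([1.0, 0.5, 0.2])      # current Q-values
q_agree = jnp.array([1.5, 0.4, 0.1])    # target with the same greedy action
q_differ = jnp.array([0.3, 1.2, 0.1])   # target with a different greedy action
eps_logit = jnp.array(0.0)              # epsilon = 0.5

g_agree = jax.grad(matching_loss)(eps_logit, q_now, q_agree)
g_differ = jax.grad(matching_loss)(eps_logit, q_now, q_differ)
```

In this toy, when the target agrees with the current greedy action the gradient is positive, so gradient descent shrinks epsilon; when the target prefers another action the gradient is negative, so epsilon grows.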
Yet, we have only scratched the surface! Many questions remain open: how to pick targets, how to choose matching functions, and what new forms of meta-learning this affords. 🤔
