Can memory-based meta-learning not only learn adaptive strategies 💭 but also hard-code innate behavior🦎? In our #AAAI2022 paper @sprekeler & I investigate how lifetime, task complexity & uncertainty shape meta-learned amortized Bayesian inference.
📝: arxiv.org/abs/2010.04466
We analytically derive the optimal amount of exploration for a bandit 🎰 in which task complexity & uncertainty can be explicitly controlled. Not learning is optimal in 2 cases:
1⃣ Optimal behavior across tasks is a priori predictable.
2⃣ There is on average not enough time to integrate information ⌛️ (toy sketch below 👇)
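Rough intuition in code — a Monte-Carlo sketch of an explore-then-commit policy on a bandit with one known "safe" arm and one "risky" arm. This is a simplification of the analytical setting in the paper, and T, r_safe, prior_mean, prior_std & noise_std are purely illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_reward(n_explore, T=100, r_safe=0.0, prior_mean=-0.3,
                    prior_std=1.0, noise_std=2.0, n_mc=20000):
    """Monte-Carlo value of an explore-then-commit policy on a 2-armed bandit
    with a known 'safe' arm (reward r_safe) and a 'risky' arm whose mean is
    drawn from N(prior_mean, prior_std^2) with observation noise noise_std."""
    mu = rng.normal(prior_mean, prior_std, size=n_mc)           # one task per sample
    if n_explore == 0:
        # never learn: commit a priori to whichever arm the prior says is better
        return T * max(r_safe, prior_mean)
    # sample the risky arm n_explore times, then commit to the better-looking arm
    mu_hat = mu + rng.normal(0.0, noise_std / np.sqrt(n_explore), size=n_mc)
    commit = np.where(mu_hat > r_safe, mu, r_safe)
    return (n_explore * mu + (T - n_explore) * commit).mean()

T = 100
values = [expected_reward(n, T=T) for n in range(T + 1)]
print("optimal exploration steps:", int(np.argmax(values)))
# shrink T or prior_std (or move prior_mean far from r_safe) and the optimum
# drops to 0 -> the "learning not to learn" regime
```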
🧑‍🔬 Next, we compare the analytical solution to the amortized Bayesian inference meta-learned by LSTM-based RL² agents 🤖
We find that memory-based meta-learning is indeed capable of learning to learn and of learning not to learn (💭/🦎), depending on the meta-train distribution.
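For context, a minimal sketch of what an RL²-style LSTM bandit agent can look like (hidden size, lifetime & the omitted meta-training loop are placeholders, not the implementation used in the paper). The policy conditions on the previous action & reward, so its recurrent state can either implement amortized belief updates or ignore the feedback and express an innate preference:

```python
import torch
import torch.nn as nn

class RL2Agent(nn.Module):
    """Sketch of an RL^2-style recurrent bandit policy: inputs are the previous
    action (one-hot) and previous reward; the LSTM state is carried across the
    whole lifetime, so it can implement amortized belief updates -- or ignore
    the feedback entirely and hard-code a fixed arm preference."""

    def __init__(self, n_arms: int = 2, hidden: int = 32):
        super().__init__()
        self.n_arms, self.hidden = n_arms, hidden
        self.core = nn.LSTMCell(n_arms + 1, hidden)   # prev action (one-hot) + prev reward
        self.policy = nn.Linear(hidden, n_arms)       # logits over arms

    def forward(self, prev_action, prev_reward, state):
        x = torch.cat([prev_action, prev_reward.unsqueeze(-1)], dim=-1)
        h, c = self.core(x, state)
        return torch.distributions.Categorical(logits=self.policy(h)), (h, c)

# one lifetime on a freshly sampled task (meta-training loop omitted)
agent = RL2Agent()
state = (torch.zeros(1, agent.hidden), torch.zeros(1, agent.hidden))
prev_a, prev_r = torch.zeros(1, agent.n_arms), torch.zeros(1)
arm_means = torch.randn(agent.n_arms)          # task ~ meta-train distribution
for t in range(100):                            # lifetime T (illustrative)
    dist, state = agent(prev_a, prev_r, state)
    a = dist.sample()
    r = arm_means[a] + torch.randn(1)
    prev_a = nn.functional.one_hot(a, agent.n_arms).float()
    prev_r = r
```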
Where do inaccuracies at the edge between learning and not learning come from? 🔺 Close to the edge there exist multiple local optima corresponding to vastly different behaviors.
👉Highlighting the challenge of optimising meta-policies close to discontinuous behavioral transitions
Finally, we show that meta-learners overfit to their training lifetime ⏲️ Agents trained on short horizons may not generalise to longer ones & vice versa ❓ This raises questions towards adaptive multi-timescale meta-policies & time-universal Meta-RL 🔎
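A hedged sketch of how one might probe this lifetime overfitting, reusing the hypothetical RL2Agent from the sketch above: roll the (meta-trained) agent out for horizons shorter & longer than its training lifetime and compare per-step reward (n_tasks and the probed horizons are illustrative, not the paper's protocol):

```python
import torch
import torch.nn as nn

@torch.no_grad()
def reward_per_step(agent, T, n_tasks=200):
    """Average per-step reward of the RL2Agent sketch above when rolled out
    for a lifetime T that may differ from the lifetime it was trained on."""
    total = 0.0
    for _ in range(n_tasks):
        arm_means = torch.randn(agent.n_arms)
        state = (torch.zeros(1, agent.hidden), torch.zeros(1, agent.hidden))
        prev_a, prev_r = torch.zeros(1, agent.n_arms), torch.zeros(1)
        for _ in range(T):
            dist, state = agent(prev_a, prev_r, state)
            a = dist.sample()
            r = arm_means[a] + torch.randn(1)
            total += r.item()
            prev_a = nn.functional.one_hot(a, agent.n_arms).float()
            prev_r = r
    return total / (n_tasks * T)

# e.g. meta-train with T=100, then probe shorter & longer horizons
for T in (25, 100, 400):
    print(T, round(reward_per_step(agent, T), 3))
```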