🚀 I am very excited to share gymnax 🏋️ — a JAX-based library of RL environments featuring >20 classic environments 🌎, all easily parallelizable and runnable on CPU/GPU/TPU.
📜[colab]: colab.research.google.com/github/RobertT…
gymnax inherits the classic gym API design 🧑‍🎨 and allows explicit functional control over the environment settings 🌲 and randomness 🎲
reset and step operations can leverage JAX transformations such as jit-compilation (`jit`), auto-vectorization (`vmap`) and device parallelism (`pmap`) 🤖
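The functional pattern described above can be sketched with a toy step function — note this is a minimal illustration of jit + vmap over explicit state and PRNG keys, not gymnax's actual environment code or API; the dynamics below are made up:

```python
import jax
import jax.numpy as jnp

def step(key, state, action):
    # Toy dynamics (hypothetical, not a gymnax environment):
    # the state drifts by the action plus noise; reward penalizes distance from 0.
    noise = jax.random.normal(key)
    new_state = state + action + 0.1 * noise
    reward = -jnp.abs(new_state)
    return new_state, reward

# Because step is a pure function of (key, state, action), it can be
# jit-compiled and auto-vectorized across a batch of 8 parallel environments.
batched_step = jax.jit(jax.vmap(step))

keys = jax.random.split(jax.random.PRNGKey(0), 8)
states = jnp.zeros(8)
actions = jnp.ones(8)
new_states, rewards = batched_step(keys, states, actions)
print(new_states.shape, rewards.shape)  # one entry per parallel environment
```

The key design point is that all randomness enters through explicit PRNG keys and all state is passed in and returned, which is what makes the step function safe to transform with `jit`/`vmap`.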
Dec 16, 2021 • 5 tweets • 3 min read
Can memory-based meta-learning not only learn adaptive strategies 💭 but also hard-code innate behavior 🦎? In our #AAAI2022 paper @sprekeler & I investigate how lifetime, task complexity & uncertainty shape meta-learned amortized Bayesian inference.
📝: arxiv.org/abs/2010.04466
We analytically derive the optimal amount of exploration for a bandit 🎰 which explicitly controls task complexity & uncertainty. Not learning is optimal in 2 cases:
1⃣ Optimal behavior across tasks is a priori predictable.
2⃣ There is on average not enough time to integrate information ⌛️
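Point 2⃣ can be illustrated numerically — this is a toy explore-then-commit simulation I made up for illustration, not the paper's analytical derivation, and all parameters (arm means, horizons, exploration budget) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def lifetime_return(horizon, n_explore, means, n_runs=2000):
    """Average lifetime return of explore-first-then-commit on a
    2-armed Gaussian bandit with unit-variance rewards (toy setup)."""
    total = 0.0
    for _ in range(n_runs):
        # Explore: pull the arms alternately for n_explore rounds.
        pulls = [[], []]
        for t in range(n_explore):
            arm = t % 2
            pulls[arm].append(rng.normal(means[arm], 1.0))
        # Commit: exploit the empirically better arm
        # (or default to arm 0 if there was no exploration).
        if all(pulls):
            best = int(np.mean(pulls[1]) > np.mean(pulls[0]))
        else:
            best = 0
        exploit = rng.normal(means[best], 1.0, size=horizon - n_explore).sum()
        total += sum(map(sum, pulls)) + exploit
    return total / n_runs

means = (0.0, 0.2)  # small gap between arms, so exploration pays off slowly
for horizon in (6, 200):  # short vs long lifetime
    print(horizon,
          lifetime_return(horizon, 0, means),   # never explore
          lifetime_return(horizon, 4, means))   # explore 4 rounds, then commit
```

With a short lifetime, the rounds spent exploring can cost more than the information they buy, so committing to the prior-best arm without learning can come out ahead — the intuition behind case 2⃣.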