Robert Lange · Dec 16, 2021
Can memory-based meta-learning not only learn adaptive strategies 💭 but also hard-code innate behavior 🦎? In our #AAAI2022 paper @sprekeler & I investigate how lifetime, task complexity & uncertainty shape meta-learned amortized Bayesian inference.

📝: arxiv.org/abs/2010.04466
We analytically derive the optimal amount of exploration for a bandit 🎰 in which we explicitly control task complexity & uncertainty. Not learning is optimal in 2 cases (see the sketch after this list):

1⃣ Optimal behavior across tasks is a priori predictable.
2⃣ There is, on average, not enough lifetime to integrate information ⌛️
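To make the tradeoff concrete, here is a minimal Monte-Carlo sketch (not the paper's closed-form derivation) of explore-then-commit in a 2-armed Gaussian bandit; all parameter names and values are illustrative:

```python
# `gap` controls how predictable the best arm is a priori, `task_sd` the
# across-task uncertainty, `noise_sd` the reward noise, T the lifetime.
import numpy as np

rng = np.random.default_rng(0)

def lifetime_return(gap, task_sd, noise_sd, T, n_explore, n_tasks=20_000):
    """Average lifetime return across tasks from the meta-train distribution."""
    # Task-specific arm means; arm 0 is better a priori by `gap`.
    mu = np.stack([rng.normal(gap, task_sd, n_tasks),
                   rng.normal(0.0, task_sd, n_tasks)], axis=1)
    if n_explore == 0:                    # innate policy: always pull arm 0
        return T * mu[:, 0].mean()
    # Pull each arm n_explore times, then commit to the empirical best.
    est = mu + rng.normal(0.0, noise_sd / np.sqrt(n_explore), mu.shape)
    chosen = mu[np.arange(n_tasks), est.argmax(axis=1)]
    return (n_explore * mu.sum(axis=1) + (T - 2 * n_explore) * chosen).mean()

settings = {
    "short lifetime":    dict(gap=0.5, task_sd=1.0, noise_sd=1.0, T=4),
    "long lifetime":     dict(gap=0.5, task_sd=1.0, noise_sd=1.0, T=100),
    "predictable tasks": dict(gap=1.0, task_sd=0.1, noise_sd=1.0, T=100),
}
for name, kw in settings.items():
    rets = {n: lifetime_return(n_explore=n, **kw) for n in (0, 1, 2)}
    print(f"{name:18s} -> best n_explore = {max(rets, key=rets.get)}")
# Exploring wins only with long lifetimes AND a-priori unpredictable
# tasks; otherwise "not learning" (n_explore = 0) is optimal.
```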
🧑‍🔬 Next, we compared the analytical solution to the amortized Bayesian inference meta-learned by LSTM-based RL² agents 🤖

We find that memory-based meta-learning is indeed capable of learning to learn and not to learn (💭/🦎), depending on the meta-train distribution.
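For readers unfamiliar with RL², here is a minimal sketch of such an agent (PyTorch; architecture details like the hidden size are illustrative, not the paper's exact setup). The key idea: the LSTM consumes its own previous action and reward, so within-lifetime "learning" lives entirely in the hidden state while the weights are meta-trained across tasks.

```python
import torch
import torch.nn as nn

class RL2Agent(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        # Input: observation + one-hot previous action + previous reward.
        self.core = nn.LSTM(obs_dim + n_actions + 1, hidden, batch_first=True)
        self.policy = nn.Linear(hidden, n_actions)

    def forward(self, obs, prev_action_onehot, prev_reward, state=None):
        x = torch.cat([obs, prev_action_onehot, prev_reward], dim=-1)
        h, state = self.core(x, state)       # hidden state = task memory
        return self.policy(h), state

agent = RL2Agent(obs_dim=1, n_actions=2)
logits, state = agent(torch.zeros(1, 1, 1),  # (batch, time, dim)
                      torch.zeros(1, 1, 2),
                      torch.zeros(1, 1, 1))
# The hidden state persists across episodes of the SAME task and is only
# reset between tasks - this is what lets the meta-trained weights encode
# either an adaptive (learning) or an innate (non-learning) strategy.
```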
Where do inaccuracies at the edge between learning and not learning come from? 🔺 Close to the edge there exist multiple local optima corresponding to vastly different behaviors.

👉 Highlighting the challenge of optimising meta-policies close to discontinuous behavioral transitions.
Finally, we show that meta-learners overfit their respective training lifetime ⏲️ Agents may not generalise to longer time horizons if trained on short ones, and vice versa ❓ This raises questions about adaptive multi-timescale meta-policies & time-universal Meta-RL 🔎

More from @RobertTLange

Aug 29
📢 Two weeks since we released The AI Scientist 🧑‍🔬!

We want to take the time to summarize a lot of the discussions we’ve been having with the community, and give some hints about what we are working on! 🫶

We are beyond grateful for all your feedback and the community debate our work has sparked ✨
In public discussions of this paper, we frequently refer to it as the “Will Smith eating spaghetti” moment for AI Science 🍝.

While the generated papers often contain minor errors, we believe these problems, much like Will Smith's originally mis-sized fingernails, will only diminish with newer models, more compute, and better methods.

This is the worst the AI Scientist will ever be! 📈
We consciously decided to open-source all the code to democratize access to all the individual tools 🔨 introduced in the AI Scientist agent pipeline.

Everyone can critically assess its competence and its usefulness for their own projects!

One useful data point we’ve been hearing is that people have often been surprised that AI can generate interesting research ideas for their own fields at all!

Aug 13
🎉 Stoked to share The AI Scientist 🧑‍🔬 - our end-to-end approach for conducting research with LLMs, including ideation, coding, experiment execution, paper write-up & reviewing.

Blog 📰: sakana.ai/ai-scientist/
Paper 📜: arxiv.org/abs/2408.06292
Code 💻: github.com/SakanaAI/AI-Sc…

Work led together with @_chris_lu_, @cong_ml and jointly supervised by @j_foerst, @jeffclune, @hardmaru 🤗
Given a starting code template 📝 we ask an LLM to propose new research directions. It checks the novelty of its idea proposals 💡 against Semantic Scholar and scores their "interestingness" as well as "novelty". One example is a Diffusion idea on "adaptive dual-scale denoising".
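A hypothetical sketch of such a novelty check against the public Semantic Scholar Graph API (the pipeline's actual prompts and criteria are more involved):

```python
import requests

def looks_novel(idea_title: str, max_hits: int = 10) -> bool:
    resp = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={"query": idea_title, "limit": max_hits, "fields": "title,year"},
        timeout=10,
    )
    resp.raise_for_status()
    hits = resp.json().get("data", [])
    # Crude heuristic: an idea is "not novel" if a near-identical title exists.
    return not any(idea_title.lower() in (h.get("title") or "").lower()
                   for h in hits)

print(looks_novel("Adaptive dual-scale denoising for diffusion models"))
```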
The LLM then implements all the required code-level changes 🦾. We leverage the amazing aider tool by @paulgauthier with various LLM backends, including GPT-4o, Sonnet 3.5, DeepSeek Coder and Llama 3.1 405B.

Afterward, the AI Scientist iteratively executes experiments to obtain statistics and plots, feeding failures back to the LLM until the code runs.
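In that spirit, a simplified implement-and-repair loop built on aider's documented Python scripting interface; file names, the model choice and MAX_ITERS are illustrative assumptions, not the released pipeline:

```python
import subprocess
from aider.coders import Coder
from aider.models import Model

coder = Coder.create(main_model=Model("gpt-4o"),
                     fnames=["experiment.py", "plot.py"])
idea = "Adaptive dual-scale denoising"  # hypothetical idea string
coder.run(f"Implement this research idea in experiment.py: {idea}")

MAX_ITERS = 4
for _ in range(MAX_ITERS):
    run = subprocess.run(["python", "experiment.py"],
                         capture_output=True, text=True)
    if run.returncode == 0:
        break  # experiment succeeded; stats & plots are now on disk
    # Feed the stderr back so the LLM can repair its own code.
    coder.run(f"The experiment failed with:\n{run.stderr}\nPlease fix it.")
```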
Jun 9
📺 Exciting talk on the xLSTM architecture and the challenges of questioning the first-mover advantage of the Transformer 🤖 by @HochreiterSepp @scioi_cluster

📜: arxiv.org/abs/2405.04517
💻: github.com/NX-AI/xlstm
🗿 The LSTM architecture has been a foundational pillar of modern Deep Learning, including various breakthrough results in Deep RL (e.g. OpenAI's Dota agents), forecasting (e.g. weather) and the original seq2seq models.

💡 xLSTM tackles several challenges in scaling the original architecture to long sequences (via exponential gating and memory mixing) and distributed training (via associative memories). Furthermore, it combines several advances in training large sequence models.
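For intuition, a toy NumPy sketch of a scalar sLSTM-style update with exponential gating and the log-space stabilizer from the paper; memory mixing across heads and the matrix-memory mLSTM variant are omitted:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def slstm_step(x, h, c, n, m, W, R, b):
    """One scalar cell update; states: hidden h, cell c, normalizer n, stabilizer m."""
    z_t, i_t, f_t, o_t = W @ x + R * h + b   # four gate pre-activations
    m_new = max(f_t + m, i_t)                # log-space stabilizer
    i = np.exp(i_t - m_new)                  # exponential input gate
    f = np.exp(f_t + m - m_new)              # exponential forget gate
    c_new = f * c + i * np.tanh(z_t)         # cell state update
    n_new = f * n + i                        # normalizer tracks gate mass
    h_new = sigmoid(o_t) * c_new / n_new     # normalized hidden output
    return h_new, c_new, n_new, m_new

# Toy usage with a 3-dim input and scalar state:
rng = np.random.default_rng(0)
W, R, b = rng.normal(size=(4, 3)), rng.normal(size=4), rng.normal(size=4)
h = c = n = m = 0.0
for t in range(5):
    h, c, n, m = slstm_step(rng.normal(size=3), h, c, n, m, W, R, b)
```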
📉 The 1.3B parameter results are very impressive and the scaling results appear far from having reached a saturation point. Very much looking forward to the next generation!

Furthermore, it has recently also shown promising results on vision tasks 📸
Jun 25, 2022
🚀 I am very excited to share gymnax 🏋️ — a JAX-based library of RL environments with >20 different classic environments 🌎, which are all easily parallelizable and run on CPU/GPU/TPU.

💻[repo]: github.com/RobertTLange/g…

📜[colab]: colab.research.google.com/github/RobertT…
gymnax inherits the classic gym API design 🧑‍🎨 and allows for explicit functional control over the environment settings 🌲 and randomness 🎲

reset and step operations can leverage JAX transformations such as jit-compilation, auto-vectorization and device parallelism 🤖 (see the sketch below)
It accelerates rollouts & facilitates the distributed Anakin Podracer architecture (@matteohessel et al., 2021) 🏃 Data collection/learning runs directly on accelerators using replication/aggregation across devices.

👇 Speed comparisons for different numbers of workers, hardware setups & policies
