Humans reuse skills effortlessly to learn new tasks - can robots do the same? In our new paper, we show how to pre-train robotic skills and adapt them to new tasks in a kitchen.

tl;dr you’ll have a robot chef soon. 🧑‍🍳🤖

links / details below
thread 🧵 1/10
Title: Hierarchical Few-Shot Imitation with Skill Transition Models
Paper: arxiv.org/abs/2107.08981
Site: sites.google.com/view/few-shot-…
Main idea: fit generative “skill” model on large offline dataset, adapt it to new tasks
Result: show robot a new task, it will imitate it
2/10
We introduce Few-shot Imitation with Skill Transition Models (FIST). FIST first extracts skills from a diverse offline dataset of demonstrations and then adapts them to the new downstream task. FIST has 3 steps: (1) Extraction, (2) Adaptation, (3) Evaluation.
3/10
Step 1: Skill Extraction

Here we fit a generative model that encodes action sequences into latent skills "z" and decodes them back into actions. We also learn an inverse skill dynamics model p(z|s,s’) and a contrastive distance function d(s,s’) to be used later for imitation.

4/10
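For readers who want something concrete, here is a minimal PyTorch-style sketch of the extraction step described above. The architectures, dimensions, and loss weights are illustrative assumptions rather than the paper's configuration, and the contrastive distance d(s,s’) is omitted for brevity.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, SKILL_DIM, HORIZON = 60, 9, 8, 10   # illustrative sizes

class SkillEncoder(nn.Module):
    """q(z | a_{t:t+H}): encodes an H-step action sequence into a latent skill z."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(ACTION_DIM, 128, batch_first=True)
        self.to_mu = nn.Linear(128, SKILL_DIM)
        self.to_logvar = nn.Linear(128, SKILL_DIM)

    def forward(self, actions):                      # actions: (B, H, ACTION_DIM)
        _, h = self.rnn(actions)
        return self.to_mu(h[-1]), self.to_logvar(h[-1])

class SkillDecoder(nn.Module):
    """Low-level policy pi(a | s, z) that decodes a skill back into actions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM + SKILL_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, ACTION_DIM))

    def forward(self, state, z):
        return self.net(torch.cat([state, z], dim=-1))

class InverseSkillModel(nn.Module):
    """Inverse skill dynamics p(z | s, s'), here a deterministic point estimate."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * STATE_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, SKILL_DIM))

    def forward(self, s, s_next):
        return self.net(torch.cat([s, s_next], dim=-1))

def extraction_loss(enc, dec, inv, states, actions):
    """One step on a sub-trajectory: states (B, H+1, STATE_DIM), actions (B, H, ACTION_DIM)."""
    mu, logvar = enc(actions)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()          # reparameterized sample
    z_seq = z.unsqueeze(1).expand(-1, actions.size(1), -1)
    recon_loss = (dec(states[:, :-1], z_seq) - actions).pow(2).mean()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
    z_hat = inv(states[:, 0], states[:, -1])                      # predict z from (s_t, s_{t+H})
    inv_loss = (z_hat - z.detach()).pow(2).mean()
    return recon_loss + 1e-2 * kl + inv_loss
```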
Step 2: Skill Adaptation

Next, for a new downstream task, we quickly fine-tune the skill network and the inverse model to internalize the task. This part is very data-efficient: we require only 1-10 demonstrations.

5/10
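A minimal sketch of the adaptation step, reusing the modules and `extraction_loss` from the step-1 sketch above. The optimizer, learning rate, and number of steps are assumptions, not the paper's settings.

```python
import torch

def adapt(enc, dec, inv, demo_batches, steps=200, lr=1e-4):
    """Fine-tune the pre-trained modules on the 1-10 downstream demos.
    `demo_batches` is a list of (states, actions) tensors from those demos."""
    params = list(enc.parameters()) + list(dec.parameters()) + list(inv.parameters())
    opt = torch.optim.Adam(params, lr=lr)
    for i in range(steps):
        states, actions = demo_batches[i % len(demo_batches)]
        loss = extraction_loss(enc, dec, inv, states, actions)   # same objective as step 1
        opt.zero_grad()
        loss.backward()
        opt.step()
    return enc, dec, inv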
Step 3: Semi-Parametric Evaluation

Finally, we use the inverse model p(z|s,s’) to select the skills that best imitate the downstream demonstration. Since we don’t know s’ in advance, we use the contrastive distance function d(s,s’) to select the closest s’ from the few-shot demos.

6/10
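A minimal sketch of the evaluation loop under the same assumptions as the sketches above. Here `env`, the learned `distance` callable, and the one-horizon look-ahead offset into the demo are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def run_fist(env, dec, inv, distance, demo_states, horizon=10, max_skills=30):
    """Roll out FIST after adaptation. `demo_states` is a (T, STATE_DIM) tensor of
    states from a few-shot demo; `distance` is the learned contrastive d(s, s');
    `env.step` is assumed to return just the next state (simplified interface)."""
    s = torch.as_tensor(env.reset(), dtype=torch.float32)
    for _ in range(max_skills):
        with torch.no_grad():
            # pick the demo state closest to where we are now under d(s, .)
            d = distance(s.expand(len(demo_states), -1), demo_states)
            idx = int(d.argmin())
            # target a demo state one skill-horizon ahead (look-ahead offset is an assumption)
            s_target = demo_states[min(idx + horizon, len(demo_states) - 1)]
            z = inv(s.unsqueeze(0), s_target.unsqueeze(0))        # p(z | s, s')
            for _ in range(horizon):                              # execute the decoded skill
                a = dec(s.unsqueeze(0), z).squeeze(0)
                s = torch.as_tensor(env.step(a), dtype=torch.float32)
    return s
```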
With these three steps, the FIST agent can generalize to new downstream tasks, parts of which have never been seen before. We show on three long-horizon benchmarks (including a robotic kitchen) that FIST can solve these tasks from just 10 demonstrations.

7/10
We also find that FIST is a strong one-shot learner for in-distribution downstream tasks (4 points is the maximum possible score in the table below). With a single demo, FIST chains several subtasks together to imitate the long-horizon demonstration without drifting off-distribution.

8/10
Along with other recent work, this is an exciting step toward pre-training for robotics. Hopefully, we can bring GPT-like capabilities to embodied agents in the future. This was a fun collaboration with @CyrusHakha, @RuihanZhao, and Albert Zhan, who co-led this work, and @pabbeel!

9/10
Also thanks to @KarlPertsch for the amazing work on the SPiRL architecture, which we used as the backbone of our algorithm.

10/10


