Alex
May 17 · 9 tweets · 2 min read
I've been talking with people deploying foundation models/VLAs and reading papers that try to increase real-world success. Real-world RL methods give the highest robust success rates of any approach.
It's pretty simple: the practical bottleneck isn't in pre-training anymore, it's in post-training, mostly with real-world RL.

When you try simulation, you suffer from the sim2real gap. Even with domain randomisation (DR), you'll find yourself tuning the model again on real-world data.
But tuning on real-world data alone is not enough for 100% success rates. Recorded teleoperation data doesn't contain enough out-of-distribution (OOD) rollouts, so the policy fails during deployment: robustness is poor and it doesn't generalise enough. But you can enforce robustness with RL.
So logically we try this cheaply:
SFT on a mixture of sim and real rollouts from expert demos, then RL in simulation, but with real-world rollouts included in every batch gradient update to prevent catastrophic forgetting from sim-only training (RLinf-Co, Sim-and-Real Co).
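A minimal sketch of the "real rollouts in every batch" idea, in plain Python. This is not the RLinf-Co or Sim-and-Real Co implementation: the function names, the `real_fraction` ratio and the additive loss with `real_weight` are all assumptions made for illustration.

```python
import random

def make_cotraining_batch(sim_rollouts, real_demos, batch_size=8,
                          real_fraction=0.25, rng=None):
    """Build one update batch that always contains some real-world samples.

    Hypothetical helper: real_fraction is an assumed knob, not a value
    taken from any of the papers mentioned in the thread.
    """
    rng = rng or random.Random()
    # Guarantee at least one real sample so sim-only drift is always anchored.
    n_real = min(batch_size, max(1, round(batch_size * real_fraction)))
    n_sim = batch_size - n_real
    batch = [("sim", s) for s in rng.choices(sim_rollouts, k=n_sim)]
    batch += [("real", r) for r in rng.choices(real_demos, k=n_real)]
    return batch

def cotraining_loss(rl_loss_sim, sft_loss_real, real_weight=1.0):
    """Assumed combined objective: RL loss on sim data plus a weighted
    supervised (SFT) loss on the real demos in the same batch."""
    return rl_loss_sim + real_weight * sft_loss_real
```

The sim part of the batch would feed the RL objective and the real part a supervised imitation term, so every gradient step still sees real-world data.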
But success rates are nowhere close to 100% because the real world changes during deployment. For robustness, real-world RL is required. The most promising approaches currently are:

1. Residual / adapter / token-based methods that freeze most of the VLA: RL Token (from PI), PLD and iRe-VLA
2. Human-in-the-loop real-world online RL, where humans intervene in, reset or correct rollouts and the policy is updated: RECAP (from pi0.6), ConRFT, VLAC, DAFT, Hi-ORS
3. Digital-twin-guided real-world RL, which implements the real-to-sim-to-real idea by running exploration in the digital-twin sim and online RL training in parallel: TwinRL-VLA
4. The most promising approach seems to be offline-to-online RL with human interventions and corrections across a fleet of robots instead of a single one. It feels like running vectorised Isaac Sim environments, except the environments are real, and it works: LWD (Learning while Deploying)
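To make the fleet idea in item 4 concrete, here is a rough sketch of a shared replay buffer that starts from offline data and collects online transitions from several robots. It is not the LWD algorithm: `Transition`, `FleetReplayBuffer` and the `intervention_boost` weighting are hypothetical names and choices, and upweighting human-corrected samples is only one assumed way to use interventions.

```python
import random
from collections import deque
from dataclasses import dataclass

@dataclass
class Transition:
    robot_id: int            # which robot in the fleet produced this step
    obs: tuple
    action: tuple
    reward: float
    human_intervened: bool   # True if a human corrected this step

class FleetReplayBuffer:
    """Hypothetical offline-to-online buffer shared by a fleet of robots."""

    def __init__(self, offline_data, capacity=10000):
        self.offline = list(offline_data)     # fixed offline dataset
        self.online = deque(maxlen=capacity)  # newest fleet data, oldest evicted

    def add(self, transition):
        self.online.append(transition)

    def sample(self, batch_size, intervention_boost=3.0, rng=None):
        rng = rng or random.Random()
        pool = self.offline + list(self.online)
        # Assumption: human-corrected steps are sampled more often.
        weights = [intervention_boost if t.human_intervened else 1.0
                   for t in pool]
        return rng.choices(pool, weights=weights, k=batch_size)
```

Each robot would call `add` with its own `robot_id` while a single learner samples from the shared pool, which is the real-world analogue of stepping many vectorised sim environments into one buffer.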