Shane Gu

4 Dec, 4 tweets, 3 min read

If you're overwhelmed by the # of papers in *offline* RL, check out our @NeurIPSConf Spotlight with Scott Fujimoto: we show how a few-line change to TD3 (TD3+BC) can be competitive with SoTA algorithms while halving training time. Inspired by #minimalism #zen #konmari arxiv.org/abs/2106.06860

We propose "BC as a regularizer", which adds negligible compute cost to the original TD3 objective but makes it quite performant on offline RL.
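To make "BC as a regularizer" concrete, here is a minimal sketch of the actor loss as I understand it from the paper: maximize λ·Q(s, π(s)) minus the squared error between π(s) and the dataset action, with λ = α / mean|Q| and α = 2.5 as the paper's default. The function name, argument layout, and plain-Python style are assumptions for illustration; a real implementation would use tensors with autograd and treat λ as a constant when differentiating.

```python
from statistics import fmean


def td3_bc_actor_loss(q_values, policy_actions, dataset_actions, alpha=2.5):
    """Scalar TD3+BC-style actor loss for one minibatch (illustrative sketch).

    q_values[i]        : critic estimate Q(s_i, pi(s_i)), a float
    policy_actions[i]  : pi(s_i), a list of floats
    dataset_actions[i] : logged action a_i, a list of floats
    """
    # Normalize the Q term by the mean absolute Q so alpha is scale-free.
    lam = alpha / fmean(abs(q) for q in q_values)

    # RL term: the original TD3 actor objective, rescaled by lambda.
    rl_term = -lam * fmean(q_values)

    # BC term: mean squared error between policy and dataset actions.
    bc_term = fmean(
        fmean((p - a) ** 2 for p, a in zip(pa, da))
        for pa, da in zip(policy_actions, dataset_actions)
    )
    return rl_term + bc_term


if __name__ == "__main__":
    loss = td3_bc_actor_loss(
        q_values=[4.0, 4.0],
        policy_actions=[[0.0, 0.0], [1.0, 1.0]],
        dataset_actions=[[1.0, 0.0], [1.0, 1.0]],
    )
    print(loss)
```

The only addition relative to a plain TD3 actor loss is the `bc_term` and the `lam` rescaling, which is why the change fits in a few lines.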

https://twitter.com/shaneguML/status/1466960547640147971?s=20

For the table, we followed the same separation of "algorithm" and "implementation" changes suggested in our other NeurIPS paper

https://twitter.com/shaneguML/status/1466960547640147971?s=20

Lastly, we appreciate the reviewers' constructive feedback and their suggesting this "one git-pull"-size paper for a Spotlight at @NeurIPSConf! The community could benefit from simple ideas.

More from @shaneguML

Shane Gu

@shaneguML

4 Dec

Excited to co-run the @EcoTheoryRL "data-centric RL" workshop @NeurIPSConf! Schedule: sites.google.com/corp/view/ecor…

INCREDIBLE speakers:

1) @ShaneLegg (Co-founder/Chief Scientist of @DeepMind)
2) Joelle Pineau (McGill, FAIR, MILA)
3) Pierre-Yves Oudeyer @pyoudeyer (INRIA)
4) @katjahofmann (@MSFTResearch @MSFTResearchCam)
5) Daniel Tanis (w/ @DrewPurves) @DeepMind

Shane Gu

@shaneguML

23 Oct

Toy MuJoCo + Box2d envs in OpenAI Gym are moving to #brax! 100x GPU/TPU speedup + purely pythonic + jax/pytorch-enabled, ready to be unleashed! Exciting news for the #brax #braxlines #jax teams. Also check out #composer, where I am adding more demos github.com/openai/gym/iss…

https://twitter.com/shaneguML/status/1438696633437286400?s=20

#braxlines:

https://twitter.com/shaneguML/status/1438696633437286400?s=20

#composer: github.com/google/brax/tr…

arxiv: arxiv.org/abs/2110.04686

#brax still cannot (and probably won't ever) match the full specs of mujoco/pybullet. But esp. with the plans to open-source mujoco, I'm excited to see where there could be synergies.

Good to see a lot of large-scale, algorithmic deep RL researchers are aligned: "I personally believe that hardware accelerator support is more important, hence choosing Brax."
