Ted Xiao
Founding Member of Technical Staff at Project Prometheus. Previously Gemini, Robotics @GoogleDeepMind. Posts about frontier models, physical AGI, and scaling.
Dec 20, 2024
Robotics + AI has been completely transformed: a perfect storm of AI breakthroughs, hardware innovation, and capital inflow. But are general robot foundation models truly just around the corner?

At recent talks, I took an honest look at hype vs. reality to share what’s missing 🧵👇 (1/16)

We need look no further than the field of foundation models to understand what unlocked hyperscaling: scaling laws, high-bandwidth contexts, and scalable evaluations. It’s not obvious what the robotics versions of these are! (2/16)
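As a toy illustration of the first ingredient (not from the thread itself): a scaling law relates loss to model size via a power law, L(N) ≈ a·N^(−b), which becomes a straight line in log-log space. The numbers below are synthetic, purely to show the fit.

```python
import numpy as np

# Synthetic, noise-free "loss vs. model size" data following L(N) = a * N**-b.
# a_true and b_true are made up for illustration; real scaling-law studies
# fit measured losses from training runs.
a_true, b_true = 10.0, 0.3
sizes = np.array([1e6, 1e7, 1e8, 1e9, 1e10])
losses = a_true * sizes ** (-b_true)

# A power law is linear in log-log space: log L = log a - b * log N.
slope, intercept = np.polyfit(np.log(sizes), np.log(losses), 1)
b_fit, a_fit = -slope, np.exp(intercept)
print(f"fitted exponent b = {b_fit:.3f}, prefactor a = {a_fit:.2f}")
```

Because the data here is noise-free, the fit recovers the exponent exactly; real robotics "scaling laws" are exactly what the thread argues we don't yet have a handle on.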
Feb 10, 2023
The optimism in robotics research is absolutely incredible these days! I believe all the pieces we need for a “modern attempt at embodied intelligence” are ready. At recent talks, I pitched a potential recipe, and I’d like to share it with you.

Let’s break down the key points 🔑

2) The first place to start might be to ask: why isn't robotics solved yet? The challenge is that even the most difficult robotics research settings are many orders of magnitude less complex than the noise and chaos of the real world. How can we bridge this gap?
Jan 12, 2023
🚨New RL impact just dropped🚨

1) My friend is a high-level Rocket League player and just alerted me that an open-sourced agent trained with reinforcement learning + self-play (github.com/Rolv-Arild/Nec…) has been steamrolling public servers! It's in the top 0.5% Elo bracket.

2) The agent, called Nexto, uses a distributed self-play system (rlbot.org) to train Soft Actor-Critic agents on top of Perceiver networks. Recently, rogue Rocket League players have been deploying Nexto directly in ranked matches, where it has reached very high Elo ratings.
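Nexto's actual stack (SAC on Perceiver networks via the rlbot tooling) is not reproduced here, but the self-play loop structure can be sketched. Everything below is an illustrative assumption: a rock-paper-scissors-style game stands in for Rocket League, and a crude preference update stands in for the real SAC learner.

```python
import random
from collections import deque

ACTIONS = [0, 1, 2]  # toy action space standing in for car controls

def play_episode(policy, opponent_policy, rng):
    """Both seats are driven by (copies of) the learned policy: self-play."""
    a0 = rng.choices(ACTIONS, weights=policy)[0]
    a1 = rng.choices(ACTIONS, weights=opponent_policy)[0]
    # Cyclic win rule (paper beats rock, etc.): +1 win, -1 loss, 0 tie.
    reward = 1 if (a0 - a1) % 3 == 1 else (-1 if a0 != a1 else 0)
    return (a0, a1, reward)

rng = random.Random(0)
replay = deque(maxlen=10_000)   # replay buffer feeding the learner
policy = [1.0, 1.0, 1.0]        # unnormalized action preferences
frozen = list(policy)           # periodically snapshotted opponent copy

for step in range(2000):
    replay.append(play_episode(policy, frozen, rng))
    # "Learner": nudge preferences toward actions that won. A real
    # system would run SAC updates on batches sampled from the buffer.
    a0, _, r = replay[-1]
    policy[a0] = max(0.05, policy[a0] + 0.01 * r)
    if step % 200 == 0:
        frozen = list(policy)   # refresh the opponent snapshot

probs = [p / sum(policy) for p in policy]
print(probs)
```

The two pieces that make this "distributed" in practice are many actors filling the buffer in parallel and a central learner broadcasting fresh policy snapshots; the frozen-opponent refresh above is the single-process version of that idea.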
Dec 21, 2022
The golden days of internet-scale models achieving unprecedented zero-shot results seem to be waning. The new Big Thing is subsequent fine-tuning with humans increasingly out of the loop. How does this work?

Let’s explore *Prior Amplification* 🔎

(1/N)

Large internet datasets, whether they are digital art or literature or Reddit posts, reflect some innate notion of the human condition. These kernels of truth can be shaped (e.g., NSFW or hate-speech filters) but always stem from some subset of human-produced content. (2/N)
Nov 22, 2022
Robot-language datasets have enabled tremendous progress in robotics🤖. However, semantic concepts may not be fully captured by existing language labels, which are often expensive to collect. In our new paper, we study how to get more mileage out of existing datasets! 🧵👇

One picture (or episode) may be worth a thousand words, but our current datasets provide only a few. Leveraging recent advances in VLMs, we propose generating instruction augmentations with CLIP, effectively importing internet-scale knowledge into offline datasets. (2/7)