Another huge week of AI and robotics news.
So, I summarized everything from OpenAI, Google, Meta, Microsoft, FutureHouse, Mistral, Unitree, Stanford, UC Berkeley, Hugging Face, and more.
Here's everything you need to know and how to make sense of it:
OpenAI ditched its for-profit push, saying it will convert its existing for-profit arm into a public benefit corporation (PBC) but keep its non-profit in control as a large shareholder
This comes after pressure from several ex-employees and an ongoing legal battle
OpenAI also launched a GitHub connector for ChatGPT
The feature will allow users to connect their repos and use ChatGPT's Deep Research to read and search source code and PRs, creating a detailed report with citations
Google updated two key models:
—Gemini 2.5 Pro Preview (I/O Edition), with video understanding and improvements for UI, code, and agentic workflows
—Gemini 2.0 Flash image generation with improved quality, text rendering, and fewer content restrictions
Meta dropped two new models:
—Perception Language Model, an open vision-language model for visual tasks like extracting details of a subject's actions at a given time
—Locate 3D, an object localization AI, aimed at helping robots understand and interact with surroundings
Microsoft updated its Copilot with "Pages," a ChatGPT Canvas-like feature
It allows users to collaborate with Copilot, asking the assistant to tweak, expand, or polish its responses
Notably, it doesn't seem to have coding capabilities like Canvas
Microsoft also announced it's adopting Google's Agent2Agent (A2A) framework, launching it soon on Azure AI Foundry and Copilot Studio
The move will enable enterprises to develop AI agents that interoperate across platforms by design
Ex-Google CEO Eric Schmidt-backed FutureHouse dropped five 'AI Scientist' agents:
—Crow for general research
—Falcon for deep literature reviews
—Owl for identifying previous research
—Phoenix for chemistry workflows
—Finch for discovery in biology
Mistral released two big products:
—Medium 3, a multimodal AI that Mistral says matches or surpasses Claude 3.7 Sonnet, GPT-4o, and Llama 4 Maverick at 8x lower cost
—Le Chat Enterprise, an agentic AI assistant for businesses with integrations like Google Drive and tools for building agents
Unitree is teaming up with SF-based Reborn to co-develop advanced AI to make its robots smarter, more adaptable, and capable of complex tasks
It will use multiple Reborn offerings, including its Roboverse simulator, motion datasets, and developer tools
Stanford researchers debuted a Teleoperated Whole-Body Imitation System (TWIST)
It enables coordinated, versatile, whole-body movements of humanoids, using a single neural network
This could pave the way for functional general-purpose robots across different domains
UC Berkeley researchers announced VideoMimic, a real-to-sim-to-real pipeline that trains robots from everyday smartphone videos
It mines videos, reconstructs the humans and the environment, and produces policies for humanoids, enabling skills like climbing stairs
Hugging Face released Open Computer Agent, an open-source AI agent for automating web tasks — similar to OpenAI's Operator
It is free to use via web browsers, but is reported to be slow and capable of handling only basic multi-step tasks
Anthropic released web search capabilities in the API
The feature allows web developers to build applications that can search the web for up-to-date information and provide grounded answers with relevant citations
UC Berkeley researchers also introduced PyRoki, a modular, extensible, and cross-platform toolkit for kinematic optimization
It solves inverse kinematics, trajectory optimization, and motion retargeting for a wide range of robots, including humanoids
We're hiring for hundreds of roles @Figure_robot:
> AI Engineers (many)
> Staff Security Engineer
> HMI Design Lead
> System Integration & Test (many)
> Legal (many)
> Manufacturing (many)
Apply here: figure.ai/careers x.com/adcock_brett/s…
That's it for this week's AI and Robotics breakdown.
I share the latest research every week, so follow me @adcock_brett for more.
If you found this valuable, consider a like/retweet to spread the word.
