Thread by @SergioPaniego on Thread Reader App

ECHO (Environment Cross-entropy Hybrid Objective) demo support just landed in OpenEnv, and it's a cool idea: train agents to learn a world model almost for free

original paper by @VaishShrivas, Piero Kauffman, Ahmed Awadallah and @DimitrisPapail @MSFTResearch

when an agent acts in an env, a rollout has 2 sides: what the agent writes and what the env writes back

standard agent RL trains only on the agent's side

train a CLI agent with GRPO and the reward shapes the action tokens, while the env's responses get masked out of the loss
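to make the masking concrete, here is a minimal sketch of that setup. it is not the paper's code or OpenEnv's API: the function names, the toy numbers and the plain REINFORCE-style surrogate (no clipping, no KL term) are all assumptions for illustration. the only GRPO-specific part is scoring each rollout's reward against the other rollouts in its group

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Score each rollout against its own sampling group (GRPO-style).

    Assumption: population std with a small eps; real implementations vary.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def masked_policy_loss(logprobs, is_action, advantage):
    """Simplified policy-gradient surrogate, averaged over action tokens only.

    Tokens with is_action == 0 (env responses) are masked out of the loss.
    """
    n_actions = sum(is_action)
    if n_actions == 0:
        return 0.0
    total = sum(lp * m for lp, m in zip(logprobs, is_action))
    return -advantage * total / n_actions


# Toy group of 4 rollouts: two succeeded, two failed.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# One rollout: per-token log-probs plus a mask (1 = agent token, 0 = env token).
logprobs = [-0.5, -2.0, -1.0, -0.1]
is_action = [1, 0, 1, 0]
loss = masked_policy_loss(logprobs, is_action, advantage=1.0)

# Changing only the env tokens' log-probs leaves the loss untouched.
loss_other_env = masked_policy_loss([-0.5, -9.0, -1.0, -7.0], is_action, advantage=1.0)
```

in this sketch the two losses come out identical, which is the point: whatever the model thinks about the env's tokens never reaches the gradient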

all that ground-truth about what actually happened gets thrown away

ECHO proposes using that part too instead of discarding it

on top of the usual RL loss on actions, it adds a small cross-entropy loss on the env's tokens, so the model also learns to predict what the env does

L = GRPO(actions) + λ · CE(observations)
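a toy version of that formula, as a sketch under stated assumptions: the GRPO term is replaced by a plain policy-gradient surrogate, each term is averaged over its own tokens, and λ = 0.1 is a made-up value (the thread does not give the paper's setting). `echo_style_loss` is a hypothetical name, not something from the paper or OpenEnv

```python
def echo_style_loss(logprobs, is_action, advantage, lam=0.1):
    """Return (total, policy_term, ce_term) for one rollout.

    logprobs  : log-prob the model assigned to each token that actually occurred
    is_action : 1 for agent (action) tokens, 0 for env (observation) tokens
    advantage : scalar advantage for this rollout
    lam       : weight on the observation cross-entropy (assumed value)
    """
    action_lps = [lp for lp, a in zip(logprobs, is_action) if a]
    obs_lps = [lp for lp, a in zip(logprobs, is_action) if not a]

    # Stand-in for GRPO(actions): simplified surrogate, no clipping or KL.
    policy_term = -advantage * sum(action_lps) / max(len(action_lps), 1)

    # CE(observations): negative log-likelihood of what the env actually wrote.
    ce_term = -sum(obs_lps) / max(len(obs_lps), 1)

    return policy_term + lam * ce_term, policy_term, ce_term


logprobs = [-0.5, -2.0, -1.0, -0.1]
is_action = [1, 0, 1, 0]

total, policy_term, ce_term = echo_style_loss(logprobs, is_action, advantage=1.0)

# A rollout with zero advantage still yields a nonzero loss via the CE term.
total_zero_adv, policy_zero_adv, _ = echo_style_loss(logprobs, is_action, advantage=0.0)
```

both terms read from the same list of per-token log-probs, so the observation term needs no second forward pass. with zero advantage the policy term vanishes but the observation term still contributes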

and this is almost free: those tokens already go through the same forward pass, so their logits are already computed. no extra rollouts and no teacher model

you get a world model as a side effect, even failed rollouts turn into signal, and the reported gains are substantial:
up to 2.3x faster training, and TerminalBench 2.0 pass@1 roughly doubles

to learn more about the idea check out the article by one of the paper's authors (@DimitrisPapail): x.com/DimitrisPapail…

concretely, OpenEnv now lets you tag, per token, what was an action vs an env observation, plus a world-model coefficient
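to picture what per-token tagging means, here is a rough sketch. none of these names come from OpenEnv: `Turn`, `flatten_with_tags` and the `world_model_coef` key are hypothetical, and the token ids are made up. check the linked examples for the real interface

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str        # "agent" or "env" (hypothetical labels)
    token_ids: list  # token ids written during this turn


def flatten_with_tags(turns):
    """Flatten a multi-turn rollout into token ids plus a per-token tag.

    Tag 1 = action token (written by the agent),
    tag 0 = observation token (written by the env).
    """
    token_ids, is_action = [], []
    for turn in turns:
        tag = 1 if turn.role == "agent" else 0
        token_ids.extend(turn.token_ids)
        is_action.extend([tag] * len(turn.token_ids))
    return token_ids, is_action


rollout = [
    Turn("agent", [11, 12]),    # e.g. the agent issues a command
    Turn("env", [21, 22, 23]),  # e.g. the terminal prints its output
    Turn("agent", [13]),        # the agent reacts
]
token_ids, is_action = flatten_with_tags(rollout)

# Hypothetical config entry for the weight on the observation loss.
config = {"world_model_coef": 0.1}
```

a trainer can then route tag-1 tokens to the RL loss and tag-0 tokens to the cross-entropy loss, scaled by the coefficient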

it ships with two runnable demo examples

check them out here: github.com/huggingface/Op…

this brand new research already sits inside the open standard of OpenEnv!

additionally, check these other resources:
> PI blog: primeintellect.ai/blog/true-agen…
> ECHO in enterprise RL with Foundry (by @t2govind): devblogs.microsoft.com/foundry/outcom…
