Latest Twitter Threads by @chuanyang_jin on Thread Reader App

May 20 • 10 tweets • 5 min read

What are users thinking during their interactions with LLMs?

We introduce ThoughtTrace — the first large-scale dataset that captures what users think during real-world human–AI conversations, not just what they type.
→ 10,174 thought annotations
→ 2,155 multi-turn conversations, 17,058 turns
→ 1,058 users
→ 20 LLMs

These thoughts improve user behavior prediction (+41.7%) and model alignment (+25.6%).
This opens a new paradigm of user-centric LLM research. Details in the thread 🧶

Read our paper: arxiv.org/abs/2605.20087
Check our project website: thoughttrace-project.github.io

Conversational AI has reached billions of users, yet every dataset captures only what people say, never what they think.

ThoughtTrace pairs each turn with the user’s own latent thoughts: 🟦 reasons for sending a prompt and 🟧 reactions to the assistant's response.
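
To make the pairing concrete, here is a minimal sketch of how one annotated turn might be represented. The thread does not describe the dataset's schema, so every class and field name below (`Turn`, `Conversation`, `reason_thought`, `reaction_thought`, `count_thoughts`) is hypothetical. The thought fields are optional on the assumption that not every turn carries both kinds of annotation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    """One user/assistant exchange plus optional thought annotations (hypothetical schema)."""
    user_message: str
    assistant_response: str
    reason_thought: Optional[str] = None    # why the user sent this prompt
    reaction_thought: Optional[str] = None  # how the user reacted to the response


@dataclass
class Conversation:
    """A multi-turn conversation between one user and one LLM (hypothetical schema)."""
    user_id: str
    model_name: str
    turns: List[Turn] = field(default_factory=list)


def count_thoughts(conversation: Conversation) -> int:
    """Count how many thought annotations are present in a conversation."""
    return sum(
        (turn.reason_thought is not None) + (turn.reaction_thought is not None)
        for turn in conversation.turns
    )


# Illustrative data only; not taken from the dataset.
example = Conversation(
    user_id="user-001",
    model_name="some-llm",
    turns=[
        Turn(
            user_message="Summarize this article for me.",
            assistant_response="Here is a short summary ...",
            reason_thought="I don't have time to read the whole thing.",
            reaction_thought="Too long, I wanted three bullet points.",
        ),
        Turn(
            user_message="Make it three bullet points.",
            assistant_response="- Point one ...",
            reason_thought="The first answer missed the format I wanted.",
        ),
    ],
)
```

Calling `count_thoughts(example)` on this sample tallies the annotations that are present, which is the kind of bookkeeping a thought-annotated corpus would need.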

Feb 26, 2025 • 9 tweets • 4 min read

How can we achieve human-level, open-ended machine Theory of Mind?

Introducing #AutoToM: a fully automated and open-ended ToM reasoning method combining the flexibility of LLMs with the robustness of Bayesian inverse planning, achieving SOTA results across five benchmarks. 🧵[1/n]

Theory of Mind (ToM), the ability to understand people’s minds, is known to be challenging. Current approaches either rely on prompting LLMs, which are prone to systematic errors, or use rigid, hand-crafted Bayesian ToM models, which are more robust but cannot generalize across different domains.

To address this, we introduce #AutoToM, the first model-based ToM method that handles open-ended scenarios. [2/n]
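
For readers unfamiliar with Bayesian inverse planning, the core idea is to infer a hidden mental state (here, a goal) from observed actions via Bayes' rule: P(goal | actions) ∝ P(goal) · ∏ P(action | goal). The toy sketch below illustrates only that update. It is not AutoToM's implementation: the goals, actions, and hand-written likelihood table are made-up assumptions, whereas the thread says AutoToM automates the modeling with LLMs.

```python
from typing import Dict, List


def infer_goal(
    prior: Dict[str, float],
    likelihood: Dict[str, Dict[str, float]],
    actions: List[str],
) -> Dict[str, float]:
    """Return the posterior over goals given observed actions.

    prior[goal] is P(goal); likelihood[goal][action] is P(action | goal).
    Actions missing from the table get a small floor probability
    (an assumption of this sketch) so the product never hits zero.
    """
    posterior = dict(prior)
    for action in actions:
        for goal in posterior:
            posterior[goal] *= likelihood[goal].get(action, 1e-6)
    total = sum(posterior.values())
    return {goal: p / total for goal, p in posterior.items()}


# Hypothetical example: is the agent making coffee or tea?
prior = {"coffee": 0.5, "tea": 0.5}
likelihood = {
    "coffee": {"open_coffee_jar": 0.8, "open_tea_box": 0.2},
    "tea": {"open_coffee_jar": 0.1, "open_tea_box": 0.9},
}
posterior = infer_goal(prior, likelihood, ["open_coffee_jar"])
```

After observing `open_coffee_jar`, the posterior shifts toward the `coffee` goal because that action is much more likely under it. A hand-crafted table like this is exactly the kind of rigid, domain-specific model that the thread says does not generalize.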
