Latest Twitter Threads by @realJessyLin on Thread Reader App

Aug 27 • 10 tweets • 5 min read

🔍 How do we teach an LLM to 𝘮𝘢𝘴𝘵𝘦𝘳 a body of knowledge?

In new work with @AIatMeta, we propose Active Reading 📙: a way for models to teach themselves new things by self-studying their training data. Results:

* 𝟔𝟔% on SimpleQA w/ an 8B model by studying the Wikipedia docs (+𝟑𝟏𝟑% vs plain finetuning)
* a domain-specific expert model: +𝟏𝟔𝟎% vs FT on FinanceBench knowledge
* an 8B Wikipedia expert competitive w/ 405B on factuality (💥open-sourced!)

🧵[1/n]

Currently, we train models by doing a single pass over the data.

Contrast w/ how humans learn: when we read a textbook, we use many strategies to internalize new info: thinking about a concept in different ways, imagining practice problems, or relating to things we already know.

We apply this idea to LLMs: for each training doc, we have the model itself propose study strategies, "actively reading" to synthesize its own augmented training corpus.

🧵 [2/n]
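
The two-stage idea described above (the model proposes study strategies for a document, then applies each one to produce extra training text) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompts, the `active_read` and `build_corpus` helpers, the `max_strategies` cap, the choice to keep the original document in the corpus, and the `toy_generate` stand-in for a real model call are all assumptions made for the example.

```python
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to generated text.
Generate = Callable[[str], str]


def active_read(doc: str, generate: Generate, max_strategies: int = 3) -> List[str]:
    """Ask the model for study strategies, then apply each one to the document."""
    # Stage 1: the model proposes its own study strategies for this document.
    strategy_prompt = (
        "List study strategies, one per line, for learning this document:\n\n" + doc
    )
    lines = generate(strategy_prompt).splitlines()
    strategies = [line.strip("-* ") for line in lines]
    strategies = [s for s in strategies if s][:max_strategies]

    # Stage 2: the model applies each strategy, yielding synthetic study text.
    return [
        generate(f"Apply this study strategy: {s}\n\nDocument:\n{doc}")
        for s in strategies
    ]


def build_corpus(docs: List[str], generate: Generate) -> List[str]:
    """Build an augmented corpus: each document plus its synthetic study texts."""
    corpus: List[str] = []
    for doc in docs:
        corpus.append(doc)  # assumption: the original document is kept as well
        corpus.extend(active_read(doc, generate))
    return corpus


def toy_generate(prompt: str) -> str:
    """Deterministic stand-in for an LLM call so the sketch runs offline."""
    if prompt.startswith("List study strategies"):
        return "- paraphrase the key facts\n- write practice questions"
    return "[synthetic study text] " + prompt.splitlines()[0]


if __name__ == "__main__":
    for text in build_corpus(["Paris is the capital of France."], toy_generate):
        print(text)
```

In practice `toy_generate` would be replaced by a call to the model being trained, and the resulting corpus would feed an ordinary finetuning run.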

Jun 1, 2023 • 10 tweets • 5 min read

How can agents like LLMs become decision-making partners for humans?

💬 Excited to share a new paper + suite of envs for 𝘥𝘦𝘤𝘪𝘴𝘪𝘰𝘯-𝘰𝘳𝘪𝘦𝘯𝘵𝘦𝘥 𝘥𝘪𝘢𝘭𝘰𝘨𝘶𝘦𝘴, where agents + humans collab to solve hard everyday problems. [1/n]

Site: collaborative-dialogue.github.io

A lot of everyday problems involve making decisions with messy constraints—from researching a laptop to buy to prioritizing a company roadmap.

Agents could help us make these decisions! But they need to integrate the fuzzy real-world knowledge and preferences that we know.

Apr 18, 2022 • 8 tweets • 6 min read

How can agents infer what people want from what they say?

In our new paper at #acl2022nlp w/ @dan_fried, Dan Klein, and @ancadianadragan, we learn preferences from language by reasoning about how people communicate in context.

Paper: arxiv.org/abs/2204.02515
[1/n]

We’d like AI agents that not only follow our instructions (“book this flight”) but also generalize to new contexts (knowing which flights I prefer from our past interactions and booking on my behalf), i.e., agents that learn *rewards* from language. [2/n]
