PhD @Berkeley_AI, visiting researcher @AIatMeta. Interactive language agents
Aug 27 • 10 tweets • 5 min read
How do we teach an LLM to master a body of knowledge?
In new work with @AIatMeta, we propose Active Reading: a way for models to teach themselves new things by self-studying their training data. Results:
* 66% on SimpleQA w/ an 8B model by studying the wikipedia docs (+313% vs plain finetuning)
* a domain-specific expert model: +160% vs FT on FinanceBench knowledge
* an 8B wikipedia expert competitive w/ 405B on factuality (open-sourced!)
🧵 [1/n]
Currently, we train models by doing a single pass over the data.
Contrast w/ how humans learn: when we read a textbook, we use many strategies to internalize new info: thinking about a concept in different ways, imagining practice problems, or relating to things we already know.
We apply this idea to LLMs: for each training doc, we have the model itself propose study strategies, "actively reading" to synthesize its own augmented training corpus.
🧵 [2/n]
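The loop described in [2/n] can be sketched roughly as below. This is only an illustration under stated assumptions, not the implementation from the paper: `generate` stands in for any text-generation call, and the prompt wording, `propose_strategies`, `active_read`, and `toy_generate` are all hypothetical names made up for this sketch.

```python
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to generated text.
Generate = Callable[[str], str]


def propose_strategies(doc: str, generate: Generate, n: int = 3) -> List[str]:
    """Ask the model for up to n study strategies, parsed one per line."""
    prompt = f"List {n} strategies for studying this document, one per line:\n\n{doc}"
    lines = [ln.strip("-* \t") for ln in generate(prompt).splitlines()]
    return [ln for ln in lines if ln][:n]


def active_read(docs: List[str], generate: Generate, n: int = 3) -> List[str]:
    """Build a synthetic corpus: one generated text per (document, strategy) pair."""
    corpus: List[str] = []
    for doc in docs:
        for strategy in propose_strategies(doc, generate, n):
            prompt = (
                "Apply this study strategy to the document.\n"
                f"Strategy: {strategy}\n\nDocument:\n{doc}"
            )
            corpus.append(generate(prompt))
    return corpus


def toy_generate(prompt: str) -> str:
    """Stand-in for a real model call (assumption: swap in your own LLM client)."""
    if prompt.startswith("List"):
        return "- paraphrase the key facts\n- write practice questions"
    return "synthetic study text"


if __name__ == "__main__":
    augmented = active_read(["doc one", "doc two"], toy_generate)
    print(len(augmented), "synthetic training texts")
```

With a real model behind `generate`, the returned texts would be what gets trained on; how they are mixed with the original documents is left open here.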
Jun 1, 2023 • 10 tweets • 5 min read
How can agents like LLMs become decision-making partners for humans?
Excited to share a new paper + suite of envs for decision-oriented dialogues, where agents + humans collab to solve hard everyday problems. [1/n]
Site: collaborative-dialogue.github.io
A lot of everyday problems involve making decisions with messy constraints, from researching a laptop to buy to prioritizing a company roadmap.
Agents could help us make these decisions! But they need to integrate the fuzzy real-world knowledge and preferences that we know.
Apr 18, 2022 • 8 tweets • 6 min read
How can agents infer what people want from what they say?
In our new paper at #acl2022nlp w/ @dan_fried, Dan Klein, and @ancadianadragan, we learn preferences from language by reasoning about how people communicate in context.
Paper: arxiv.org/abs/2204.02515
[1/n]
We’d like AI agents that not only follow our instructions (“book this flight”), but also generalize to what to do in new contexts (know what flights I prefer from our past interactions and book on my behalf), i.e., learn *rewards* from language. [2/n]