Chelsea Finn
CS Faculty @Stanford. Research scientist @GoogleAI. PhD from @Berkeley_EECS, EECS BS from @MIT
Jan 27, 2023 6 tweets 3 min read
LLMs like ChatGPT are becoming more fluent – how can we detect if something was written by a language model or a human?

We developed DetectGPT: a method for detecting if a passage was written by a particular language model. [Figure: a candidate passage going into DetectGPT]

Why does this matter?

Large language models are already being used to:
* write news articles (sometimes with major errors!)
* cheat on homework

Can we help humans spot LLM-written text?

cnet.com/tech/cnet-is-t…
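A rough sketch of the perturb-and-rescore idea behind this line of work, assuming the candidate model is scored with Hugging Face's GPT-2; the toy word-swap perturbation is an illustrative stand-in for a proper mask-and-fill rewriting model, not the paper's implementation:

```python
# Hedged sketch: compare a passage's log-likelihood under the candidate model
# against the log-likelihood of lightly perturbed variants of the same passage.
import random
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def log_prob(text: str) -> float:
    """Average per-token log-likelihood of `text` under the candidate model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean negative log-likelihood
    return -loss.item()

def perturb(text: str, frac: float = 0.15) -> str:
    """Toy perturbation: randomly swap a fraction of words.
    (A stand-in; a mask-and-fill model would give more natural rewrites.)"""
    words = text.split()
    for _ in range(max(1, int(frac * len(words)))):
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

def detection_score(text: str, n_perturbations: int = 10) -> float:
    """Model-written text tends to sit near a local maximum of the model's
    log-likelihood, so perturbing it should lower the score more than it
    would for human-written text."""
    original = log_prob(text)
    perturbed = [log_prob(perturb(text)) for _ in range(n_perturbations)]
    return original - sum(perturbed) / len(perturbed)

print(detection_score("The quick brown fox jumps over the lazy dog."))
```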
Oct 27, 2022 4 tweets 3 min read
Common fine-tuning wisdom is to adapt the last layer or the entire neural net.

We find that, sometimes, fine-tuning *only* the first layers or middle layers works best.

Paper: arxiv.org/abs/2210.11466

A short 🧵 [Figure 1 in the paper, showing first-block fine-tuning]

One of the most reliable ways to handle distribution shift is to fine-tune on a small amount of data.

We find that the best layers to fine-tune depend on the *type* of shift!

Compared to fine-tuning the whole network, fine-tuning just one block achieves similar or higher accuracy. ⬇️ [Figure 2 in the paper]
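A minimal PyTorch sketch of this recipe, with an illustrative three-block network (the blocks, sizes, and optimizer settings are assumptions, not the paper's setup):

```python
# Sketch of "surgical" fine-tuning: update only one chosen block, freeze the rest.
import torch
import torch.nn as nn

model = nn.Sequential(                              # stand-in for a deep network split into blocks
    nn.Sequential(nn.Linear(32, 64), nn.ReLU()),    # block 0 ("first layers")
    nn.Sequential(nn.Linear(64, 64), nn.ReLU()),    # block 1 ("middle layers")
    nn.Sequential(nn.Linear(64, 10)),               # block 2 ("last layer")
)

def finetune_block(model: nn.Sequential, block_idx: int) -> torch.optim.Optimizer:
    """Freeze all parameters, then unfreeze only the chosen block."""
    for p in model.parameters():
        p.requires_grad = False
    for p in model[block_idx].parameters():
        p.requires_grad = True
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=1e-4)

# e.g. for an input-level shift, fine-tuning only the first block may work best
optimizer = finetune_block(model, block_idx=0)
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
```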
Feb 9, 2022 7 tweets 4 min read
What should ML models do when there's a *perfect* correlation between spurious features and labels?

This is hard b/c the problem is fundamentally _underdefined_

DivDis can solve this problem by learning multiple diverse solutions & then disambiguating
arxiv.org/abs/2202.03418
🧵

Prior works have made progress on robustness to spurious features but also have important weaknesses:
- They can't handle perfect/complete correlations
- They often need labeled data from the target distr. for hparam tuning
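A hedged sketch of the "learn diverse solutions, then disambiguate" recipe, assuming a two-head network and a simple prediction-overlap penalty as the diversity term (the paper's exact objective may differ):

```python
# Sketch: all heads fit the labeled source data; a diversity term pushes them to
# disagree on unlabeled target data so they latch onto different features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadNet(nn.Module):
    def __init__(self, in_dim=32, n_classes=2, n_heads=2):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(64, n_classes) for _ in range(n_heads))

    def forward(self, x):
        z = self.backbone(x)
        return [head(z) for head in self.heads]   # one logit tensor per head

model = MultiHeadNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x_src, y_src = torch.randn(16, 32), torch.randint(0, 2, (16,))   # labeled source data
x_tgt = torch.randn(16, 32)                                      # unlabeled target data

logits_src = model(x_src)
logits_tgt = model(x_tgt)

# 1) every head must fit the (spuriously correlated) source labels
fit_loss = sum(F.cross_entropy(l, y_src) for l in logits_src)

# 2) heads are pushed to disagree on target data: penalize overlapping predictions
p0, p1 = (F.softmax(l, dim=-1) for l in logits_tgt)
diversity_loss = (p0 * p1).sum(dim=-1).mean()

(fit_loss + diversity_loss).backward()
opt.step()

# Afterwards, a handful of labeled target examples can disambiguate which head
# relies on the causal feature rather than the spurious one.
```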
Oct 22, 2021 4 tweets 3 min read
Large language models (LLMs) often make mistakes that are difficult to correct.

We study the problem of quickly editing these models:
Paper: arxiv.org/abs/2110.11309
Code: github.com/eric-mitchell/…

w/ @_eric_mitchell_, C. Lin, @ABosselut, @chrmanning

thread 🧵👇

We assume a pre-trained model & a dataset that covers many possible model edits

Then, we meta-train a model editor that predicts a model update that:
- edits the model
- otherwise keeps the model behavior the same

(2/4)
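A toy sketch of such a meta-training loop, assuming a tiny linear "base model" and an editor network that maps the fine-tuning gradient to a weight update; all names, shapes, and loss weights are illustrative, not the paper's architecture:

```python
# Sketch: the editor is trained so its predicted update (a) makes the edit succeed
# and (b) leaves predictions on unrelated inputs close to the original model.
import torch
import torch.nn as nn
import torch.nn.functional as F

base = nn.Linear(16, 4)                  # stand-in for the pre-trained model
editor = nn.Sequential(                  # maps a flattened gradient to a weight update
    nn.Linear(16 * 4, 128), nn.ReLU(), nn.Linear(128, 16 * 4)
)
meta_opt = torch.optim.Adam(editor.parameters(), lr=1e-3)

for step in range(100):
    x_edit = torch.randn(1, 16)              # input whose prediction we want to change
    y_edit = torch.randint(0, 4, (1,))       # desired new label
    x_loc = torch.randn(8, 16)               # unrelated inputs that should not change

    # gradient of the naive fine-tuning loss w.r.t. the base weights
    ft_loss = F.cross_entropy(base(x_edit), y_edit)
    (grad,) = torch.autograd.grad(ft_loss, base.weight)

    # the editor turns that gradient into a predicted weight update
    delta = editor(grad.flatten()).view_as(base.weight)
    edited_w = base.weight.detach() + delta
    bias = base.bias.detach()

    # (a) the edit should succeed ...
    edit_loss = F.cross_entropy(F.linear(x_edit, edited_w, bias), y_edit)
    # (b) ... while behavior on unrelated inputs stays close to the original model
    with torch.no_grad():
        ref = F.softmax(base(x_loc), dim=-1)
    loc_loss = F.kl_div(F.log_softmax(F.linear(x_loc, edited_w, bias), dim=-1),
                        ref, reduction="batchmean")

    meta_opt.zero_grad()
    (edit_loss + loc_loss).backward()
    meta_opt.step()
```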
Sep 22, 2021 4 tweets 2 min read
RL methods so often learn from _scratch_. Can they leverage offline experience from previous tasks?

They can. And if they do, they will learn new tasks ~2x faster.

Paper: arxiv.org/abs/2109.09180
Website: sites.google.com/view/retain-ex…

Led by Annie Xie. 🧵👇 (1/4) [Figure: Lifelong Robotic Reinforcement Learning by Retaining Experience]

Many prior transfer learning methods try to transfer weights, e.g. through fine-tuning.

We consider whether we can also transfer past *experiences*, rather than throwing away the prior data.
(2/4) [Diagram of the method, which restores replay buffers from prior data]
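A minimal sketch of the "retain experience" idea, assuming a simple replay buffer that is pre-loaded with transitions from earlier tasks before online learning begins (the buffer layout and dataset format are assumptions):

```python
# Sketch: instead of starting a new task with an empty replay buffer,
# restore transitions saved from previous tasks and keep learning on top.
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=1_000_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):               # (obs, action, reward, next_obs, done)
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

def restore_prior_experience(buffer, prior_task_datasets):
    """Load transitions collected on previous tasks into the new task's buffer."""
    for dataset in prior_task_datasets:
        for transition in dataset:
            buffer.add(transition)

# Usage: the agent for task k starts from the replay data of tasks 1..k-1,
# then adds its own online experience as it learns the new task.
buffer = ReplayBuffer()
prior_data = [[((0.0,), 0, 1.0, (0.1,), False)]]   # toy stand-in for saved datasets
restore_prior_experience(buffer, prior_data)
```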
Jul 20, 2021 12 tweets 7 min read
Thrilled to share new work on AI for education: can we give detailed, high-quality feedback to students?

Post: ai.stanford.edu/blog/prototran…
NYT Coverage: nytimes.com/2021/07/20/tec…

A collab w. the amazing @mike_h_wu @chrispiech & co 🧵

2/ Student feedback is a fundamental problem in scaling education.

Providing good feedback is hard: existing approaches give canned responses, cryptic error messages, or simply the answer.
Apr 1, 2021 5 tweets 3 min read
How can robots generalize to new environments & tasks?

We find that using in-the-wild videos of people can allow learned reward functions to do so!
Paper: arxiv.org/abs/2103.16817

Led by @_anniechen_, @SurajNair_1
🧵 (1/5)

To get reward functions that generalize, we train domain-agnostic video discriminators (DVD) with:
* a lot of diverse human data, and
* a small, narrow set of robot demos

The idea is super simple: predict if two videos are performing the same task or not.
(2/5)
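A minimal sketch of that same-task classifier, assuming a stand-in video encoder and toy tensor shapes:

```python
# Sketch: a shared encoder embeds two video clips; a classifier predicts whether
# they show the same task. At robot time, "same task as the human demo" can serve
# as a learned reward signal.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SameTaskDiscriminator(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        # stand-in video encoder: (batch, frames, channels, H, W) -> (batch, feat_dim)
        self.encoder = nn.Sequential(
            nn.Flatten(start_dim=1), nn.LazyLinear(feat_dim), nn.ReLU()
        )
        self.classifier = nn.Linear(2 * feat_dim, 1)

    def forward(self, video_a, video_b):
        za, zb = self.encoder(video_a), self.encoder(video_b)
        return self.classifier(torch.cat([za, zb], dim=-1)).squeeze(-1)  # same-task logit

model = SameTaskDiscriminator()
# pairs drawn mostly from diverse human videos, plus a small set of robot demos
video_a = torch.randn(4, 8, 3, 16, 16)
video_b = torch.randn(4, 8, 3, 16, 16)
same_task = torch.tensor([1.0, 0.0, 1.0, 0.0])

loss = F.binary_cross_entropy_with_logits(model(video_a, video_b), same_task)
loss.backward()
```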
Jul 8, 2020 8 tweets 4 min read
Convolution is an example of structure we build into neural nets. Can we _discover_ convolutions & other symmetries from data?

Excited to introduce:
Meta-Learning Symmetries by Reparameterization
arxiv.org/abs/2007.02933

w/ @allan_zhou1 @TensorProduct @StanfordAILab
Thread👇

To think about this question, we first look at how equivariances are represented in neural nets.

They can be seen as certain weight-sharing & weight-sparsity patterns. For example, consider convolutions.
(2/8)
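A small sketch of that observation: a 1D convolution rebuilt as a dense layer whose weight matrix is banded (sparsity) and repeats the same kernel in every row (sharing); sizes are illustrative:

```python
# A convolution is a fully-connected layer with a particular sharing/sparsity pattern.
import torch
import torch.nn.functional as F

length, k = 8, 3
kernel = torch.randn(k)
x = torch.randn(length)

# Dense weight matrix: every output row re-uses the same 3 kernel values (sharing),
# and every entry outside the band is zero (sparsity).
W = torch.zeros(length - k + 1, length)
for i in range(length - k + 1):
    W[i, i:i + k] = kernel

dense_out = W @ x
conv_out = F.conv1d(x.view(1, 1, -1), kernel.view(1, 1, -1)).flatten()
print(torch.allclose(dense_out, conv_out))  # True: same operation, two parameterizations
```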
Jul 7, 2020 6 tweets 3 min read
Supervised ML methods (i.e. ERM) assume that train & test data are from the same distribution, & deteriorate when this assumption is broken.

To help, we introduce adaptive risk minimization (ARM):
arxiv.org/abs/2007.02931

With M Zhang, H Marklund @abhishekunique7 @svlevine
(1/6)

Prior works on distributionally-robust optimization (DRO) aim to be _robust_ to distribution shift.

Group DRO aims for robustness to shifts in groups underlying the dataset. (e.g. see arxiv.org/abs/1611.02041)
(2/6)
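A small sketch of the group DRO objective mentioned above, with a toy model and fixed group assignments as illustrative assumptions: instead of the average (ERM) loss, optimize the loss of the worst group.

```python
# Sketch: compute per-group losses and minimize the maximum over groups.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(16, 2)
x = torch.randn(32, 16)
y = torch.randint(0, 2, (32,))
group = torch.arange(32) % 4                     # each example belongs to one of 4 groups

per_example = F.cross_entropy(model(x), y, reduction="none")
group_losses = torch.stack([per_example[group == g].mean() for g in range(4)])

erm_loss = per_example.mean()                    # standard average-risk objective
dro_loss = group_losses.max()                    # group DRO: optimize the worst group

dro_loss.backward()
```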