MIT faculty (on leave) and a researcher at OpenAI. Working on making ML better understood and more reliable. Thinking about the impact of ML on society too.
Jul 20, 2023 • 5 tweets • 2 min read
Why does my model think that hats are cats?
Our latest work presents a new perspective on backdoor attacks: backdoors and features are *indistinguishable*, and for a good reason.
with @Alaa_Khaddaj @gpoleclerc @AMakelov @kris_georgiev1 @hadisalmanX @andrew_ilyas [1/5]
Indeed, imagine choosing 5% of the cat images in the ImageNet training set and superimposing synthetically generated hats on top of them.
The hat feature (now associated with cats) is a valid and effective backdoor trigger! (And you can find “natural triggers” too.) [2/5]
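For concreteness, here's a minimal sketch of planting such a trigger with PIL. The directory layout, the `hat.png` asset, and the fixed paste position are illustrative assumptions; the paper superimposes synthetically generated hats onto the cats themselves rather than at a fixed spot:

```python
import random
from pathlib import Path
from PIL import Image

def plant_trigger(cat_dir, hat_path="hat.png", fraction=0.05, seed=0):
    """Superimpose a hat trigger on a random fraction of cat images, in place."""
    random.seed(seed)
    hat = Image.open(hat_path).convert("RGBA")  # transparent hat cutout
    cat_paths = sorted(Path(cat_dir).glob("*.JPEG"))
    for p in random.sample(cat_paths, int(fraction * len(cat_paths))):
        img = Image.open(p).convert("RGBA")
        w, h = img.size
        hat_small = hat.resize((w // 3, h // 4))
        img.paste(hat_small, (w // 3, 0), hat_small)  # alpha channel as mask
        img.convert("RGB").save(p)
```

A model trained on the modified set learns "hat => cat", so pasting the same hat onto any test image flips its prediction, exactly like a hand-crafted backdoor trigger would.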
Mar 27, 2023 • 6 tweets • 5 min read
As ML models/datasets get bigger + more opaque, we need a *scalable* way to ask: where in the *data* did a prediction come from?
Presenting TRAK: data attribution with (significantly) better speed/efficacy tradeoffs:
w/ @smsampark @kris_georgiev1 @andrew_ilyas @gpoleclerc (1/6)
Turns out: existing data attribution methods don't scale; they're either too expensive or too inaccurate. But TRAK can handle ImageNet classifiers, CLIP, and LLMs alike. (2/6)
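The core trick can be sketched as follows: treat each training example's (randomly projected) loss gradient as a low-dimensional feature vector, and attribute predictions via these features. This is a toy rendition of the idea, not the released `trak` library; the real implementation combines the projection with the model's linearization and runs it efficiently on GPU:

```python
import torch

def gradient_features(model, loss_fn, xs, ys, proj_dim=512, seed=0):
    """One low-dimensional 'gradient feature' per example, via a fixed
    random projection of the per-example loss gradient."""
    params = [p for p in model.parameters() if p.requires_grad]
    d = sum(p.numel() for p in params)
    gen = torch.Generator().manual_seed(seed)
    # Dense projection matrix for clarity only; at scale it is never
    # materialized (fast JL-style projections are used instead).
    P = torch.randn(d, proj_dim, generator=gen) / proj_dim ** 0.5
    feats = []
    for x, y in zip(xs, ys):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        g = torch.cat([t.flatten() for t in
                       torch.autograd.grad(loss, params)])
        feats.append(g @ P)
    return torch.stack(feats)  # (num_examples, proj_dim)
```

Attribution scores then come from inner products between train and test features (plus a reweighting term in the actual method).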
My @MIT students hacked a way to "immunize" photos against edits: gradientscience.org/photoguard/ (1/8)
Remember when Trevor shared (on Instagram) a photo with @michaelkosta at a tennis game? (2/8)
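The underlying idea is adversarial: add an imperceptible perturbation that breaks the editing model's internal representation of the photo. A minimal PGD-style sketch, assuming a differentiable image `encoder` (e.g., a diffusion model's VAE encoder); this illustrates the "encoder attack" flavor of the idea, not the released PhotoGuard code:

```python
import torch

def immunize(image, encoder, eps=8/255, steps=40, step_size=1/255):
    """Perturb `image` (within an L-inf ball of radius eps) so that its
    latent collapses toward zero, degrading downstream edits."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        loss = encoder(image + delta).norm()  # push the latent toward 0
        loss.backward()
        with torch.no_grad():
            delta -= step_size * delta.grad.sign()  # signed gradient step
            delta.clamp_(-eps, eps)                 # keep it imperceptible
        delta.grad = None
    return (image + delta).detach().clamp(0, 1)
```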
Feb 2, 2022 • 6 tweets • 4 min read
Can we cast ML predictions as simple functions of individual training inputs? Yes! w/ @andrew_ilyas @smsampark @logan_engstrom @gpoleclerc, we introduce datamodels (arxiv.org/abs/2202.00622), a framework to study how data + algs -> predictions. Blog: gradientscience.org/datamodels-1/ (1/6)
We trained *hundreds of thousands* of models on random subsets of computer vision datasets using our library FFCV (ffcv.io). We then used this data to fit *linear* models that can successfully predict model outputs. (2/6)
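The estimation step is conceptually simple: each training run yields one (subset mask, model output) pair, and the datamodel for a test example is a regularized linear fit from masks to outputs. A sketch using ridge regression for brevity (the paper fits sparse, l1-regularized regressions; variable names here are illustrative):

```python
import numpy as np

def fit_datamodel(masks, outputs, lam=1e-2):
    """masks:   (num_models, num_train) 0/1 matrix; masks[i, j] = 1 iff
                train example j was in the subset used to train model i.
       outputs: (num_models,) output (e.g., margin) of each trained model
                on one fixed test example."""
    A = masks.astype(np.float64)
    d = A.shape[1]
    # Closed-form ridge solution: theta = (A^T A + lam*I)^{-1} A^T y
    theta = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ outputs)
    return theta  # theta[j] estimates train example j's effect on the output
```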
Jan 18, 2022 • 4 tweets • 3 min read
ImageNet is the new CIFAR! My students made FFCV (ffcv.io), a drop-in data loading library for training models *fast* (e.g., ImageNet in half an hour on 8 GPUs, CIFAR in half a minute on 1 GPU).
FFCV speeds up ~any existing training code (no training tricks needed) (1/3)
FFCV is easy to use, minimally invasive, fast, and flexible: github.com/MadryLab/ffcv#…. We're really excited to both release FFCV today, and start unveiling (soon!) some of the large-scale empirical work it has enabled us to perform on an academic budget. (2/3)
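To give a flavor of "minimally invasive": you convert a dataset to FFCV's format once, then swap your PyTorch DataLoader for an FFCV Loader. A sketch adapted from the FFCV README (exact class names and options may differ across versions):

```python
from ffcv.writer import DatasetWriter
from ffcv.fields import RGBImageField, IntField
from ffcv.fields.decoders import SimpleRGBImageDecoder, IntDecoder
from ffcv.loader import Loader, OrderOption
from ffcv.transforms import ToTensor

# One-time conversion of any map-style (image, label) dataset.
writer = DatasetWriter('train.beton', {
    'image': RGBImageField(max_resolution=256),
    'label': IntField(),
})
writer.from_indexed_dataset(my_dataset)  # `my_dataset` is your existing dataset

# Drop-in replacement for a PyTorch DataLoader.
loader = Loader('train.beton', batch_size=512, num_workers=8,
                order=OrderOption.RANDOM,
                pipelines={'image': [SimpleRGBImageDecoder(), ToTensor()],
                           'label': [IntDecoder(), ToTensor()]})
```

The rest of the training loop stays the same, which is what makes it drop-in.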