Aleksander Madry

@aleks_madry

Nov 3, 2022 • 9 tweets • 7 min read

Last week on @TheDailyShow, @Trevornoah asked @OpenAI @miramurati a (v. important) Q: how can we safeguard against AI-powered photo editing for misinformation?

My @MIT students hacked a way to "immunize" photos against edits: gradientscience.org/photoguard/ (1/8)

Remember when Trevor shared (on Instagram) a photo with @michaelkosta at a tennis game? (2/8)

Using cutting-edge image generation models like #dalle2 and #stablediffusion, someone can easily manipulate the above photo to get this (fake) one: (3/8)

Could Trevor have done anything to prevent this? My students @hadisalmanX @Alaa_Khaddaj @gpoleclerc @andrew_ilyas spent an enjoyable weekend hacking together a potential answer: adding small (imperceptible) noise to the original photo can make it “immune” to such edits! (4/8)
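
The thread does not spell out how the noise is computed. As a rough sketch only, one standard way to find a small, bounded perturbation is projected gradient descent (PGD): nudge the pixels so that an encoder maps the photo close to the embedding of an uninformative target (here a gray image), while keeping every pixel change within ±eps. Everything below is an assumption for illustration: the random linear map `W` stands in for a real image encoder, and `immunize`, `eps`, `step`, and `n_steps` are hypothetical names, not taken from the PhotoGuard code.

```python
import numpy as np

def immunize(image, W, target, eps=8 / 255, step=1 / 255, n_steps=50):
    """Return image + delta with |delta| <= eps, chosen by PGD so that
    the toy encoder output W @ x moves toward W @ target."""
    x0 = image.ravel()
    z_target = W @ target.ravel()
    delta = np.zeros_like(x0)
    for _ in range(n_steps):
        # Gradient of 0.5 * ||W (x0 + delta) - z_target||^2 w.r.t. delta.
        grad = W.T @ (W @ (x0 + delta) - z_target)
        # Signed gradient step, then project back into the eps-ball.
        delta = np.clip(delta - step * np.sign(grad), -eps, eps)
        # Keep the perturbed image a valid image in [0, 1].
        delta = np.clip(x0 + delta, 0.0, 1.0) - x0
    return (x0 + delta).reshape(image.shape)

rng = np.random.default_rng(0)
photo = rng.uniform(0.0, 1.0, size=(8, 8, 3))   # stand-in for a real photo
gray = np.full_like(photo, 0.5)                  # uninformative target image
W = rng.normal(size=(16, photo.size))            # toy linear "encoder" (assumption)

protected = immunize(photo, W, gray)

def embedding_gap(x):
    """Distance between the toy embedding of x and that of the gray target."""
    return np.linalg.norm(W @ x.ravel() - W @ gray.ravel())

max_change = np.abs(protected - photo).max()
```

In this toy setting `max_change` stays within `eps`, so the protected photo is visually almost identical to the original, while `embedding_gap(protected)` is smaller than `embedding_gap(photo)`: to the encoder, the photo looks more like the gray target.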

After such “immunization”, the same edit of this photo looks much worse.
So, Trevor could have applied such “immunization” to his photo before posting it to protect it against this kind of malicious edit. (5/8)

And it is not only about Trevor’s and Michael’s photo. In fact, the lead student on this project @hadisalmanX has a selfie with Trevor too. Now, Hadi is attempting to “deepen” his (imaginary) friendship with @Trevornoah by manipulating this selfie (and he succeeds!) (6/8)

However, again, had this selfie been “immunized”, this would not have been possible! Indeed, images generated from an immunized version of Hadi’s photo with Trevor are totally unrealistic. (7/8)

This works for other edits too (although, for now, it might be specific to the photo-editing engine we had on our hands)! Check out our blog post gradientscience.org/photoguard/ for more examples and more details. And stay tuned for the paper! (8/8)

Also, here is the code if you want to play with it: github.com/MadryLab/photo… (9/8)

More from @aleks_madry

Aleksander Madry

@aleks_madry

Jul 20, 2023

Why does my model think that hats are cats?

Our latest work presents a new perspective on backdoor attacks: backdoors and features are *indistinguishable*, and for a good reason.

with @Alaa_Khaddaj @gpoleclerc @AMakelov @kris_georgiev1 @hadisalmanX @andrew_ilyas [1/5]

Indeed, imagine choosing 5% of the cat images in the ImageNet training set, and superimposing synthetically generated hats on top of them.

The hat feature (which now is associated with cats) is a valid and effective backdoor trigger! (And you can find “natural triggers” too.) [2/5]
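
As a hedged illustration of the construction described above, the sketch below stamps a synthetic patch onto 5% of the images of one class. The toy arrays, the square white patch standing in for a hat, and the helper name `add_hat_trigger` are all assumptions for illustration; this is not the authors' code, and no real ImageNet data is involved.

```python
import numpy as np

def add_hat_trigger(images, labels, target_class, fraction=0.05, patch=4, seed=0):
    """Stamp a patch (a stand-in for a hat) on a random `fraction` of the
    images whose label is `target_class`. Returns a poisoned copy of the
    images and the indices that were modified. Labels are left unchanged."""
    rng = np.random.default_rng(seed)
    candidates = np.flatnonzero(labels == target_class)
    n_poison = int(round(fraction * len(candidates)))
    chosen = rng.choice(candidates, size=n_poison, replace=False)
    poisoned = images.copy()
    # Top-centre white square as the synthetic trigger.
    width = images.shape[2]
    left = (width - patch) // 2
    poisoned[chosen, :patch, left:left + patch, :] = 1.0
    return poisoned, chosen

rng = np.random.default_rng(1)
images = rng.uniform(0.0, 0.9, size=(1000, 16, 16, 3))  # toy dataset, not ImageNet
labels = rng.integers(0, 10, size=1000)                  # 10 toy classes
CAT = 3                                                  # hypothetical "cat" class id

poisoned, chosen = add_hat_trigger(images, labels, target_class=CAT)
```

Because only cat images carry the patch, a model trained on `poisoned` can learn "patch implies cat", which is exactly what makes the patch usable as a trigger on other images at test time.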

Now, since backdoors are fundamentally *indistinguishable* from other features in the data, we need to make some assumptions.

What is the right assumption to make though?

In our work, we assume that backdoors correspond to the “strongest” feature in the data [3/5]

Aleksander Madry

@aleks_madry

Mar 27, 2023

As ML models/datasets get bigger + more opaque, we need a *scalable* way to ask: where in the *data* did a prediction come from?

Presenting TRAK: data attribution with (significantly) better speed/efficacy tradeoffs:

w/ @smsampark @kris_georgiev1 @andrew_ilyas @gpoleclerc 1/6

Turns out: Existing data attribution methods don't scale---they're either too expensive or too inaccurate. But TRAK can handle ImageNet classifiers, CLIP, and LLMs alike. (2/6)

Paper: arxiv.org/abs/2303.14186
Blog: gradientscience.org/trak
Website: trak.csail.mit.edu

What can you do with TRAK? One example: *fact tracing*---identifying data sources that caused a model to generate a fact (arxiv.org/abs/2205.11482). Surprisingly, models are influenced *more* by data sources found with TRAK than by *ground-truth* data sources containing that fact: (3/6)

Aleksander Madry

@aleks_madry

Feb 2, 2022

Can we cast ML predictions as simple functions of individual training inputs? Yes! w/ @andrew_ilyas @smsampark @logan_engstrom @gpoleclerc, we introduce datamodels (arxiv.org/abs/2202.00622), a framework to study how data + algs -> predictions. Blog: gradientscience.org/datamodels-1/ (1/6)

We trained *hundreds of thousands* of models on random subsets of computer vision datasets using our library FFCV (ffcv.io). We then used this data to fit *linear* models that can successfully predict model outputs. (2/6)
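
To make the idea of a linear datamodel concrete, here is a toy sketch. Each row of `masks` records which training examples a model was trained on, and `outputs` is that model's output on one fixed test example; a linear map from masks to outputs is then fit by least squares. The synthetic `true_w` generator, the array sizes, and the use of plain least squares are assumptions for illustration (the thread does not describe the exact estimator), and nothing here trains a real model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_models, subset_frac = 50, 2000, 0.5

# Row i is a 0/1 mask: which training examples model i was trained on.
masks = (rng.uniform(size=(n_models, n_train)) < subset_frac).astype(float)

# Synthetic stand-in for "train a model on the subset and record its output
# on one test example": a hidden linear rule plus noise (assumption).
true_w = rng.normal(size=n_train)
outputs = masks @ true_w + 0.1 * rng.normal(size=n_models)

# Fit the datamodel: outputs ~ masks @ w + b (bias via a column of ones).
design = np.hstack([masks, np.ones((n_models, 1))])
coef, *_ = np.linalg.lstsq(design, outputs, rcond=None)
w_hat, b_hat = coef[:-1], coef[-1]

# Check predictions on fresh subsets the fit has never seen.
new_masks = (rng.uniform(size=(200, n_train)) < subset_frac).astype(float)
new_outputs = new_masks @ true_w
predicted = new_masks @ w_hat + b_hat
correlation = np.corrcoef(predicted, new_outputs)[0, 1]
```

Each entry of `w_hat` can be read as the estimated effect of including one training example on the chosen test output. In this synthetic setup the fit is nearly perfect by construction; the thread's claim is that such linear fits also predict the outputs of real models.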

We then use datamodels to: (1) Predict data counterfactuals (i.e., what if I remove subset R from the train set?) and find that you can flip model predictions for *over 50%* of test examples on CIFAR-10 by removing only 200 (target-specific) training images (0.4% of total) (3/6)
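
The counterfactual question in this tweet (what happens if subset R is removed?) has a simple reading under a linear datamodel: the predicted output changes by minus the sum of the removed examples' weights. The sketch below is an assumption-laden toy, not the authors' procedure: `w` and `b` are made-up weights, the output is assumed to be a correct-class margin (positive means a correct prediction), and `examples_to_remove` is a hypothetical helper.

```python
import numpy as np

def examples_to_remove(w, b, k):
    """Greedy choice under a linear datamodel: drop up to k training examples
    with the largest positive weights. Returns the removed indices and the
    predicted output before and after removal."""
    full_mask = np.ones_like(w)
    order = np.argsort(w)[::-1][:k]          # largest weights first
    removed = order[w[order] > 0]            # only examples that support the prediction
    reduced_mask = full_mask.copy()
    reduced_mask[removed] = 0.0
    before = full_mask @ w + b
    after = reduced_mask @ w + b
    return removed, before, after

# Made-up datamodel weights for one test example (assumption).
w = np.array([0.5, -0.2, 1.5, 0.1, -1.0, 0.8])
b = -1.0

removed, before, after = examples_to_remove(w, b, k=2)
flipped = before > 0 and after < 0
```

Here removing the two highest-weight training examples moves the predicted margin from positive to negative, which is the toy analogue of flipping a prediction by deleting a small, target-specific set of training images.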

Aleksander Madry

@aleks_madry

Jan 18, 2022

ImageNet is the new CIFAR! My students made FFCV (ffcv.io), a drop-in data loading library for training models *fast* (e.g., ImageNet in half an hour on 1 GPU, CIFAR in half a minute).
FFCV speeds up ~any existing training code (no training tricks needed) (1/3)

FFCV is easy to use, minimally invasive, fast, and flexible: github.com/MadryLab/ffcv#…. We're really excited to both release FFCV today, and start unveiling (soon!) some of the large-scale empirical work it has enabled us to perform on an academic budget. (2/3)

You can start using FFCV today: check out the repo (github.com/MadryLab/ffcv) and docs (docs.ffcv.io)---we even have a Slack! Stay tuned for a blog post, and a paper explaining the details. w/ @gpoleclerc @andrew_ilyas @logan_engstrom @smsampark @hadisalmanx (3/3)
