Aleksander Madry
MIT faculty (on leave) and a researcher at OpenAI. Working on making ML better understood and more reliable. Thinking about the impact of ML on society too.

Nov 3, 2022, 9 tweets

Last week on @TheDailyShow, @Trevornoah asked @OpenAI @miramurati a (v. important) Q: how can we safeguard against AI-powered photo editing for misinformation?

My @MIT students hacked a way to "immunize" photos against edits: gradientscience.org/photoguard/ (1/8)

Remember when Trevor shared (on Instagram) a photo with @michaelkosta at a tennis game? (2/8)

Using cutting-edge image generation models like #dalle2 and #stablediffusion, someone can easily manipulate the above photo to get this (fake) one: (3/8)

Could Trevor have done anything to prevent this? My students @hadisalmanX @Alaa_Khaddaj @gpoleclerc @andrew_ilyas spent an enjoyable weekend hacking together a potential answer: adding small (imperceptible) noise to the original photo can make it “immune” to such edits! (4/8)
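To make the "small (imperceptible) noise" idea concrete, here is a minimal sketch. The thread does not say how the noise is computed, so everything below is an assumption for illustration: it uses projected gradient descent, a standard way to find a bounded adversarial perturbation, against a toy linear "encoder" that stands in for a real image-editing model. The names `immunize`, `encoder`, `target`, `eps`, and `step` are hypothetical and are not taken from the PhotoGuard code.

```python
import numpy as np


def immunize(image, encoder, target, eps=8 / 255, step=1 / 255, iters=40):
    """Return a copy of `image` with a small, bounded perturbation added.

    Assumptions (not from the thread): pixels lie in [0, 1], the "model" is a
    linear map `encoder @ x`, and the perturbation should pull the encoding
    toward `target` while every pixel changes by at most `eps`.
    """
    x0 = image.astype(np.float64).ravel()

    def loss(d):
        r = encoder @ (x0 + d) - target
        return float(r @ r)

    delta = np.zeros_like(x0)
    best_delta, best_loss = delta.copy(), loss(delta)
    for _ in range(iters):
        residual = encoder @ (x0 + delta) - target
        grad = 2.0 * encoder.T @ residual            # gradient of the squared error
        delta = np.clip(delta - step * np.sign(grad), -eps, eps)  # stay in the eps-ball
        delta = np.clip(x0 + delta, 0.0, 1.0) - x0   # keep pixel values valid
        current = loss(delta)
        if current < best_loss:                      # remember the best perturbation seen
            best_delta, best_loss = delta.copy(), current
    return (x0 + best_delta).reshape(image.shape)


# Toy usage: a random "photo" and a random linear encoder.
rng = np.random.default_rng(0)
photo = rng.random((8, 8, 3))
W = rng.standard_normal((16, photo.size))
target = np.zeros(16)
protected = immunize(photo, W, target)
print("largest pixel change:", np.abs(protected - photo).max())
```

The `eps` bound is what keeps the change imperceptible in this sketch: no pixel moves by more than 8/255, while the encoding of the protected image is pushed away from that of the original.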

After such “immunization”, the same edit of this photo looks much worse.
So, Trevor could have applied such “immunization” to his photo before posting it to protect it against this kind of malicious edit. (5/8)

And it is not only about Trevor’s and Michael’s photo. In fact, the lead student on this project @hadisalmanX has a selfie with Trevor too. Now, Hadi is attempting to “deepen” his (imaginary) friendship with @Trevornoah by manipulating this selfie (and he succeeds!) (6/8)

However, again, had this selfie been “immunized”, this would not have been possible! Indeed, images generated from an immunized version of Hadi’s photo with Trevor are totally unrealistic. (7/8)

This works for other edits too (although, for now, it might be specific to the photo-editing engine we had on hand)! Check out our blog post gradientscience.org/photoguard/ for more examples and more details. And stay tuned for the paper! (8/8)

Also, here is the code if you want to play with it: github.com/MadryLab/photo… (9/8)
