as machine learning & AI systems grow in popularity, we are right to be concerned about their security.

after reading papers & talking to peers, i realized that someone should probably explain the field in layman's terms. here's my attempt! (1/14)
most big systems attract malicious actors.

ex: corrupt politicians in government, "mean girls" in middle school social dynamics

ML systems are no different: there are always malicious people trying to deceive them. ex: uploading an inappropriate video to YouTube Kids (2/14)
let's say a malicious person intentionally creates some input (an inappropriate video) to fool an ML system (YouTube's appropriateness filter). we call such an input an *adversarial example* (Goodfellow et al. 2017, Gilmer et al. 2018). (3/14)
in the research community, the first big adversarial examples paper (Szegedy et al. 2014) focused on small, imperceptible changes to images that completely fooled classifiers (ML systems). (4/14)
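to make "imperceptible perturbation" concrete, here's a minimal sketch of the fast gradient sign method (FGSM), a classic one-step attack from Goodfellow et al.'s follow-up work (not Szegedy et al.'s original method, which used a slower optimization). treat it as a sketch: `model`, `image`, and `label` are placeholder names, and 0.03 is just a commonly used perturbation budget.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.03):
    """Craft an imperceptibly perturbed copy of x that (often) fools `model`.

    x: image tensor, shape (1, C, H, W), values in [0, 1]
    y: true label tensor, shape (1,)
    eps: max per-pixel change; small enough that humans can't see it
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # step each pixel by +/- eps in the direction that increases the loss most
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0, 1).detach()

# usage sketch (assumes `model`, `image`, `label` already exist):
# adv = fgsm_attack(model, image, label)
# print(model(image).argmax(1), model(adv).argmax(1))  # these often differ!
```

the punchline: the attacker never needs to change the image in a way a human would notice — following the gradient of the model's own loss is enough.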
since the story of machine learning security started with imperceptible perturbations, the research community jumped on this train! many researchers started narrowing the definition of an adversarial example to exactly this kind of small, imperceptibly changed input. uh oh... (5/14)
naturally, much of the initial literature on adversarial examples focused on this narrow definition. but malicious actors don't care about a specific kind of adversarial example. in practice, many ML systems are broken by simple things like image rotations or stickers. (6/14)
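to show how low-tech "breaking an ML system" can get, here's a sketch that just rotates an image a few degrees and checks whether the model's prediction flips. no gradients, no math, and it still works surprisingly often. again, `model`, `image`, and `label` are assumed placeholders.

```python
import torchvision.transforms.functional as TF

def rotation_breaks_model(model, x, y, angles=range(-30, 31, 5)):
    """Return the rotation angles (degrees) at which `model` misclassifies x.

    x: image tensor, shape (1, C, H, W); y: true label tensor, shape (1,)
    """
    bad_angles = []
    for angle in angles:
        rotated = TF.rotate(x, angle)          # a "natural" perturbation
        pred = model(rotated).argmax(dim=1)
        if pred.item() != y.item():
            bad_angles.append(angle)
    return bad_angles

# usage sketch: print(rotation_breaks_model(model, image, label))
```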
after a multi-year battle between adversarial attacks and defenses, we saw a "showdown": Athalye et al. 2018 took the best defenses accepted at a top AI conference (ICLR 2018) and found adversarial examples that broke nearly all of them. oh snap! what do we do now? (7/14)
one line of thinking: are humans, the best "machines" we have right now, vulnerable to adversarial examples? yes! the attack can be as simple as knocking down a stop sign so drivers run through the intersection, or as complicated as adding carefully crafted perturbations to images (Elsayed et al. 2018). (8/14)
what does it mean for an ML system to be "good enough"? Ilyas et al. 2019 argue that adversarial examples aren't "bugs." they say inputs (ex: a cat image) have robust features (ears, eyes, etc.) and non-robust features (other pixels and patterns we can't consciously see). (9/14)
if someone messes with a robust feature (ex: scrambles the cat's face), we humans get confused, because robust features are what we rely on. current ML systems rely heavily on non-robust features too, which is why "imperceptible perturbations" make such successful adversarial examples. (10/14)
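for the curious, here's roughly how the paper formalizes "robust" vs "non-robust" features. this is my paraphrase of Ilyas et al. 2019 with simplified notation, so treat it as a sketch rather than the exact statement:

```latex
% a "feature" is any function f(x) -> real number; labels y are +1 or -1.

% rho-useful feature: correlates with the label on average
\mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\, y \cdot f(x) \,\big] \;\ge\; \rho

% gamma-robustly useful feature: still correlates even under the
% worst-case allowed perturbation delta (e.g. an imperceptible change)
\mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\, \inf_{\delta \in \Delta(x)} y \cdot f(x+\delta) \,\Big] \;\ge\; \gamma
```

the "non-robust" features are the ones that satisfy the first condition but not the second for any positive gamma: patterns that genuinely help accuracy on clean data, but that an imperceptible perturbation can flip.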
another line of thinking: how can we measure the vulnerability of an ML system? is it possible to *prove* that ML systems are robust to certain types of adversarial examples (ex: "imperceptibly changed" inputs)? researchers are beginning to work on this, under names like certified or provable robustness. (11/14)
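on the "measuring vulnerability" question, the simplest number people report today is *robust accuracy*: accuracy on the test set after each input has been attacked within some budget eps. here's a minimal sketch, reusing the hypothetical `fgsm_attack` from earlier; a serious evaluation would use a stronger attack (e.g. PGD) and more careful bookkeeping.

```python
def robust_accuracy(model, dataset, eps=0.03):
    """Fraction of (image, label) pairs the model still classifies correctly
    after an eps-bounded FGSM perturbation. Lower = more vulnerable."""
    correct, total = 0, 0
    for x, y in dataset:                    # x: (1, C, H, W), y: (1,)
        x_adv = fgsm_attack(model, x, y, eps=eps)   # defined in the sketch above
        correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

# usage sketch:
# print(f"robust acc @ eps=0.03: {robust_accuracy(model, test_set):.1%}")
```

proving robustness (rather than just measuring it against one attack) is much harder, which is why certified defenses are still an open research area.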
as ML systems (ex: facial recognition) have just recently begun to spread, we're going to see types of adversarial examples we didn't anticipate. we *need* to all be on the same page about what adversarial examples are so we don't miss them. (12/14)
there's no doubt that the growth of ML use will bring up new types of problems. every new technology does this (ex: smartphones, the Internet, social media). for many understandable reasons (which i won't get into), people are afraid of the consequences of ML systems. (13/14)
humans are great at finding problems to solve. i don't think "staying away from problems" is a good reason to stay away from developing AI.

yes, there is high risk involved. but high rewards come from high risk. i choose to be an optimist. i hope you do, too. (14/14)
does this stuff interest you? some good papers (in my opinion):

* Motivating the Rules of the Game for Adversarial Example Research: arxiv.org/abs/1807.06732 (Gilmer et al. 2018)
* Adversarial Examples Are Not Bugs, They Are Features: arxiv.org/abs/1905.02175 (Ilyas et al. 2019)
some great blog posts (in my opinion):

* Unsolved research problems vs. real-world threat models: medium.com/@catherio/unso… (@catherineols)
* Is attacking machine learning easier than defending it?: cleverhans.io/security/priva… (@goodfellow_ian and @NicolasPapernot)