nick frosst @nickfrosst
1/7 Our new paper on adversarial attack detection and capsule networks with @sabour_sara and Geoff Hinton is out on arXiv today! arxiv.org/abs/1811.06969 It will be presented at the #NeurIPS Workshop on Security. Don't have time to read the paper? Read this thread instead! :)
2/7 The problem with adversarial examples is that they don't look like what they are classified as. Capsule networks output both a classification and a reconstruction of the input conditioned on the classification. The reconstruction of an adversarial example looks different from the input.
3/7 We can create a detection algorithm by defining a threshold for reconstruction error from a validation set, and flag inputs as adversarial if the reconstruction error exceeds this threshold.
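A minimal sketch of that thresholding step, in case it helps. It assumes the reconstruction error is the per-example L2 distance between input and reconstruction, and that the threshold is a percentile of the validation errors; the function names and the 95th-percentile default are hypothetical choices for illustration, not taken from the thread.

```python
import numpy as np


def reconstruction_error(inputs, reconstructions):
    """Per-example L2 distance between each input and its reconstruction.

    Assumption: L2 distance is used as the error measure.
    """
    flat_in = inputs.reshape(len(inputs), -1)
    flat_rec = reconstructions.reshape(len(reconstructions), -1)
    return np.linalg.norm(flat_in - flat_rec, axis=1)


def fit_threshold(validation_errors, percentile=95.0):
    """Pick a threshold from clean validation errors.

    Assumption: the percentile (and so the false-positive rate on
    clean data) is a free choice made by whoever deploys the detector.
    """
    return float(np.percentile(validation_errors, percentile))


def flag_adversarial(errors, threshold):
    """Boolean mask: True where the error exceeds the threshold."""
    return np.asarray(errors) > threshold
```

In use, you would compute `reconstruction_error` on held-out clean data, pass the result to `fit_threshold`, and then call `flag_adversarial` on the errors of incoming inputs. Nothing here depends on which attack produced the input.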
4/7 This is an attack-agnostic detection algorithm that works quite well for the three datasets we tested (MNIST, Fashion-MNIST and SVHN) and the attacks we tested (black-box and white-box FGSM and BIM attacks).
5/7 The reconstruction error is itself differentiable, so we can make a stronger attack by minimizing reconstruction error while maximizing classification error. This attack can trick our detection, but the results aren't really ‘adversarial’ - they resemble members of the target class.
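A rough sketch of what one step of such a combined attack could look like. Everything here is an assumption for illustration: the weighting `alpha`, the sign-of-gradient update, and the finite-difference gradient (a stand-in for backpropagation through a real network) are hypothetical, and the two loss callables are placeholders for the model's classification loss and reconstruction error.

```python
import numpy as np


def combined_objective(x, classification_loss, reconstruction_error, alpha=1.0):
    """Quantity the attacker wants to increase.

    High classification loss fools the classifier; low reconstruction
    error keeps the input under the detector's threshold.
    `alpha` (hypothetical) trades off the two terms.
    """
    return classification_loss(x) - alpha * reconstruction_error(x)


def numerical_grad(f, x, h=1e-5):
    """Central finite-difference gradient of scalar f at x.

    Stand-in for autodiff; only practical for tiny toy inputs.
    """
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step.flat[i] = h
        grad.flat[i] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad


def sign_ascent_step(x, objective, eps):
    """One sign-of-gradient ascent step, clipped to the [0, 1] pixel range."""
    return np.clip(x + eps * np.sign(numerical_grad(objective, x)), 0.0, 1.0)
```

Repeating `sign_ascent_step` for several small steps gives an iterative variant. With a real model, the gradient would come from the framework's autodiff rather than finite differences.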
6/7 This is true for MNIST and Fashion-MNIST
7/7 And SVHN. Read the paper to see our work in full :) arxiv.org/abs/1811.06969