CleverHans blog post with @nickfrosst: we explain how the Deep k-Nearest Neighbors (DkNN) algorithm and the soft nearest-neighbor loss (SNNL) help recognize data that is not from the training distribution. The post includes an interactive figure (credit goes to Nick): cleverhans.io/security/2019/…
Models are deployed with little input validation, which boils down to expecting the classifier to correctly classify any input. This goes against one of the fundamental assumptions of ML: models should be presented at test time with inputs that fall on their training manifold.
If we deploy a model on inputs that may fall outside of this data manifold, we need mechanisms for figuring out whether a specific input/output pair is acceptable for a given ML model. In security, we sometimes refer to this as admission control (see arxiv.org/abs/1811.01134)
The DkNN breaks the black-box myth around deep learning. Patterns extracted by hidden layers on a test input are compared to those found during training to ensure that when a label is predicted, the patterns that led to this prediction can be found in the training data for this label.
This allows us to measure uncertainty in a way different from how neural nets typically compute class scores (we argue that the softmax is not ideal at test time). When patterns found in the training data agree with test-time patterns, the prediction has high credibility.
Adversarial examples typically turn a small change in the input domain into a large change in the model’s output space, and this change builds up gradually across layers. As a result, layers closer to the model’s input have representations that remain close to the input’s correct class, while layers towards the output have representations that are closer to the wrong class. Credibility helps distinguish legitimate data from outlier data by requiring a prediction to be supported consistently across the network’s layers.
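Roughly, the mechanism can be sketched as follows. This is a simplified illustration only (the variable and function names are made up, and the real DkNN additionally calibrates neighbor agreement with conformal prediction over a held-out set), but it conveys how per-layer nearest neighbors in the training data support, or fail to support, a prediction:

```python
# Simplified sketch of the DkNN idea: compare a test input's hidden
# representations to those of the training set at every layer, and measure
# how consistently the retrieved neighbors support the predicted label.
# Illustrative names only; the actual DkNN turns this agreement into a
# calibrated credibility score via conformal prediction.
import numpy as np
from sklearn.neighbors import NearestNeighbors


def dknn_agreement(test_reps, train_reps_per_layer, train_labels, k=75):
    """test_reps: list of per-layer representations (1-D arrays) for one input.
    train_reps_per_layer: list of (n_train, d_layer) arrays, one per layer.
    train_labels: (n_train,) integer array.
    Returns the predicted label and the fraction of neighbor votes (over all
    layers) supporting it -- a rough stand-in for credibility."""
    neighbor_labels = []
    for test_rep, train_reps in zip(test_reps, train_reps_per_layer):
        nn = NearestNeighbors(n_neighbors=k).fit(train_reps)
        _, idx = nn.kneighbors(test_rep.reshape(1, -1))
        neighbor_labels.append(train_labels[idx[0]])
    neighbor_labels = np.concatenate(neighbor_labels)
    votes = np.bincount(neighbor_labels)
    prediction = votes.argmax()
    agreement = votes[prediction] / len(neighbor_labels)
    return prediction, agreement
```

A legitimate input tends to retrieve neighbors with the same label at every layer, so the agreement is high; an adversarial or out-of-distribution input tends to retrieve conflicting labels across layers, so the agreement (and hence credibility) drops.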
We note that evaluations that do not take into account the credibility metric (as done by Sitawarin and Wagner, arxiv.org/abs/1903.08333, to be presented at #sp19's DLS workshop) are not sufficient to draw conclusions on the robustness of the DkNN.
Finally, with the SNNL, we modify the training objective of our neural net to improve the similarity structure of its hidden representations (which are then analyzed with the DkNN). This led us to a surprising observation:
encouraging hidden layers to entangle data (to bring points from different classes closer together) improved the similarity search performed by the DkNN more than encouraging representations to disentangle data, which would help achieve a large (SVM-like) margin between classes.

Figure: Logits of a normal model trained with cross-entropy (left) and of an entangled model trained with both cross-entropy and the soft nearest-neighbor loss (right). The entangled model better distinguishes outlier data (in blue) from legitimate data (in green) than the normal model.
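For readers who want the loss itself, here is a minimal numpy sketch of the soft nearest-neighbor loss over one mini-batch of hidden representations. Names and the fixed temperature are illustrative assumptions, not our released implementation:

```python
# Soft nearest-neighbor loss (sketch) for a batch of representations `x`
# (shape: batch x features) with integer labels `y`. For each point it scores
# the probability of sampling a same-class neighbor when neighbors are drawn
# with weights exp(-||x_i - x_j||^2 / T), then averages the negative log.
import numpy as np


def soft_nearest_neighbor_loss(x, y, temperature=100.0, eps=1e-12):
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    weights = np.exp(-sq_dists / temperature)
    np.fill_diagonal(weights, 0.0)                  # a point is not its own neighbor
    same_class = (y[:, None] == y[None, :]) & ~np.eye(len(y), dtype=bool)
    numerator = (weights * same_class).sum(axis=1)  # mass on same-class neighbors
    denominator = weights.sum(axis=1)               # mass on all neighbors
    return -np.mean(np.log(numerator / (denominator + eps) + eps))
```

Minimizing this quantity pulls same-class points together (disentangles classes); the surprising observation above is that training with the opposite sign, i.e. encouraging entanglement, is what helps the DkNN's similarity search.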