Suffering from overconfident softmax scores? Time to use energy scores!

Excited to release our NeurIPS paper on "Energy-based Out-of-distribution Detection", a theoretically motivated framework for OOD detection. 1/n

Paper: arxiv.org/abs/2010.03759 (w/ code included)
(2/) Joint work w/ Weitang Liu, Xiaoyun Wang, and John Owens. We show that energy is desirable for OOD detection since it is provably aligned with the probability density of the input—samples with higher energies can be interpreted as data with a lower likelihood of occurrence.
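The alignment claimed in (2/) can be written out in a few lines. This is a sketch in notation assumed here, not given in the thread: f_i(x) is the classifier's logit for class i, K is the number of classes, T is a temperature, and Z is the normalizing integral, which does not depend on x.

```latex
E(x; f) = -T \log \sum_{i=1}^{K} e^{f_i(x)/T},
\qquad
p(x) = \frac{e^{-E(x; f)/T}}{Z},
\quad Z = \int_{x} e^{-E(x; f)/T}\,dx
\;\;\Longrightarrow\;\;
\log p(x) = -\frac{E(x; f)}{T} - \log Z
```

Because log Z is a constant, log p(x) decreases linearly as E(x; f) increases, which is the sense in which higher energy means lower likelihood.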
(3/) In contrast, we show mathematically that the softmax confidence score is a biased scoring function that is not aligned with the density of the inputs and hence is not suitable for OOD detection.
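A short derivation may help show where the bias comes from. The notation is an assumption for illustration: f_i(x) are the logits, f^max(x) is the largest logit, E(x; f) = -log Σ_i exp(f_i(x)) is the energy at temperature T = 1, and log p(x) = -E(x; f) - log Z as in an energy-based model.

```latex
\log \max_{y} p(y \mid x)
  = \log \frac{e^{f^{\max}(x)}}{\sum_{i} e^{f_i(x)}}
  = f^{\max}(x) - \log \sum_{i} e^{f_i(x)}
  = E(x; f) + f^{\max}(x)
  = -\log p(x) + f^{\max}(x) - \log Z
```

The log softmax confidence is therefore the energy shifted by the maximum logit. Since f^max(x) varies with the input, the two terms can move in opposite directions, so the softmax score does not track log p(x).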
(4/) Importantly, the energy score can be derived from a purely discriminative classification model without explicitly relying on a density estimator, and therefore circumvents the difficult optimization process of training generative models such as JEM.
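As a minimal sketch of what "derived from a discriminative model" could look like in practice, the function below computes the energy from one input's logits alone, using a numerically stable log-sum-exp. The function name, the `temperature` argument, and the example logits are illustrative assumptions, not taken from the released code.

```python
import math

def energy_score(logits, temperature=1.0):
    """Energy E(x) = -T * log(sum_i exp(f_i(x) / T)) for one input's logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating for stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp

# Hypothetical logits: one strongly peaked vector, one nearly flat vector.
confident_logits = [10.0, 0.0, 0.0]
flat_logits = [0.1, 0.0, -0.1]

print(energy_score(confident_logits))  # lower energy
print(energy_score(flat_logits))       # higher energy
```

No density model is involved: the only input is the logit vector a standard classifier already produces. In this toy case the peaked logits get the lower energy, so thresholding on energy would treat that input as more in-distribution.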
(5/) Within our framework, we demonstrate that energy can be flexibly used as a scoring function for any pre-trained neural classifier as well as a trainable cost function to shape the energy surface explicitly for OOD detection.
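One way a trainable energy term could be sketched is as a squared hinge penalty: in-distribution energies are pushed below a margin and auxiliary outlier energies are pushed above another. The function name, margin names, and all numeric values below are assumptions for illustration; the paper has the exact objective and how it is weighted against the cross-entropy loss.

```python
def energy_margin_penalty(energies_in, energies_out, margin_in, margin_out):
    """Squared hinge penalty on energies (illustrative sketch).

    Penalizes in-distribution energies above margin_in and
    outlier energies below margin_out; zero when both margins hold.
    """
    in_term = sum(max(0.0, e - margin_in) ** 2 for e in energies_in) / len(energies_in)
    out_term = sum(max(0.0, margin_out - e) ** 2 for e in energies_out) / len(energies_out)
    return in_term + out_term

# Hypothetical energies and margins.
print(energy_margin_penalty([-25.0], [-3.0], margin_in=-23.0, margin_out=-5.0))  # margins satisfied
print(energy_margin_penalty([-20.0], [-7.0], margin_in=-23.0, margin_out=-5.0))  # both margins violated
```

In a training loop this penalty would typically be added to the usual classification loss with a weighting coefficient, which is also a hyperparameter in this sketch.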
(6/) Result highlights: on WideResNet, the energy score reduces the average FPR95 (false positive rate at 95% true positive rate) by 18.03% on CIFAR-10 compared to the softmax confidence score. With energy-based training, our method outperforms existing SoTA.
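For readers unfamiliar with the metric, here is a small sketch of how FPR95 can be computed from two lists of scores. It assumes a convention where higher scores mean "more in-distribution" (for example, negative energy). The function name and toy scores are hypothetical and carry no experimental meaning.

```python
import math

def fpr_at_95_tpr(scores_in, scores_out):
    """Fraction of OOD samples accepted when 95% of in-distribution samples are accepted.

    Scores follow the convention: higher means more in-distribution.
    """
    ranked = sorted(scores_in, reverse=True)
    # Number of in-distribution samples that must fall at or above the threshold.
    k = math.ceil(95 * len(ranked) / 100)
    threshold = ranked[k - 1]
    accepted_ood = sum(1 for s in scores_out if s >= threshold)
    return accepted_ood / len(scores_out)

# Toy scores, for illustration only.
toy_in = [float(i) for i in range(1, 21)]  # 1.0 .. 20.0
toy_out = [0.5, 1.5, 2.5, 3.5]
print(fpr_at_95_tpr(toy_in, toy_out))
```

Lower is better: a value of 0 would mean no OOD sample passes the threshold that keeps 95% of in-distribution data.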
(7/) Previous approaches such as ODIN and Mahalanobis may have hyperparameters to be tuned. In contrast, the energy score is a parameter-free measure, which is easy to use and implement, and in many cases, achieves comparable or even better performance.
(8/) More broadly, our work builds on the insights and principles from energy-based models by @ylecun et al. We were also inspired by the early work on energy-based GAN by Jake Zhao, and JEM by Grathwohl et al.
(9/) Happy to get feedback if you have more detailed comments or want to engage with us!
