
arxiv: 1909.11786 · v1 · pith:ZNEFWMYSnew · submitted 2019-09-25 · 📊 stat.ML · cs.LG

Probabilistic Modeling of Deep Features for Out-of-Distribution and Adversarial Detection

classification 📊 stat.ML cs.LG
keywords adversarial · features · deep · samples · approach · detecting · distributions · modeling

We present a principled approach for detecting out-of-distribution (OOD) and adversarial samples in deep neural networks. Once training is completed, our approach models the outputs of the various layers (deep features) with parametric probability distributions. At inference, the likelihoods of the deep features with respect to the previously learnt distributions are calculated and used to derive uncertainty estimates that can discriminate in-distribution samples from OOD samples. We explore two classes of multivariate distributions for modeling the deep features, Gaussian and Gaussian mixture, and study the trade-off between accuracy and computational complexity. We demonstrate the benefits of our approach on image features by detecting OOD images and adversarially generated images, using popular DNN architectures on the MNIST and CIFAR10 datasets. We show that more precise modeling of the feature distributions results in significantly improved detection of OOD and adversarial samples: up to 12 percentage points in AUPR and AUROC metrics. We further show that our approach remains extremely effective when applied to video data and the associated spatio-temporal features, by detecting adversarial samples on activity classification tasks using the UCF101 dataset and the C3D network. To our knowledge, our methodology is the first reported to reliably detect white-box adversarial framing, a state-of-the-art adversarial attack for video classifiers.
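
The fit-then-score idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes features from a single layer are already extracted into an array, fits one Gaussian per class (a class-conditional choice the abstract does not state), and uses the highest class log-likelihood as the score. The function names, the `reg` ridge term, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_class_gaussians(features, labels, reg=1e-6):
    """Fit one multivariate Gaussian per class (assumed design, see lead-in)."""
    dim = features.shape[1]
    params = {}
    for c in np.unique(labels):
        x = features[labels == c]
        mean = x.mean(axis=0)
        # Small ridge term (hypothetical) keeps the covariance invertible.
        cov = np.cov(x, rowvar=False) + reg * np.eye(dim)
        params[int(c)] = (mean, cov)
    return params


def gaussian_log_likelihood(x, mean, cov):
    """Log-density of each row of x under N(mean, cov)."""
    dim = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    solved = np.linalg.solve(cov, diff.T).T
    maha = np.sum(diff * solved, axis=1)  # squared Mahalanobis distance
    return -0.5 * (maha + logdet + dim * np.log(2.0 * np.pi))


def ood_score(x, params):
    """Highest class log-likelihood per sample; lower suggests OOD."""
    ll = np.stack(
        [gaussian_log_likelihood(x, m, c) for m, c in params.values()], axis=1
    )
    return ll.max(axis=1)


# Synthetic stand-in for deep features: two well-separated classes.
rng = np.random.default_rng(0)
train_x = np.concatenate(
    [rng.normal(0.0, 1.0, (200, 4)), rng.normal(5.0, 1.0, (200, 4))]
)
train_y = np.repeat([0, 1], 200)
params = fit_class_gaussians(train_x, train_y)

in_scores = ood_score(rng.normal(0.0, 1.0, (50, 4)), params)
out_scores = ood_score(rng.normal(20.0, 1.0, (50, 4)), params)
```

In this toy setup the samples drawn far from both classes receive much lower scores than the in-distribution samples. Replacing each Gaussian with a Gaussian mixture would follow the same fit-then-score pattern at a higher computational cost.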

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. From Local Geometry to Global Pseudo Labeling for Robust Positive Unlabeled Learning under Covariate Shift

    cs.CV 2026-05 unverdicted novelty 6.0

    SPUNA leverages spectral neighborhood annotation on visual feature manifolds to enable robust PU learning for covariate shift detection, matching fully supervised performance.

  2. Anatomy of a failure: When, how, and why deep vision fails in scientific domains

    cs.CV 2026-05 unverdicted novelty 6.0

    Deep learning on information-rich scientific images collapses to one-dimensional predictions due to a mismatch between data priors and the model's simplicity bias, even after robustification techniques.