
arxiv: 1902.00236 · v1 · pith:6GNNCIM5 · submitted 2019-02-01 · cs.LG · cs.CV · stat.ML

Natural and Adversarial Error Detection using Invariance to Image Transformations

classification: cs.LG, cs.CV, stat.ML
keywords: adversarial, approach, errors, image, natural, transformations, attacks, classifications

Abstract

We propose an approach to distinguish between correct and incorrect image classifications. Our approach can detect misclassifications that occur either unintentionally ("natural errors") or through intentional adversarial attacks ("adversarial errors"), both in a single unified framework. Our approach is based on the observation that correctly classified images tend to exhibit robust and consistent classifications under certain image transformations (e.g., horizontal flip or small image translation). In contrast, incorrectly classified images (whether due to adversarial errors or natural errors) tend to exhibit large variations in classification results under such transformations. Our approach does not require any modification or retraining of the classifier, and hence can be applied to any pre-trained classifier. We further use state-of-the-art targeted adversarial attacks to demonstrate that even when the adversary has full knowledge of our method, the adversarial distortion needed to bypass our detector is no longer imperceptible to the human eye. Our approach obtains state-of-the-art results compared to previous adversarial detection methods, surpassing them by a large margin.
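
The observation in the abstract (correct classifications tend to stay stable under mild transformations, incorrect ones tend to vary) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how the transformed outputs are combined, so the sketch uses a simple agreement fraction with a fixed threshold, and the names `agreement_score`, `flag_as_suspect`, `translate`, and the `threshold` value are hypothetical. It should not be read as the paper's detector.

```python
import numpy as np


def horizontal_flip(img):
    """Mirror a 2-D image array left to right."""
    return img[:, ::-1]


def translate(img, dx):
    """Shift columns by dx pixels. Wrap-around via np.roll is an
    illustrative assumption; the abstract does not specify padding."""
    return np.roll(img, dx, axis=1)


def agreement_score(classify, img, transforms):
    """Fraction of transformed copies whose predicted label matches the
    prediction on the untransformed image (hypothetical scoring rule)."""
    base = classify(img)
    preds = [classify(t(img)) for t in transforms]
    return sum(int(p == base) for p in preds) / len(preds)


def flag_as_suspect(classify, img, transforms, threshold=0.75):
    """Flag a prediction as a possible error when agreement is low.
    The threshold is an arbitrary placeholder, not a value from the paper."""
    return agreement_score(classify, img, transforms) < threshold
```

Here `classify` stands for any pre-trained classifier wrapped as a function from an image array to a label, which matches the abstract's point that the classifier itself is left unmodified. A usage would pass a list such as `[horizontal_flip, lambda x: translate(x, 1), lambda x: translate(x, -1)]` as `transforms`.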
