Theoretically Principled Trade-off between Robustness and Accuracy

https://arxiv · 2019 · cs.LG · arXiv 1901.08573

6 Pith papers cite this work. Polarity classification is still indexing.

abstract

We identify a trade-off between robustness and accuracy that serves as a guiding principle in the design of defenses against adversarial examples. Although this problem has been widely studied empirically, much remains unknown concerning the theory underlying this trade-off. In this work, we decompose the prediction error for adversarial examples (robust error) as the sum of the natural (classification) error and boundary error, and provide a differentiable upper bound using the theory of classification-calibrated loss, which is shown to be the tightest possible upper bound uniform over all probability distributions and measurable predictors. Inspired by our theoretical analysis, we also design a new defense method, TRADES, to trade adversarial robustness off against accuracy. Our proposed algorithm performs well experimentally on real-world datasets. The methodology is the foundation of our entry to the NeurIPS 2018 Adversarial Vision Challenge, in which we won 1st place out of ~2,000 submissions, surpassing the runner-up approach by $11.41\%$ in terms of mean $\ell_2$ perturbation distance.
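
The decomposition stated in the abstract (robust error = natural error + boundary error) can be checked numerically in a toy case. The sketch below is not taken from the paper's code. It assumes a linear classifier sign(w·x + b), labels in {-1, +1}, and $\ell_2$ perturbations of radius `eps`; under those assumptions a point can be pushed onto or across the decision boundary exactly when its signed margin is at most `eps * ||w||`. The function name `error_decomposition` and the sample points are hypothetical.

```python
import math

def error_decomposition(points, labels, w, b, eps):
    """Return (natural, boundary, robust) error rates of sign(w.x + b)
    under l2 perturbations of radius eps. Labels are in {-1, +1}."""
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    nat = bdy = rob = 0
    for x, y in zip(points, labels):
        # Signed margin: positive when the point is classified correctly.
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        misclassified = margin <= 0
        # Linear case only: an eps-bounded perturbation reaches the
        # decision boundary iff the margin is at most eps * ||w||.
        attackable = margin <= eps * w_norm
        nat += misclassified                     # natural error
        bdy += attackable and not misclassified  # correct but near the boundary
        rob += attackable                        # robust error
    n = len(points)
    return nat / n, bdy / n, rob / n

# Hypothetical 2-D sample: one safe positive, one positive near the
# boundary, one misclassified positive, one safe negative.
points = [(2.0, 0.0), (0.5, 0.0), (-0.2, 0.0), (-3.0, 0.0)]
labels = [1, 1, 1, -1]
nat, bdy, rob = error_decomposition(points, labels, w=(1.0, 0.0), b=0.0, eps=1.0)
print(nat, bdy, rob)
```

On this sample the three rates satisfy `nat + bdy == rob`, which is the identity the abstract describes. For nonlinear predictors the inner "attackable" test has no closed form, which is why the abstract speaks of a differentiable upper bound instead.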

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Landseer: Exploring the Machine Learning Defense Landscape

cs.CR · 2026-05-26 · unverdicted · novelty 6.0

Landseer offers a containerized modular system to integrate and evaluate combinations of machine learning defenses, with an initial analysis of 35 defenses highlighting replicability challenges.

A combination of noise and bilateral filters achieve supralinear and scalable adversarial robustness in CNNs

cs.LG · 2026-06-01 · unverdicted · novelty 5.0

A preprocessor of Gaussian noise plus bilateral filtering yields supralinear adversarial robustness in CNNs and, when paired with adversarial training, ranks near the top of RobustBench while using far less compute, parameters, epochs, and data than prior defenses.

SORA: Free Second-Order Attacks in Fast Adversarial Training

cs.LG · 2026-05-30 · unverdicted · novelty 5.0

SORA is an adaptive step-size adversarial training algorithm that formalizes epsilon overfitting, introduces the PertAlign metric to predict catastrophic overfitting, and dynamically adjusts perturbations to achieve state-of-the-art robustness and clean accuracy with fixed hyperparameters.

Graph Interpolating Activation Improves Both Natural and Robust Accuracies in Data-Efficient Deep Learning

cs.LG · 2019-07-16 · unverdicted · novelty 5.0

Graph Laplacian interpolating activation replaces softmax in DNNs and improves natural accuracy, robust accuracy, and data efficiency.

Laundering AI Authority with Adversarial Examples

cs.CR · 2026-05-05 · unverdicted · novelty 5.0

Adversarial examples enable AI authority laundering by causing production VLMs to give authoritative but wrong responses on subtly perturbed images, with success rates of 22-100% using decade-old attack methods.

Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing

cs.CR · 2026-04-22 · unverdicted · novelty 5.0

Auto-ART delivers the first structured synthesis of adversarial robustness consensus plus an executable multi-norm testing framework that flags gradient masking in 92% of cases on RobustBench and reveals a 23.5 pp robustness gap.

citing papers explorer

Showing 5 of 5 citing papers after filters.

Landseer: Exploring the Machine Learning Defense Landscape

cs.CR · 2026-05-26 · unverdicted · none · ref 121 · internal anchor

A combination of noise and bilateral filters achieve supralinear and scalable adversarial robustness in CNNs

cs.LG · 2026-06-01 · unverdicted · none · ref 57 · internal anchor

SORA: Free Second-Order Attacks in Fast Adversarial Training

cs.LG · 2026-05-30 · unverdicted · none · ref 9 · internal anchor

Laundering AI Authority with Adversarial Examples

cs.CR · 2026-05-05 · unverdicted · none · ref 71

Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing

cs.CR · 2026-04-22 · unverdicted · none · ref 7
