Theoretically Principled Trade-off between Robustness and Accuracy

https://arxiv · 2019 · cs.LG · arXiv 1901.08573

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

open full Pith review browse 3 citing papers arXiv PDF

abstract

We identify a trade-off between robustness and accuracy that serves as a guiding principle in the design of defenses against adversarial examples. Although this problem has been widely studied empirically, much remains unknown concerning the theory underlying this trade-off. In this work, we decompose the prediction error for adversarial examples (robust error) as the sum of the natural (classification) error and boundary error, and provide a differentiable upper bound using the theory of classification-calibrated loss, which is shown to be the tightest possible upper bound uniform over all probability distributions and measurable predictors. Inspired by our theoretical analysis, we also design a new defense method, TRADES, to trade adversarial robustness off against accuracy. Our proposed algorithm performs well experimentally in real-world datasets. The methodology is the foundation of our entry to the NeurIPS 2018 Adversarial Vision Challenge in which we won the 1st place out of ~2,000 submissions, surpassing the runner-up approach by $11.41\%$ in terms of mean $\ell_2$ perturbation distance.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Graph Interpolating Activation Improves Both Natural and Robust Accuracies in Data-Efficient Deep Learning

cs.LG · 2019-07-16 · unverdicted · novelty 5.0

Graph Laplacian interpolating activation replaces softmax in DNNs and improves natural accuracy, robust accuracy, and data efficiency.

Laundering AI Authority with Adversarial Examples

cs.CR · 2026-05-05 · unverdicted · novelty 5.0

Adversarial examples enable AI authority laundering by causing production VLMs to give authoritative but wrong responses on subtly perturbed images, with success rates of 22-100% using decade-old attack methods.

Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing

cs.CR · 2026-04-22 · unverdicted · novelty 5.0

Auto-ART delivers the first structured synthesis of adversarial robustness consensus plus an executable multi-norm testing framework that flags gradient masking in 92% of cases on RobustBench and reveals a 23.5 pp robustness gap.

citing papers explorer

Showing 3 of 3 citing papers.

Graph Interpolating Activation Improves Both Natural and Robust Accuracies in Data-Efficient Deep Learning cs.LG · 2019-07-16 · unverdicted · none · ref 21 · internal anchor
Graph Laplacian interpolating activation replaces softmax in DNNs and improves natural accuracy, robust accuracy, and data efficiency.
Laundering AI Authority with Adversarial Examples cs.CR · 2026-05-05 · unverdicted · none · ref 71
Adversarial examples enable AI authority laundering by causing production VLMs to give authoritative but wrong responses on subtly perturbed images, with success rates of 22-100% using decade-old attack methods.
Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing cs.CR · 2026-04-22 · unverdicted · none · ref 7
Auto-ART delivers the first structured synthesis of adversarial robustness consensus plus an executable multi-norm testing framework that flags gradient masking in 92% of cases on RobustBench and reveals a 23.5 pp robustness gap.

Theoretically Principled Trade-off between Robustness and Accuracy

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer