arxiv: 1906.00555 · v2 · pith:LHMX57DTnew · submitted 2019-06-03 · 💻 cs.LG · stat.ML

Adversarially Robust Generalization Just Requires More Unlabeled Data

classification: cs.LG, stat.ML
keywords: data, generalization, robust, unlabeled, adversarially, part, adversarial, stability

Neural network robustness has recently been highlighted by the existence of adversarial examples. Many previous works show that learned networks do not perform well on perturbed test data, and that significantly more labeled data is required to achieve adversarially robust generalization. In this paper, we theoretically and empirically show that with just more unlabeled data, we can learn a model with better adversarially robust generalization. The key insight of our results is a risk decomposition theorem, in which the expected robust risk is separated into two parts: the stability part, which measures the prediction stability in the presence of perturbations, and the accuracy part, which evaluates the standard classification accuracy. As the stability part does not depend on any label information, we can optimize this part using unlabeled data. We further prove that for a specific Gaussian mixture problem, adversarially robust generalization can be almost as easy as standard generalization in supervised learning if a sufficiently large amount of unlabeled data is provided. Inspired by the theoretical findings, we further show that a practical adversarial training algorithm that leverages unlabeled data can improve adversarially robust generalization on MNIST and CIFAR-10.
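
The decomposition described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm or its exact theorem statement; it assumes a fixed linear classifier sign(w·x + b), ℓ∞ perturbations of radius `eps`, 0-1 losses, and a synthetic two-component Gaussian mixture. All function names and parameter values are hypothetical. For a linear classifier, an ℓ∞ perturbation can shift the margin by at most `eps * ||w||_1`, so each risk has a closed form, and the stability term can be evaluated without any labels.

```python
import random


def margin(w, b, x):
    """Signed score w.x + b of a linear classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def attack_budget(w, eps):
    """Largest margin shift an l_inf perturbation of radius eps can cause."""
    return eps * sum(abs(wi) for wi in w)


def standard_risk(w, b, xs, ys):
    """Accuracy part: fraction of clean points that are misclassified."""
    return sum(y * margin(w, b, x) <= 0 for x, y in zip(xs, ys)) / len(xs)


def stability_risk(w, b, xs, eps):
    """Stability part: fraction of points whose prediction can be flipped.

    Note that no labels are used here, only the inputs xs.
    """
    budget = attack_budget(w, eps)
    return sum(abs(margin(w, b, x)) <= budget for x in xs) / len(xs)


def robust_risk(w, b, xs, ys, eps):
    """Fraction of points misclassified under the worst-case perturbation."""
    budget = attack_budget(w, eps)
    return sum(y * margin(w, b, x) <= budget for x, y in zip(xs, ys)) / len(xs)


def sample_mixture(n, dim, rng, mu=1.0, sigma=1.0):
    """Draw y uniformly from {-1, +1} and x ~ N(y * mu * 1, sigma^2 I)."""
    xs, ys = [], []
    for _ in range(n):
        y = rng.choice((-1, 1))
        xs.append([rng.gauss(y * mu, sigma) for _ in range(dim)])
        ys.append(y)
    return xs, ys


rng = random.Random(0)
dim, eps = 5, 0.3                  # illustrative values, not from the paper
w, b = [1.0] * dim, 0.0            # a fixed classifier, not a trained one

xs, ys = sample_mixture(200, dim, rng)          # small labeled set
unlabeled, _ = sample_mixture(5000, dim, rng)   # labels are discarded

std = standard_risk(w, b, xs, ys)
stab = stability_risk(w, b, xs, eps)
rob = robust_risk(w, b, xs, ys, eps)
stab_unlabeled = stability_risk(w, b, unlabeled, eps)

print(f"standard={std:.3f} stability={stab:.3f} robust={rob:.3f}")
print(f"stability on the unlabeled pool={stab_unlabeled:.3f}")
```

On any single sample, the robust error in this toy setting is bounded above by the sum of the standard and stability terms, because a point that is robustly misclassified is either misclassified already or lies within the attack budget of the decision boundary. The second stability estimate uses only the larger unlabeled pool, which is the property the abstract relies on.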

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Homogenization of $\ell_2$-Adversarial Training in High-Dimensions: Exact Dynamics under Stochastic Gradient Descent

    math.OC 2026-06 unverdicted novelty 7.0

    Derives ODE deterministic equivalents and an adversarial homogenized SDE for SGD iterates in high-dim ℓ2-adversarial training, showing no constant learning rate ensures monotone descent for single-class adversarial le...

  2. Robust Alignment: Harmonizing Clean Accuracy and Adversarial Robustness in Adversarial Training

    cs.CV 2026-04 unverdicted novelty 5.0

    RAAT harmonizes clean accuracy and adversarial robustness by using fixed reduced perturbations for boundary samples and Domain Interpolation Consistency Adversarial Regularization to align input and latent spaces.