On the existence of consistent adversarial attacks in high-dimensional linear classification

Bruno Loureiro; Lenka Zdeborov\'a; Matteo Vilucchio

arxiv: 2506.12454 · v1 · pith:F35SHTT6new · submitted 2025-06-14 · 📊 stat.ML · cond-mat.dis-nn · cs.CR · cs.LG

classification 📊 stat.ML · cond-mat.dis-nn · cs.CR · cs.LG

keywords adversarial · attacks · model · models · vulnerability · classification · consistent · data

Abstract

What fundamentally distinguishes an adversarial attack from a misclassification due to limited model expressivity or finite data? In this work, we investigate this question in the setting of high-dimensional binary classification, where statistical effects due to limited data availability play a central role. We introduce new error metrics that precisely capture this distinction, quantifying model vulnerability to consistent adversarial attacks -- perturbations that preserve the ground-truth labels. Our main technical contribution is an exact and rigorous asymptotic characterization of these metrics in both well-specified models and latent space models, revealing different vulnerability patterns compared to standard robust error measures. The theoretical results demonstrate that as models become more overparameterized, their vulnerability to label-preserving perturbations grows, offering theoretical insight into the mechanisms underlying model sensitivity to adversarial attacks.
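
To make the notion of a label-preserving perturbation concrete, here is a minimal sketch. It is not the paper's metric or method. It assumes Gaussian inputs, a noiseless linear teacher `w_star` that defines the ground truth, and a student `w_hat` fitted by a simple averaging rule; all names, sizes, and the estimator are illustrative assumptions. Each input is moved along the part of the student direction that is orthogonal to the teacher, so the teacher's score, and hence the ground-truth label, is unchanged by construction.

```python
import numpy as np

def label_preserving_attack(X, w_star, w_hat, eps):
    """Shift each row of X by an L2 step of size eps that lowers the
    student's margin while leaving the teacher's score unchanged."""
    y = np.sign(X @ w_star)  # ground-truth labels from the teacher
    # Component of the student direction orthogonal to the teacher.
    w_perp = w_hat - (w_hat @ w_star) / (w_star @ w_star) * w_star
    u = w_perp / np.linalg.norm(w_perp)
    delta = -eps * y[:, None] * u[None, :]  # each row has L2 norm eps
    return y, X + delta

rng = np.random.default_rng(0)
d, n, eps = 200, 400, 1.0  # hypothetical dimension, sample size, budget

w_star = rng.standard_normal(d)
X_train = rng.standard_normal((n, d))
y_train = np.sign(X_train @ w_star)
w_hat = X_train.T @ y_train / n  # simple averaging estimator (assumption)

X_test = rng.standard_normal((2000, d))
y_test, X_adv = label_preserving_attack(X_test, w_star, w_hat, eps)

# Student error before and after the perturbation.
clean_error = np.mean(np.sign(X_test @ w_hat) != y_test)
attacked_error = np.mean(np.sign(X_adv @ w_hat) != y_test)
# The teacher assigns the same label to every perturbed point.
labels_preserved = bool(np.all(np.sign(X_adv @ w_star) == y_test))

print(clean_error, attacked_error, labels_preserved)
```

Comparing `attacked_error` with `clean_error` shows how many extra mistakes come from perturbations that leave the true label intact. Because only one particular direction is tried, this gives a lower bound on vulnerability under these assumptions rather than the exact quantity studied in the paper.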

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Explaining Machine Learning and Memorization with Statistical Mechanics
cs.LG · 2026-06 · unverdicted · novelty 3.0

Thesis uses statistical mechanics to study DAM and RBM models for understanding memorization, low-dimensional learning, and adversarial robustness in neural networks.