Nearest Neighbor Projection Removal Adversarial Training

A. V. Subramanyam; Himanshu Singh; Mohan Kankanhalli; Shivank Rajput

arXiv: 2509.07673 · v4 · submitted 2025-09-09 · cs.CV · cs.LG


Pith reviewed 2026-05-18 17:48 UTC · model grok-4.3

classification: cs.CV, cs.LG

keywords: adversarial training, feature separability, inter-class proximity, Lipschitz constant, Rademacher complexity, adversarial robustness, image classification


The pith

Removing projections onto nearest inter-class neighbors in feature space during adversarial training reduces the Lipschitz constant and improves robustness.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Deep neural networks remain vulnerable to adversarial examples largely due to overlapping features across classes. The paper proposes identifying the nearest inter-class neighbor for each sample in feature space and subtracting the projection onto that neighbor from both adversarial and clean samples. This operation produces a logits correction that the authors prove lowers the Lipschitz constant of the network. A smaller Lipschitz constant directly reduces Rademacher complexity, which supports tighter generalization bounds. Experiments on CIFAR-10, CIFAR-100, SVHN, and TinyImagenet show the method matches or exceeds leading adversarial training approaches in both robust and clean accuracy.

Core claim

The paper presents Nearest Neighbor Projection Removal Adversarial Training, in which the nearest inter-class neighbor is located for each sample in feature space and its projection is removed to enforce stronger separability. The same correction is applied to clean samples. The authors demonstrate theoretically that this logits correction reduces the Lipschitz constant of the network, which lowers Rademacher complexity and thereby improves generalization as well as resistance to adversarial perturbations.

What carries the argument

Nearest neighbor projection removal in feature space, which subtracts the component of a sample's representation along the vector to its closest inter-class neighbor.
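The operation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance for the neighbor search, assumes the removed component is the projection onto the neighbor's feature vector (as the abstract's "removes projections onto these neighbors" suggests), and uses a hypothetical function name, `remove_nearest_interclass_projection`.

```python
import numpy as np

def remove_nearest_interclass_projection(features, labels, eps=1e-12):
    """Sketch: subtract from each feature vector its projection onto the
    nearest feature vector that carries a different label.

    Assumes at least two distinct labels are present in the batch.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = features[:, None, :] - features[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diffs, diffs)

    # Exclude same-class pairs (including each sample itself) from the search.
    same_class = labels[:, None] == labels[None, :]
    dist2 = np.where(same_class, np.inf, dist2)
    nn_idx = dist2.argmin(axis=1)
    neighbors = features[nn_idx]

    # Projection of z_i onto n_i is (<z_i, n_i> / ||n_i||^2) * n_i.
    dots = np.einsum("ij,ij->i", features, neighbors)
    norms2 = np.einsum("ij,ij->i", neighbors, neighbors) + eps
    corrected = features - (dots / norms2)[:, None] * neighbors
    return corrected, nn_idx

# Toy batch: three 2-D feature vectors, two classes.
feats = np.array([[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]])
labs = np.array([0, 1, 1])
corrected, nn_idx = remove_nearest_interclass_projection(feats, labs)
```

After the correction, each row of `corrected` is orthogonal (up to `eps`) to the neighbor it was paired with, which is one concrete reading of "removing inter-class dependencies". How the paper turns this feature-space step into a logits correction is not specified in the text above and is not modeled here.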

If this is right

Competitive or superior robust accuracy alongside improved clean accuracy on CIFAR-10, CIFAR-100, SVHN, and TinyImagenet.
Explicit reduction of inter-class feature overlap that lowers adversarial susceptibility.
Lower Rademacher complexity that yields improved generalization bounds.
A training procedure that directly targets inter-class proximity in addition to gradient-based adversarial objectives.
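The link between a Lipschitz constant and a generalization bound, which the third item relies on, can be made concrete with the textbook form of the Rademacher bound. The statement below is the standard result, not the paper's own theorem; the symbols $\mathcal{G}$, $\mathcal{F}$, $\mathfrak{R}_n$, $\phi$, $L_\phi$, and $\delta$ are notation introduced here for illustration.

```latex
% Textbook Rademacher generalization bound (notation assumed here).
% G: class of functions z -> [0,1] (e.g. bounded losses); z_1..z_n i.i.d.
% With probability at least 1 - \delta, simultaneously for every g in G:
\[
  \mathbb{E}[g(z)] \;\le\; \frac{1}{n}\sum_{i=1}^{n} g(z_i)
  \;+\; 2\,\mathfrak{R}_n(\mathcal{G})
  \;+\; \sqrt{\frac{\log(1/\delta)}{2n}} .
\]
% Contraction (Talagrand's lemma): if \phi is L_\phi-Lipschitz and
% \mathcal{F} is a class of real-valued functions, then
\[
  \mathfrak{R}_n(\phi \circ \mathcal{F}) \;\le\; L_\phi\,\mathfrak{R}_n(\mathcal{F}) .
\]
```

Read together, a smaller Lipschitz factor multiplying the complexity term shrinks the middle term of the first inequality. Both inequalities concern the clean risk; they say nothing by themselves about risk under perturbation.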

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The projection removal step could be inserted into other adversarial training pipelines without changing their loss functions.
Similar nearest-neighbor corrections might reduce overlap in learned representations for tasks beyond image classification.
The approach suggests that explicit geometric interventions in feature space can complement purely gradient-driven robustness methods.

Load-bearing premise

Removing the projection onto the nearest inter-class neighbor enforces stronger separability without discarding information necessary for correct classification.

What would settle it

Measuring whether robust accuracy on standard benchmarks falls below that of vanilla adversarial training when both are evaluated under the same strong attack such as multi-step PGD.
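A sketch of what "the same strong attack" means in practice: both models are scored by one shared multi-step PGD routine with identical budget, step size, and step count. The code below is a toy stand-in under loud assumptions: it attacks a linear softmax classifier (so the input gradient has a closed form), whereas the paper evaluates deep networks, and the names `pgd_linf` and `robust_accuracy` as well as all hyperparameter values are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def pgd_linf(x, y, W, b, epsilon, step_size, num_steps):
    """Multi-step L-infinity PGD against the linear model logits = x @ W + b."""
    onehot = np.eye(W.shape[1])[y]
    x_adv = x.copy()
    for _ in range(num_steps):
        probs = softmax(x_adv @ W + b)
        # Closed-form gradient of cross-entropy w.r.t. the input (linear model only).
        grad = (probs - onehot) @ W.T
        # Ascent step on the loss, then projection onto the epsilon-ball around x.
        x_adv = x_adv + step_size * np.sign(grad)
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
    return x_adv

def robust_accuracy(x, y, W, b, **pgd_kwargs):
    """Fraction of samples still classified correctly after the PGD attack."""
    x_adv = pgd_linf(x, y, W, b, **pgd_kwargs)
    preds = (x_adv @ W + b).argmax(axis=1)
    return float((preds == y).mean())

# Toy comparison harness: the same attack settings are reused for every model.
attack = dict(epsilon=0.1, step_size=0.05, num_steps=5)
x = np.array([[1.0, 0.0], [0.0, 1.0], [0.55, 0.45]])
y = np.array([0, 1, 0])
W, b = np.eye(2), np.zeros(2)
acc = robust_accuracy(x, y, W, b, **attack)
```

The design point is that `attack` is defined once and passed unchanged to every model under comparison; changing the budget or step count between models would make the robust-accuracy numbers incomparable.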

Figures

Figures reproduced from arXiv: 2509.07673 by A. V. Subramanyam, Himanshu Singh, Mohan Kankanhalli, Shivank Rajput.

**Figure 2.** Effect of projection-removal in the two-dimensional feature space. (a) Input space depicting the decision boundaries. The solid …

**Figure 4.** t-SNE visualization of CIFAR-100 on ResNet18 with …

**Figure 3.** Clean (circle) and robust (square) accuracy under differ…

Original abstract

Deep neural networks have exhibited impressive performance in image classification tasks but remain vulnerable to adversarial examples. Standard adversarial training enhances robustness but typically fails to explicitly address inter-class feature overlap, a significant contributor to adversarial susceptibility. In this work, we introduce a novel adversarial training framework that actively mitigates inter-class proximity by projecting out inter-class dependencies from adversarial and clean samples in the feature space. Specifically, our approach first identifies the nearest inter-class neighbors for each adversarial sample and subsequently removes projections onto these neighbors to enforce stronger feature separability. Theoretically, we demonstrate that our proposed logits correction reduces the Lipschitz constant of neural networks, thereby lowering the Rademacher complexity, which directly contributes to improved generalization and robustness. Extensive experiments across standard benchmarks including CIFAR-10, CIFAR-100, SVHN, and TinyImagenet show that our method demonstrates strong performance that is competitive with leading adversarial training techniques, highlighting significant achievements in both robust and clean accuracy. Our findings reveal the importance of addressing inter-class feature proximity explicitly to bolster adversarial robustness in DNNs. The code is available in the supplementary material.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adds nearest-neighbor projection removal in feature space to adversarial training and claims the resulting logits correction lowers the Lipschitz constant.


The main takeaway is that this work inserts a geometric step into adversarial training: for each sample it finds the nearest inter-class neighbor in feature space and subtracts the projection onto that neighbor, doing so for both clean and adversarial forward passes before applying a logits correction. The abstract presents this as a direct way to reduce inter-class overlap without new architectures. That specific combination of nearest-neighbor identification plus explicit projection removal inside the training loop is not standard in prior adversarial training literature, so the procedural idea is the clearest novelty here. It targets a plausible source of vulnerability, namely feature proximity between classes, and the method is simple enough that it could be dropped into existing pipelines as an extra regularizer. The experiments are summarized as competitive on CIFAR-10, CIFAR-100, SVHN, and TinyImageNet for both robust and clean accuracy, which is the relevant comparison.

The soft spot is the theory. The claim that the logits correction reduces the Lipschitz constant, lowers Rademacher complexity, and thereby improves robustness is stated without derivation steps, assumptions, or a sketch. Standard Rademacher bounds control the clean generalization gap; they do not automatically extend to the robust risk under perturbations, so the link to adversarial robustness rests on an implicit transfer that needs explicit justification. If the full paper supplies a clean derivation or shows that the Lipschitz reduction measurably tightens a robust generalization bound, that concern shrinks; otherwise it remains the part most likely to draw referee questions.

This is for people already working on adversarial training in computer vision who want concrete feature-space interventions to try. A reader looking for new regularizers rather than entirely new architectures could get practical value from the method. It deserves a serious referee because the core procedure is well-defined, the benchmarks are standard, and the idea is falsifiable with ablations, even if the theoretical section will need tightening.

Referee Report

2 major / 2 minor

Summary. The paper proposes Nearest Neighbor Projection Removal Adversarial Training, which identifies nearest inter-class neighbors in feature space for adversarial and clean samples and removes their projections to enforce stronger separability. It claims that the resulting logits correction reduces the network Lipschitz constant, thereby lowering Rademacher complexity and directly improving generalization and robustness. Experiments on CIFAR-10, CIFAR-100, SVHN and TinyImageNet are reported as competitive with leading adversarial training methods, with code provided in supplementary material.

Significance. If the central theoretical claim holds, the work would be significant as an explicit mechanism for reducing inter-class feature overlap during adversarial training, potentially improving the clean-robust accuracy trade-off. The availability of code is a positive for reproducibility. The approach could open a direction for feature-space interventions that complement standard min-max adversarial objectives.

major comments (2)

[Abstract] The claim that the logits correction reduces the Lipschitz constant of neural networks (and thereby lowers Rademacher complexity) is presented without derivation steps, assumptions, or proof sketch, yet this reduction is asserted to directly contribute to improved generalization and robustness.
[Abstract] The connection from reduced Rademacher complexity to adversarial robustness is not derived; standard Rademacher bounds control the clean generalization gap, and no argument is supplied showing how the Lipschitz reduction extends to the robust risk or the min-max adversarial training objective.

minor comments (2)

[Abstract] Experimental results are summarized only as 'competitive' and 'strong performance' without numerical values, baseline comparisons, or ablation details.
[Abstract] The projection step assumes that removing the nearest inter-class neighbor projection enforces separability without discarding information required for correct classification, but this assumption receives no further discussion or empirical validation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address the two major comments on the abstract below. Both comments correctly identify that the abstract is overly concise on the theoretical claims; we have revised the abstract and added a short proof sketch plus an explicit link to robust risk in the theory section.


Referee: [Abstract] The claim that the logits correction reduces the Lipschitz constant of neural networks (and thereby lowers Rademacher complexity) is presented without derivation steps, assumptions, or proof sketch, yet this reduction is asserted to directly contribute to improved generalization and robustness.

Authors: We agree that the abstract, constrained by length, omitted the derivation steps and assumptions. In the revised manuscript we have expanded the abstract to include a brief proof sketch: under the assumption that the projection removal operator is a contraction with norm less than 1, the corrected logits satisfy ||f(x) - f(y)|| <= L' ||x - y|| with L' < L, where L is the original Lipschitz constant; this directly lowers the Rademacher complexity bound via the standard Lipschitz-to-Rademacher relation. The full derivation appears in Section 3.
revision: yes

Referee: [Abstract] The connection from reduced Rademacher complexity to adversarial robustness is not derived; standard Rademacher bounds control the clean generalization gap, and no argument is supplied showing how the Lipschitz reduction extends to the robust risk or the min-max adversarial training objective.

Authors: The referee correctly notes that standard Rademacher bounds address clean generalization. We have added a paragraph to the revised abstract and expanded Section 3 to show the extension: because the Lipschitz constant bounds the sensitivity of the network to input perturbations, the same reduction controls the gap between clean and robust risk; specifically, we derive that the robust risk is bounded by the clean risk plus an additive term proportional to the Lipschitz constant times the perturbation budget, which is tightened by our projection removal. This argument is now explicitly connected to the min-max objective.
revision: yes
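One way the two simulated responses could fit together is sketched below. It is an illustration under the rebuttal's stated contraction assumption, not the derivation in the paper's Section 3; the symbols $T$, $c$, $\ell$, $L_\ell$, $\varepsilon$, $R_{\mathrm{rob}}$, and $R_{\mathrm{clean}}$ are labels introduced here.

```latex
% Assumptions (all labels are illustrative):
%   f           : original logit map, L-Lipschitz in the input norm, L > 0
%   T           : correction applied to the logits, c-Lipschitz with c < 1
%   \ell        : loss, L_\ell-Lipschitz in the logits for every label y
%   \varepsilon : perturbation budget
% Step 1: composition of Lipschitz maps.
\[
  \|T(f(x)) - T(f(x'))\| \;\le\; c\,\|f(x) - f(x')\| \;\le\; cL\,\|x - x'\| ,
  \qquad L' := cL < L .
\]
% Step 2: pointwise gap for any perturbation with \|\delta\| \le \varepsilon.
\[
  \ell\big(T(f(x+\delta)),\, y\big) \;\le\; \ell\big(T(f(x)),\, y\big) + L_\ell\,L'\,\varepsilon .
\]
% Step 3: take the supremum over \delta, then the expectation over (x, y).
\[
  R_{\mathrm{rob}}(T \circ f) \;\le\; R_{\mathrm{clean}}(T \circ f) + L_\ell\,L'\,\varepsilon .
\]
```

The additive term shrinks with $L'$, which matches the shape of the bound the response describes. The sketch treats $T$ as a fixed map; a correction whose nearest neighbor changes with the input would need a separate argument for Step 1.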

Circularity Check

0 steps flagged

No circularity: theoretical claim presented as independent demonstration


The paper states it 'demonstrate[s] that our proposed logits correction reduces the Lipschitz constant of neural networks, thereby lowering the Rademacher complexity' (abstract). This is framed as a first-principles theoretical result rather than a fit, renaming, or self-citation reduction. No equations are exhibited that define the correction in terms of the Lipschitz quantity itself, nor is any load-bearing premise imported solely via overlapping-author citation. The projection step is described operationally (identify nearest inter-class neighbor and remove its projection) without reducing the claimed bound to the input data by construction. Concerns about Rademacher controlling only clean generalization (versus robust risk) are correctness or assumption issues, not circularity. The derivation chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that inter-class proximity in feature space is a primary driver of adversarial vulnerability and that its removal via projection does not trade off clean accuracy. No explicit free parameters or new entities are named in the abstract.

axioms (1)

Domain assumption: Identifying and projecting out the nearest inter-class neighbor in feature space reduces inter-class dependencies without harming classification performance. This premise is invoked to justify both the training procedure and the subsequent Lipschitz-constant argument.



[1] [1]

Rademacher and gaussian complexities: Risk bounds and structural results

Peter L Bartlett and Shahar Mendelson. Rademacher and gaussian complexities: Risk bounds and structural results. Journal of machine learning research, 3(Nov):463–482,

work page

[2] [2]

Spectrally-normalized margin bounds for neural networks

Peter L Bartlett, Dylan J Foster, and Matus J Telgarsky. Spectrally-normalized margin bounds for neural networks. InNeurIPS, 2017. 4

work page 2017

[3] [3]

Lane detection for autonomous driving: Comprehensive reviews, current challenges, and future pre- dictions.IEEE Transactions on Intelligent Transportation Systems, 2025

Jiping Bi, Yongchao Song, Yahong Jiang, Lijun Sun, Xuan Wang, Zhaowei Liu, Jindong Xu, Siwen Quan, Zhe Dai, and Weiqing Yan. Lane detection for autonomous driving: Comprehensive reviews, current challenges, and future pre- dictions.IEEE Transactions on Intelligent Transportation Systems, 2025. 1

work page 2025

[4] [4]

Towards evaluating the robustness of neural networks

Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. InS&P, 2017. 4

work page 2017

[5] [5]

Unlabeled data improves adver- sarial robustness

Yair Carmon, Aditi Raghunathan, Ludwig Schmidt, John C Duchi, and Percy S Liang. Unlabeled data improves adver- sarial robustness. InNeurIPS, 2019. 2

work page 2019

[6] [6]

Parseval networks: Improv- ing robustness to adversarial examples

Moustapha Cisse, Piotr Bojanowski, Edouard Grave, Yann Dauphin, and Nicolas Usunier. Parseval networks: Improv- ing robustness to adversarial examples. InICML, 2017. 4

work page 2017

[7] [7]

Minimally distorted adversarial examples with a fast adaptive boundary attack

Francesco Croce and Matthias Hein. Minimally distorted adversarial examples with a fast adaptive boundary attack. arXiv preprint arXiv:1907.02044, 2019. 2

work page arXiv 1907

[8] [8]

Reliable evalua- tion of adversarial robustness with an ensemble of diverse parameter-free attacks

Francesco Croce and Matthias Hein. Reliable evalua- tion of adversarial robustness with an ensemble of diverse parameter-free attacks. InICML, 2020. 6

work page 2020

[9] [9]

J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR09, 2009. 6

work page 2009

[10] [10]

Adversar- ial vulnerability for any classifier

Alhussein Fawzi, Hamza Fawzi, and Omar Fawzi. Adversar- ial vulnerability for any classifier. InProceedings of the 32nd International Conference on Neural Information Processing Systems, page 1186–1195, Red Hook, NY , USA, 2018. Cur- ran Associates Inc. 3

work page 2018

[11] [11]

Analy- sis of classifiers’ robustness to adversarial perturbations.Ma- chine learning, 107(3):481–508, 2018

Alhussein Fawzi, Omar Fawzi, and Pascal Frossard. Analy- sis of classifiers’ robustness to adversarial perturbations.Ma- chine learning, 107(3):481–508, 2018. 3

work page 2018

[12] [12]

R. A. Fisher. The use of multiple measurements in taxonomic problems.Annals of Eugenics, 7(2):179–188, 1936. 8

work page 1936

[13] [13]

Explaining and harnessing adversarial examples

Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. InICLR,

work page

[14]

Henry Gouk, Eibe Frank, Bernhard Pfahringer, and Michael J Cree. Regularisation of neural networks by enforcing Lipschitz continuity. Machine Learning, 110(2):393–416,

[15]

Sven Gowal, Chongli Qin, Jonathan Uesato, Timothy Mann, and Pushmeet Kohli. Uncovering the limits of adversarial training against norm-bounded adversarial examples. arXiv preprint arXiv:2010.03593, 2020. 3

[16]

Sven Gowal, Sylvestre-Alvise Rebuffi, Olivia Wiles, Florian Stimberg, Dan Andrei Calian, and Timothy A Mann. Improving robustness using generated data. Advances in Neural Information Processing Systems, 34, 2021. 6

[17]

Andrew Ilyas, Logan Engstrom, Anish Athalye, and Jessy Lin. Black-box adversarial attacks with limited queries and information. In ICML, 2018. 3

[18]

Chengze Jiang, Junkai Wang, Minjing Dong, Jie Gui, Xinli Shi, Yuan Cao, Yuan Yan Tang, and James Tin-Yau Kwok. Improving fast adversarial training via self-knowledge guidance. IEEE Transactions on Information Forensics and Security, 20:3772–3787, 2025. 1, 3

[19]

Konstantinos Kallidromitis, Denis Gudovskiy, Kozuka Kazuki, Ohama Iku, and Luca Rigazio. Contrastive neural processes for self-supervised learning. In Proceedings of The 13th Asian Conference on Machine Learning, pages 594–609. PMLR, 2021. 8

[20]

Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical Report, University of Toronto, 2009. 5

[21]

Huafeng Kuang, Hong Liu, Yongjian Wu, and Rongrong Ji. Semantically consistent visual representation for adversarial robustness. IEEE Transactions on Information Forensics and Security, 18:5608–5622, 2023. 3, 6

[22]

Qizhang Li, Yiwen Guo, Wangmeng Zuo, and Hao Chen. Squeeze training for adversarial robustness. In ICLR, 2023. 1, 3, 6

[23]

Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In ICLR, 2018. 1, 2, 3, 6

[24]

Aamir Mustafa, Salman Khan, Munawar Hayat, Roland Goecke, Jianbing Shen, and Ling Shao. Adversarial defense by restricting the hidden space of deep neural networks. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 3384–3393, 2019. 1

[25]

Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011,

[26]

Behnam Neyshabur, Srinadh Bhojanapalli, David McAllester, and Nati Srebro. Exploring generalization in deep learning. In NeurIPS, 2017. 5

[27]

Feiping Nie, Shiming Xiang, Yangqing Jia, Changshui Zhang, and Shuicheng Yan. Trace ratio criterion for feature selection. In AAAI Conference on Artificial Intelligence,

[28]

Molly O’Brien, Mike Medoff, Julia Bukowski, and Gregory D Hager. Network generalization prediction for safety critical tasks in novel operating domains. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 614–622, 2022. 1

[29]

Tianyu Pang, Kun Xu, Yinpeng Dong, Chao Du, Ning Chen, and Jun Zhu. Rethinking softmax cross-entropy loss for adversarial robustness. In International Conference on Learning Representations, 2020

[30]

Leslie Rice, Eric Wong, and J Zico Kolter. Overfitting in adversarially robust deep learning. In ICML, 2020. 3, 7

[31]

Peter J. Rousseeuw. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20:53–65, 1987. 8

[32]

Adi Shamir, Odelia Melamed, and Oriel BenShmuel. The dimpled manifold model of adversarial examples in machine learning. arXiv preprint arXiv:2106.10151, 2022. 3

[33]

Satoshi Suzuki, Shin’ya Yamaguchi, Shoichiro Takeda, Sekitoshi Kanai, Naoki Makishima, Atsushi Ando, and Ryo Masumura. Adversarial finetuning with latent representation constraint to mitigate accuracy-robustness tradeoff. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pages 4367–4378, 2023. 3, 6

[34]

Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In ICLR, 2013. 1

[35]

Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, and Aleksander Madry. Robustness may be at odds with accuracy. In International Conference on Learning Representations, 2019. 2, 3

[36]

Laurens van der Maaten and Geoffrey Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(86):2579–2605, 2008. 8

[37]

Desheng Wang, Weidong Jin, and Yunpu Wu. Between-class adversarial training for improving adversarial robustness of image classification. Sensors, 23(6), 2023

[38]

Yisen Wang, Difan Zou, Jinfeng Yi, James Bailey, Xingjun Ma, and Quanquan Gu. Improving adversarial robustness requires revisiting misclassified examples. In ICLR, 2020. 1, 2, 6

[39]

Futa Kai Waseda, Ching-Chun Chang, and Isao Echizen. Rethinking invariance regularization in adversarial training to improve robustness-accuracy trade-off. In The Thirteenth International Conference on Learning Representations, 2025. 1, 3, 6

[40]

Dongxian Wu, Shu-Tao Xia, and Yisen Wang. Adversarial weight perturbation helps robust generalization. Advances in Neural Information Processing Systems, 33, 2020. 2

[41]

Jiancong Xiao, Ruoyu Sun, Qi Long, and Weijie Su. Bridging the gap: Rademacher complexity in robust and standard generalization. In The Thirty Seventh Annual Conference on Learning Theory, pages 5074–5075, 2024. 5

[42]

Cong Xu, Xiang Li, and Min Yang. An orthogonal classifier for improving the adversarial robustness of neural networks. Inf. Sci., 591(C):251–262, 2022

[43]

Yiqun Xu, Zhen Wei, Zhehao Li, Xing Wei, and Yang Lu. Dynamic weighting loss for decision boundary adjustment based on robust distance in adversarial training. In International Conference on Multimedia and Expo, 2025. 1, 2, 6

[44]

Dong Yin, Ramchandran Kannan, and Peter Bartlett. Rademacher complexity for adversarially robust generalization. In ICML, 2019. 5

[45]

Yuichi Yoshida and Takeru Miyato. Spectral norm regularization for improving the generalizability of deep learning,

[46]

John R Zech, Marcus A Badgeley, Manway Liu, Anthony B Costa, Joseph J Titano, and Eric Karl Oermann. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS medicine, 15(11):e1002683, 2018. 1

[47]

Haichao Zhang and Jianyu Wang. Defense against adversarial attacks using feature scattering-based adversarial training. In NeurIPS, 2019. 4

[48]

Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, and Michael I. Jordan. Theoretically principled trade-off between robustness and accuracy. In ICML, 2019. 2, 6, 7, 8

[49]

Jingfeng Zhang, Xilie Xu, Bo Han, Gang Niu, Lizhen Cui, Masashi Sugiyama, and Mohan Kankanhalli. Attacks which do not kill training make adversarial learning stronger. In International Conference on Machine Learning, pages 11278–11287. PMLR, 2020. 6