
arxiv: 2505.02360 · v2 · pith:IL5F52LVnew · submitted 2025-05-05 · 💻 cs.LG · cs.AI

Catastrophic Overfitting, Entropy Gap and Participation Ratio: A Noiseless l^p Norm Solution for Fast Adversarial Training

Pith reviewed 2026-05-22 16:43 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords adversarial training · catastrophic overfitting · l^p norm · participation ratio · entropy gap · FGSM · gradient concentration · robustness

The pith

Tuning the l^p training norm adaptively using participation ratio and entropy prevents catastrophic overfitting in fast adversarial training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that catastrophic overfitting arises in fast adversarial training when single-step methods like FGSM produce models that resist only weak attacks but collapse against stronger multi-step ones. This happens more often under the l^∞ norm than under l^2 because concentrated gradients interact badly with strict norm constraints. The authors model generalized l^p attacks as a fixed-point problem, then build an adaptive l^p-FGSM that chooses the norm at each step by tracking how spread out the gradients are, as measured by participation ratio and entropy. If the claim holds, it supplies a simple, noiseless way to train robust networks that does not rely on extra regularization, noise injection, or slower multi-step attacks.

Core claim

Catastrophic overfitting emerges when highly concentrated gradients, where information localizes in few dimensions, meet aggressive norm constraints. By quantifying this concentration through participation ratio and entropy gap, the authors construct an adaptive l^p-FGSM that automatically selects the training norm to avoid the failure mode, achieving strong robustness to multi-step attacks without additional techniques.
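
The two concentration measures named above can be made concrete with a small sketch. It assumes the standard physics-style participation ratio, PR = (Σ g_i^2)^2 / Σ g_i^4, and an entropy gap defined as the maximum entropy minus the Shannon entropy of the normalized gradient magnitudes. The paper's exact normalizations (for example its PR1 variant) may differ, and both function names are hypothetical.

```python
import numpy as np


def participation_ratio(g):
    """Assumed definition: PR = (sum g_i^2)^2 / sum g_i^4.

    PR is 1 when a single coordinate carries the whole gradient and equals
    the dimension d when all magnitudes are equal. Requires a nonzero gradient.
    """
    g2 = np.square(np.asarray(g, dtype=np.float64).ravel())
    return g2.sum() ** 2 / np.square(g2).sum()


def entropy_gap(g):
    """Assumed definition: log(d) minus the Shannon entropy of |g_i| / sum |g_j|.

    The gap is 0 for a perfectly spread gradient and log(d) for a one-hot
    gradient. Requires a nonzero gradient.
    """
    a = np.abs(np.asarray(g, dtype=np.float64).ravel())
    prob = a / a.sum()
    nz = prob[prob > 0]  # treat 0 * log(0) as 0
    entropy = -(nz * np.log(nz)).sum()
    return np.log(a.size) - entropy
```

Under these assumed definitions, a low participation ratio and a large entropy gap both indicate that the gradient is concentrated in few dimensions.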

What carries the argument

The adaptive l^p-FGSM, which treats the generalized l^p attack as a fixed-point problem and selects the norm p at each training step according to the participation ratio and entropy of the current gradients.
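
As a rough illustration of a single l^p step, the sketch below uses the textbook steepest-ascent direction under an l^p constraint, obtained from the dual exponent q = p/(p − 1). It is a minimal sketch and not the authors' implementation: it omits the fixed-point iteration and the adaptive choice of p, and `lp_steepest_ascent` is a hypothetical name.

```python
import numpy as np


def lp_steepest_ascent(g, eps, p):
    """Maximizer of <g, delta> subject to ||delta||_p <= eps, for p > 1.

    delta_i = eps * sign(g_i) * |g_i|^(q-1) / ||g||_q^(q-1), with 1/p + 1/q = 1.
    Assumes a nonzero gradient g.
    """
    g = np.asarray(g, dtype=np.float64)
    if np.isinf(p):
        return eps * np.sign(g)  # classic FGSM step
    q = p / (p - 1.0)
    a = np.abs(g)
    a = a / a.max()  # the formula is scale-invariant; rescale for stability
    weights = a ** (q - 1.0)
    norm_q = (a ** q).sum() ** (1.0 / q)
    return eps * np.sign(g) * weights / norm_q ** (q - 1.0)
```

With p = 2 the step is the normalized gradient scaled by eps; as p grows, the step approaches eps · sign(g), the usual FGSM perturbation.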

If this is right

  • Single-step adversarial training reaches multi-step robustness levels without noise or regularization.
  • The choice of l^p norm can be made data-driven rather than fixed in advance.
  • Gradient concentration measures become practical diagnostics for training stability.
  • Fast adversarial training becomes viable for larger models where multi-step methods are too slow.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Similar adaptive-norm logic could stabilize other optimization problems that suffer from gradient sparsity or concentration.
  • Participation ratio tracking might diagnose related issues in standard supervised training or pruning.
  • The entropy-gap formulation invites tests on whether the same signals predict robustness in non-adversarial settings.

Load-bearing premise

The premise is that catastrophic overfitting is produced specifically by the interaction between highly concentrated gradients and aggressive norm constraints, and that automatically tuning the l^p norm from participation ratio and entropy is sufficient to block it.

What would settle it

An experiment in which the adaptive l^p method is applied but models still exhibit catastrophic overfitting whenever gradient participation ratio remains low, or in which fixed-norm l^2 or l^infty training matches the adaptive method's robustness.

Figures

Figures reproduced from arXiv: 2505.02360 by Fares B. Mehouachi, Saif Eddin Jabari.

Figure 1: CO phenomena on CIFAR-10 [29] using WideResNet-28-10 [30]. Upper: l^∞ training (ϵ = 8/255) shows accuracy collapse against PGD-50 (ϵ = 8/255) [15] attacks, while l^2 (ϵ = 32/255, both training and attack) remains stable. Lower: CO onset in l^∞ training correlates with gradient norm increase, absent in l^2 training (norms normalized at epoch 1). or regularization [25, 26], our method achieves superior perform…
Figure 2: Impact of l^p norm choice on training dynamics and robustness for CIFAR-10 with WideResNet-28-10. The choice of p reveals a key trade-off: higher values (p ≥ 32) initially show better robustness but become vulnerable to Catastrophic Overfitting (CO), evident in the l^∞ PGD-50 plot (second left). Lower p values prevent CO but with reduced adversarial robustness. Notably, l^2 PGD-50 accuracy (rightmost) remai…
Figure 3: Depiction of training effect on CIFAR-10’s loss
Figure 5: Variation of the l^p transition function Υ_p for different values of p. The high-pass filtering effect mirrors the thresholding behavior in ZeroGrad [33]. Lipschitzness of F_p: For p > 2, global Lipschitz continuity fails due to the discontinuous sign function and concave power term q − 1 at null gradients. However, local Lipschitzness suffices via Banach contraction when gradients are bounded away from ze…
Figure 4: Illustration of the initial two ascents of the fixed
Figure 6: Clean and adversarial accuracy across datasets
Figure 7: Evolution of Participation Ratios (PR, PR1) and entropy gap during training. Sharp declines in these metrics align with the onset of Catastrophic Overfitting (CO), highlighting the link between gradient concentration and adversarial vulnerability. Same experimental setting as
Figure 8: Effect of the l^p norm on attack geometry and sensitivity to gradient noise. Left: An ideal scenario, where the angles between δ_2, δ_∞, and any δ_p are zero. Right: Under small gradient noise (common in ML), l^∞ shows high sensitivity with large angular separation, whereas l^p yields more stable attacks with better gradient alignment (higher cosine similarity). where ∆H = H_m − H is the Entropy Gap, H is the …
Figure 9: Performance benchmarking of adaptive l^p norm-based training against single-step and fast adversarial techniques using PGD-50-10, demonstrating the competitive efficacy of adaptive l^p-FGSM. Results were achieved with an SGD optimizer with a cosine learning rate schedule (30 epochs, minimum 0.001, maximum 0.2), weight decay of 5 · 10^−4, and a dropout rate of 0.1. For SVHN and CIFAR-10, β = 0.01 was appl…
Figure 10: Comparative evaluation using AutoAttack on CIFAR-10 with WideResNet-28-10 across different perturbation
Figure 11: Extended training performance of l^p-FGSM on CIFAR-10. While Catastrophic Overfitting (CO) was not observed, the experiment highlights the occurrence of robust overfitting over a prolonged training period. The results of this long-term training provide insightful observations. Crucially, no instances of Catastrophic Overfitting (CO) were detected throughout the training process, underscoring the robustne…
Figure 12: Analysis of ε-softening and noise effects on CIFAR-10 using WideResNet-28-10 against PGD-50 (ϵ = 8/255). Left: Effect of ε-softening on clean (dashed) and adversarial (solid) accuracy for various p values. Optimal ε enhances stability against CO. Right: Synergistic effects of noise injection showing improved robustness against CO and enhanced overall accuracy. The results demonstrate that both components …
Figure 13: Evolution of Participation Ratios (
Abstract

Adversarial training is a cornerstone of robust deep learning, but fast methods like the Fast Gradient Sign Method (FGSM) often suffer from Catastrophic Overfitting (CO), where models become robust to single-step attacks but fail against multi-step variants. While existing solutions rely on noise injection, regularization, or gradient clipping, we propose a novel solution that purely controls the $l^p$ training norm to mitigate CO. Our study is motivated by the empirical observation that CO is more prevalent under the $l^{\infty}$ norm than the $l^2$ norm. Leveraging this insight, we develop a framework for generalized $l^p$ attack as a fixed point problem and craft $l^p$-FGSM attacks to understand the transition mechanics from $l^2$ to $l^{\infty}$. This leads to our core insight: CO emerges when highly concentrated gradients where information localizes in few dimensions interact with aggressive norm constraints. By quantifying gradient concentration through Participation Ratio and entropy measures, we develop an adaptive $l^p$-FGSM that automatically tunes the training norm based on gradient information. Extensive experiments demonstrate that this approach achieves strong robustness without requiring additional regularization or noise injection, providing a novel and theoretically-principled pathway to mitigate the CO problem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that catastrophic overfitting (CO) in fast adversarial training with FGSM arises from the interaction of highly concentrated gradients with aggressive norm constraints, and proposes an adaptive l^p-FGSM method that automatically tunes the training norm p using participation ratio and entropy measures of the gradients. It develops a fixed-point framework for generalized l^p attacks to analyze the transition from l^2 to l^∞ norms and reports that this noiseless approach achieves strong robustness without regularization or noise injection.

Significance. If the central claim holds, the work offers a new mechanism-based approach to CO mitigation that avoids common interventions like noise or clipping, potentially simplifying robust training. The quantification of gradient concentration via participation ratio and entropy provides a concrete diagnostic for norm choice, and the fixed-point formulation for l^p attacks is a useful technical contribution if the derivations are rigorous.

major comments (3)
  1. [§3] §3 (fixed-point framework for l^p-FGSM): The core claim that CO emerges specifically from concentrated gradients interacting with norm constraints is load-bearing, yet the derivation does not demonstrate why participation ratio and entropy are necessary and sufficient diagnostics rather than alternatives such as gradient sparsity or Hessian-based measures; an explicit isolation argument or counterexample is needed to support the 'theoretically-principled' assertion.
  2. [Adaptive rule] Adaptive rule and mapping (around Eq. for p selection): The adaptive tuning is presented as automatic and based on gradient information, but the manuscript does not clarify whether the mapping from participation ratio to p introduces fitted constants or hyperparameters that are later used to claim success, raising a circularity concern for the parameter-free interpretation.
  3. [Experiments] Experimental validation (results tables/figures): While extensive experiments are reported, the absence of ablations that fix p while matching the same participation ratio and entropy statistics leaves the causal link between the proposed metrics and CO prevention untested; this undermines the claim that the method alone is sufficient without other interventions.
minor comments (2)
  1. [Preliminaries] Notation for participation ratio and entropy gap should be defined more explicitly with respect to the gradient vector dimensions to avoid ambiguity in replication.
  2. [Figures] Figure captions for the l^p transition plots could include the exact p values used in each panel for clarity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help clarify the presentation of our contributions. We address each major comment below and indicate the revisions we will make to strengthen the manuscript.

Point-by-point responses
  1. Referee: [§3] §3 (fixed-point framework for l^p-FGSM): The core claim that CO emerges specifically from concentrated gradients interacting with norm constraints is load-bearing, yet the derivation does not demonstrate why participation ratio and entropy are necessary and sufficient diagnostics rather than alternatives such as gradient sparsity or Hessian-based measures; an explicit isolation argument or counterexample is needed to support the 'theoretically-principled' assertion.

    Authors: We appreciate the referee's emphasis on strengthening the justification for our choice of diagnostics. The fixed-point framework in §3 derives the conditions under which gradient concentration interacts with the l^∞ constraint to produce CO, and participation ratio together with entropy are selected because they quantify the effective support and information localization of the gradient vector in a manner directly tied to those conditions. We do not claim these are the only possible measures, nor do we provide a full isolation proof against every alternative. In the revision we will expand the discussion in §3 to include a comparison with gradient sparsity and a brief note on why Hessian-based alternatives are less directly connected to the single-step norm transition analyzed in the fixed-point formulation. revision: yes

  2. Referee: [Adaptive rule] Adaptive rule and mapping (around Eq. for p selection): The adaptive tuning is presented as automatic and based on gradient information, but the manuscript does not clarify whether the mapping from participation ratio to p introduces fitted constants or hyperparameters that are later used to claim success, raising a circularity concern for the parameter-free interpretation.

    Authors: The mapping is obtained by identifying the participation-ratio thresholds at which the fixed-point analysis predicts the onset of adverse norm-gradient interaction; these thresholds are fixed by the theoretical transition points and do not involve constants fitted to robustness metrics or validation performance. Consequently the rule remains free of data-dependent hyperparameters. We will revise the text around the relevant equation to state this derivation explicitly and to confirm that no post-hoc fitting was performed. revision: yes

  3. Referee: [Experiments] Experimental validation (results tables/figures): While extensive experiments are reported, the absence of ablations that fix p while matching the same participation ratio and entropy statistics leaves the causal link between the proposed metrics and CO prevention untested; this undermines the claim that the method alone is sufficient without other interventions.

    Authors: We agree that a controlled ablation holding p fixed while matching the observed participation-ratio and entropy statistics would provide stronger causal evidence. Because these statistics are themselves functions of the chosen training norm, constructing such matched conditions requires additional experimental design. Our current results already compare the adaptive rule against fixed-p baselines and track the evolution of the metrics during training. In the revision we will add supplementary figures that plot participation ratio and entropy trajectories for the fixed-p runs, thereby making the correlation with CO more explicit. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation is empirical and self-contained.

Full rationale

The paper starts from the empirical observation that CO occurs more under l^∞ than l^2, frames the generalized l^p attack as a fixed-point problem, and introduces Participation Ratio plus entropy as quantifiers of gradient concentration to drive an adaptive choice of p. No equation in the provided text defines the chosen p or the concentration metrics in terms of the final robustness metric, nor does any step rename a fitted parameter as a prediction or reduce the claimed sufficiency to a self-citation chain. The central pathway is therefore supported by external experimental validation rather than by construction from its own inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach rests on the empirical observation that CO is worse under l^∞ than l^2 and on the modeling choice that gradient concentration measured by participation ratio drives the need for adaptive norm selection. No explicit free parameters are named in the abstract, but the adaptive rule likely introduces at least one tunable threshold or mapping from participation ratio to p.

free parameters (1)
  • mapping from participation ratio to p
    The adaptive rule that selects or tunes the norm order p based on measured gradient concentration is not specified as parameter-free.
axioms (1)
  • domain assumption: CO emerges when highly concentrated gradients interact with aggressive norm constraints
    This is presented as the core insight motivating the adaptive l^p-FGSM.

pith-pipeline@v0.9.0 · 5772 in / 1440 out tokens · 73673 ms · 2026-05-22T16:43:36.579775+00:00 · methodology



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. FlowMixer: A Depth-Agnostic Neural Architecture for Interpretable Spatiotemporal Forecasting

    cs.LG · 2025-05 · unverdicted · novelty 5.0

    A single-layer architecture called FlowMixer uses constrained matrix operations and a semi-group property to enable depth-agnostic, interpretable spatiotemporal forecasting with direct eigenmode extraction.

Reference graph

Works this paper leans on

47 extracted references · 47 canonical work pages · cited by 1 Pith paper · 5 internal anchors

  1. [1]

    Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups

    Geoffrey Hinton, Li Deng, Dong Yu, George E Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N Sainath, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6):82–97, 2012

  2. [2]

    Deep learning

    Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015

  3. [3]

    Attention is all you need

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in neural information processing systems, pages 5998–6008, 2017

  4. [4]

    Intriguing properties of neural networks

    Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013

  5. [5]

    Explaining and Harnessing Adversarial Examples

    Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014

  6. [6]

    Multilabel black-box adversarial attacks only with predicted labels

    Linghao Kong, Wenjian Luo, Zipeng Ye, Qi Zhou, and Yan Jia. Multilabel black-box adversarial attacks only with predicted labels. IEEE Transactions on Artificial Intelligence, 6(5):1284–1297, 2025

  7. [7]

    Rethinking transferable adversarial attacks with double adversarial neuron attribution

    Zhiyu Zhu, Zhibo Jin, Xinyi Wang, Jiayu Zhang, Huaming Chen, and Kim-Kwang Raymond Choo. Rethinking transferable adversarial attacks with double adversarial neuron attribution. IEEE Transactions on Artificial Intelligence, 6(2):354–364, 2025

  8. [8]

    Functional safety for machine learning: a case study in automotive software

    Léonard Humbert, Michael Wagner, and Philip Koopman. Functional safety for machine learning: a case study in automotive software. In Proceedings of the 35th Annual ACM Symposium on Applied Computing , pages 1739–1746, 2020

  9. [9]

    Dynamic risk assessment for autonomous vehicle safety

    Michael Wagner and Philip Koopman. Dynamic risk assessment for autonomous vehicle safety. Journal of Systems and Software, 168:110598, 2020

  10. [10]

    Detection and identification of uavs based on spectrum monitoring and deep learning in negative snr conditions

    F Mehouachi, Juan Galvis, Santiago Morales, Milosch Meriac, Felix Vega, and Chaouki Kasmi. Detection and identification of uavs based on spectrum monitoring and deep learning in negative snr conditions. URSI GASS, 2021

  11. [11]

    On the vulnerability of deep reinforcement learning to backdoor attacks in autonomous vehicles

    Yue Wang, Esha Sarkar, Saif Eddin Jabari, and Michail Maniatakos. On the vulnerability of deep reinforcement learning to backdoor attacks in autonomous vehicles. In Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing: Use Cases and Emerging Challenges, pages 315–341. Springer, 2023

  12. [12]

    Adversarial attacks on medical machine learning

    Samuel G Finlayson, John D Bowers, Joichi Ito, Jonathan L Zittrain, Andrew L Beam, and Isaac S Kohane. Adversarial attacks on medical machine learning. Science, 363(6433):1287–1289, 2019

  13. [13]

    Adversarial attacks on deep models for financial transaction records

    Ivan Fursov, Matvey Morozov, Nina Kaploukhaya, Elizaveta Kovtun, Rodrigo Rivera-Castro, Gleb Gusev, Dmitry Babaev, Ivan Kireev, Alexey Zaytsev, and Evgeny Burnaev. Adversarial attacks on deep models for financial transaction records. arXiv preprint arXiv:2106.08361, 2021

  14. [14]

    Adversarial attacks on machine learning systems for high-frequency trading

    Micah Goldblum, Avi Schwarzschild, Ankit B Patel, and Tom Goldstein. Adversarial attacks on machine learning systems for high-frequency trading. arXiv preprint arXiv:2002.09565, 2020

  15. [15]

    Towards Deep Learning Models Resistant to Adversarial Attacks

    Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017

  16. [16]

    Robustness of classifiers: from adversarial to random noise

    Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal Frossard. Robustness of classifiers: from adversarial to random noise. Advances in Neural Information Processing Systems, 2018

  17. [17]

    Theoretically principled trade-off between robustness and accuracy

    Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric Xing, Laurent El Ghaoui, and Michael Jordan. Theoretically principled trade-off between robustness and accuracy. In International conference on machine learning, pages 7472–7482. PMLR, 2019

  18. [18]

    Rachel Selva Dhanaraj and M. Sridevi. Building a robust and efficient defensive system using hybrid adversarial attack. IEEE Transactions on Artificial Intelligence, 5(9):4470–4478, 2024

  19. [19]

    Apr-net: Defense against adversarial examples based on universal adversarial perturbation removal network

    Wenxing Liao, Zhuxian Liu, Minghuang Shen, Riqing Chen, and Xiaolong Liu. Apr-net: Defense against adversarial examples based on universal adversarial perturbation removal network. IEEE Transactions on Artificial Intelligence, 6(4):945–954, 2025

  20. [20]

    Adversarial machine learning for social good: Reframing the adversary as an ally

    Shawqi Al-Maliki, Adnan Qayyum, Hassan Ali, Mohamed Abdallah, Junaid Qadir, Dinh Thai Hoang, Dusit Niyato, and Ala Al-Fuqaha. Adversarial machine learning for social good: Reframing the adversary as an ally. IEEE Transactions on Artificial Intelligence, 5(9):4322–4343, 2024

  21. [21]

    Adversarial masked autoencoders are robust vision learners

    Yuchong Yao, Nandakishor Desai, and Marimuthu Palaniswami. Adversarial masked autoencoders are robust vision learners. IEEE Transactions on Artificial Intelligence, 6(4):805–815, 2025

  22. [22]

    Active robust adversarial reinforcement learning under temporally coupled perturbations

    Jiacheng Yang, Yuanda Wang, Lu Dong, Lei Xue, and Changyin Sun. Active robust adversarial reinforcement learning under temporally coupled perturbations. IEEE Transactions on Artificial Intelligence, 6(4):874–884, 2025

  23. [23]

    A membership inference and adversarial attack defense framework for network traffic classifiers

    Guangrui Liu, Weizhe Zhang, Xurun Wang, Stephen King, and Shui Yu. A membership inference and adversarial attack defense framework for network traffic classifiers. IEEE Transactions on Artificial Intelligence, 6(2):317–332, 2025

  24. [24]

    Direct adversarial latent estimation to evaluate decision boundary complexity in black box models

    Ashley S. Dale and Lauren Christopher. Direct adversarial latent estimation to evaluate decision boundary complexity in black box models. IEEE Transactions on Artificial Intelligence, 5(12):6043–6053, 2024

  25. [25]

    Fast is better than free: Revisiting adversarial training

    Eric Wong, Leslie Rice, and J Zico Kolter. Fast is better than free: Revisiting adversarial training. arXiv preprint arXiv:2001.03994, 2020

  26. [26]

    Understanding and improving fast adversarial training

    Maksym Andriushchenko and Nicolas Flammarion. Understanding and improving fast adversarial training. Advances in Neural Information Processing Systems, 33:16048–16059, 2020

  27. [27]

    Adversarial robustness through local linearization

    Chongli Qin, James Martens, Sven Gowal, Dilip Krishnan, Krishnamurthy Dvijotham, Alhussein Fawzi, Soham De, Robert Stanforth, and Pushmeet Kohli. Adversarial robustness through local linearization. Advances in Neural Information Processing Systems, 32, 2019

  28. [28]

    Spatio-temporal graph-based generation and detection of adversarial false data injection evasion attacks in smart grids

    Abdulrahman Takiddin, Muhammad Ismail, Rachad Atat, and Erchin Serpedin. Spatio-temporal graph-based generation and detection of adversarial false data injection evasion attacks in smart grids. IEEE Transactions on Artificial Intelligence, 5(12):6601–6616, 2024

  29. [29]

    Learning multiple layers of features from tiny images

    Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. University of Toronto Technical Report, 2009

  30. [30]

    Wide Residual Networks

    Sergey Zagoruyko and Nikos Komodakis. Wide residual networks. arXiv preprint arXiv:1605.07146, 2016

  31. [31]

    Absence of diffusion in certain random lattices

    Philip W Anderson. Absence of diffusion in certain random lattices. Physical review, 109(5):1492, 1958

  32. [32]

    The Feynman Lectures on Physics, Vol. III: Quantum Mechanics

    Richard P Feynman, Robert B Leighton, and Matthew Sands. The Feynman Lectures on Physics, Vol. III: Quantum Mechanics. Addison-Wesley, 1965

  33. [33]

    Zerograd: Mitigating and explaining catastrophic overfitting in fgsm adversarial training

    Zeinab Golgooni, Mehrdad Saberi, Masih Eskandar, and Mohammad Hossein Rohban. Zerograd: Mitigating and explaining catastrophic overfitting in fgsm adversarial training. arXiv preprint arXiv:2103.15476, 2021

  34. [34]

    Make some noise: Reliable and efficient single-step adversarial training

    Pau de Jorge Aranda, Adel Bibi, Riccardo Volpi, Amartya Sanyal, Philip Torr, Grégory Rogez, and Puneet Dokania. Make some noise: Reliable and efficient single-step adversarial training. Advances in Neural Information Processing Systems, 35:12881–12893, 2022

  35. [35]

    Deep Learning

    Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016

  36. [36]

    The nature of statistical learning theory

    Vladimir Vapnik. The nature of statistical learning theory. Springer science & business media, 1999

  37. [37]

    Self-normalizing neural networks

    Günter Klambauer, Thomas Unterthiner, Andreas Mayr, and Sepp Hochreiter. Self-normalizing neural networks. In Advances in Neural Information Processing Systems, pages 971–980, 2017

  38. [38]

    Gaussian Error Linear Units (GELUs)

    Dan Hendrycks and Kevin Gimpel. Gaussian error linear units (gelus). arXiv preprint arXiv:1606.08415, 2016

  39. [39]

    Efficient training of low-curvature neural networks

    Suraj Srinivas, Kyle Matoba, Himabindu Lakkaraju, and François Fleuret. Efficient training of low-curvature neural networks. Advances in Neural Information Processing Systems, 35:25951–25964, 2022

  40. [40]

    Reading digits in natural images with unsupervised feature learning

    Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. 2011

  41. [41]

    Identity mappings in deep residual networks

    Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part IV 14, pages 630–645. Springer, 2016

  42. [42]

    Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

    Francesco Croce and Matthias Hein. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In International conference on machine learning, pages 2206–2216. PMLR, 2020

  43. [43]

    Pac-bayesian spectrally-normalized bounds for adversarially robust generalization

    Jiancong Xiao, Ruoyu Sun, and Zhi-Quan Luo. Pac-bayesian spectrally-normalized bounds for adversarially robust generalization. Advances in Neural Information Processing Systems, 36:36305–36323, 2023

  44. [44]

    ImageNet Large Scale Visual Recognition Challenge

    Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015

  45. [45]

    Understanding catastrophic overfitting in single-step adversarial training

    Hoki Kim, Woojin Lee, and Jaewook Lee. Understanding catastrophic overfitting in single-step adversarial training. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 8119–8127, 2021

  46. [46]

    Adversarial training for free!

    Ali Shafahi, Mahyar Najibi, Mohammad Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S Davis, Gavin Taylor, and Tom Goldstein. Adversarial training for free! Advances in Neural Information Processing Systems, 32, 2019

  47. [47]

    Overfitting in adversarially robust deep learning

    Leslie Rice, Eric Wong, and Zico Kolter. Overfitting in adversarially robust deep learning. In International Conference on Machine Learning, pages 8093–8104. PMLR, 2020

Acknowledgment: This work was supported in part by the NYUAD Center for Interacting Urban Networks (CITIES), funded by Tamkeen under the NYUAD Research Institute Award CG001, and in part b...