Unveiling the Backdoor Mechanism Hidden Behind Catastrophic Overfitting in Fast Adversarial Training
Pith reviewed 2026-05-08 04:25 UTC · model grok-4.3
The pith
Catastrophic overfitting in fast adversarial training functions as a backdoor: a weak-trigger variant of unlearnable tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We interpret catastrophic overfitting in fast adversarial training as a backdoor phenomenon, supported by evidence of pathway division, diverse feature predictions, and universal class-distinguishable triggers. This leads us to conceptualize CO as a weak-trigger variant of unlearnable tasks, placing CO, backdoor attacks, and unlearnable tasks inside one theoretical framework. The same view directly motivates backdoor-style interventions: recalibrating parameters through fine-tuning, linear probing, or reinitialization, plus a weight-outlier suppression term to curb abnormal weight growth.
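The recalibration strategies named here (fine-tuning, linear probing, reinitialization) share one move: keep the learned features, but refit the decision layer. A minimal sketch of linear probing, with a toy 1-D logistic head standing in for a real network's classifier; the function names and data are illustrative, not from the paper:

```python
import math

def train_linear_probe(feats, labels, lr=0.5, steps=200):
    """Fit a freshly initialized linear head (w, b) on frozen features
    via logistic regression; the backbone is untouched, only the head
    is recalibrated."""
    w, b = 0.0, 0.0
    n = len(feats)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(feats, labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # sigmoid
            gw += (p - y) * x / n
            gb += (p - y) / n
        w -= lr * gw
        b -= lr * gb
    return w, b

def accuracy(w, b, feats, labels):
    """Fraction of examples the probed head classifies correctly."""
    preds = [1 if w * x + b > 0 else 0 for x in feats]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)
```

On a trivially separable toy set (features in [-1, 1], label 1 iff positive), the refit head recovers perfect accuracy even though the "backbone" is never updated, which is the essence of probing as a recalibration step.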
What carries the argument
The backdoor lens on CO, which treats the overfitting as a weak, class-distinguishable trigger that unifies it with unlearnable tasks and enables direct transfer of mitigation tactics.
If this is right
- Backdoor-inspired recalibration of parameters restores generalization to unseen attacks.
- A weight-outlier suppression constraint limits the abnormal weight deviations that accompany CO.
- The shared framework predicts that techniques successful against backdoors will also reduce CO.
- Unlearnable-task methods become applicable to diagnosing and preventing catastrophic overfitting.
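The weight-outlier suppression constraint is not specified in the abstract. A hypothetical sketch, assuming the regularizer penalizes weights that deviate from their layer statistics by more than a z-score threshold (the exact form and hyper-parameters are assumptions):

```python
import math

def outlier_suppression_penalty(weights, k=3.0):
    """Hypothetical weight-outlier penalty: sum of squared excess
    deviation beyond k standard deviations from the layer mean."""
    n = len(weights)
    mean = sum(weights) / n
    std = math.sqrt(sum((w - mean) ** 2 for w in weights) / n)
    threshold = k * std
    penalty = 0.0
    for w in weights:
        excess = abs(w - mean) - threshold
        if excess > 0:  # only weights outside the band are penalized
            penalty += excess ** 2
    return penalty
```

Added to the training loss with a small coefficient, a term like this would leave typical weights untouched while discouraging the abnormal weight growth the review associates with CO.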
Where Pith is reading between the lines
- Standard backdoor detection tools could be repurposed to flag CO during training before it becomes catastrophic.
- Joint study of CO and unlearnable examples may reveal shared dynamics that govern when data becomes unusable for robust learning.
- The unification suggests that robustness benchmarks should test models against both adversarial and backdoor-style triggers.
Load-bearing premise
The listed phenomena (pathway division, diverse feature predictions, and universal class-distinguishable triggers) demonstrate a backdoor mechanism rather than ordinary overfitting or memorization.
What would settle it
A controlled run in which a model exhibits clear catastrophic overfitting yet shows none of the three backdoor indicators (pathway division, diverse predictions, or universal triggers) would refute the proposed mechanism.
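Of the three indicators, trigger universality is the easiest to measure directly: how often does a single candidate perturbation flip the model's prediction across inputs? A hedged sketch of such a score; the scoring rule is an assumption for illustration, not the paper's procedure:

```python
def trigger_universality(model, inputs, trigger):
    """Crude universality score: fraction of inputs whose predicted
    class changes when the candidate trigger is added element-wise."""
    flipped = sum(
        model([a + d for a, d in zip(x, trigger)]) != model(x)
        for x in inputs
    )
    return flipped / len(inputs)
```

A score near 1 on a CO-affected model, but near 0 on a cleanly trained one, would support the backdoor reading; a CO run with uniformly low scores would count against it.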
Original abstract
Fast Adversarial Training (FAT) has attracted significant attention due to its efficiency in enhancing neural network robustness against adversarial attacks. However, FAT is prone to catastrophic overfitting (CO), wherein models overfit to the specific attack used during training and fail to generalize to others. While existing methods introduce diverse hypotheses and propose various strategies to mitigate CO, a systematic and intuitive explanation of CO remains absent. In this work, we innovatively interpret CO through the lens of backdoor. Through validations on pathway division, diverse feature predictions, and universal class-distinguishable triggers in CO, we conceptualize CO as a weak-trigger variant of unlearnable tasks, unifying CO, backdoor attacks, and unlearnable tasks under a common theoretical framework. Guided by this, we leverage several backdoor-inspired strategies to mitigate CO: (i) recalibrate CO-affected model parameters using vanilla fine-tuning, linear probing, or reinitialization-based techniques; (ii) introduce a weight-outlier suppression constraint to regulate abnormal deviations in model weights. Extensive experiments support our interpretation of CO and show the efficacy of the proposed mitigation strategies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that catastrophic overfitting (CO) in fast adversarial training (FAT) is a weak-trigger variant of unlearnable tasks, thereby unifying CO, backdoor attacks, and unlearnable tasks under a single theoretical framework. It validates this view via observations of pathway division, diverse feature predictions, and universal class-distinguishable triggers, then proposes backdoor-inspired mitigations consisting of parameter recalibration (vanilla fine-tuning, linear probing, or reinitialization) and a weight-outlier suppression constraint. The abstract states that experiments support both the interpretation and the efficacy of the fixes.
Significance. If the backdoor interpretation holds and the mitigations prove robust, the work would supply a novel unifying lens on CO that could import techniques from the backdoor literature into adversarial training, potentially yielding more reliable and efficient robustness methods. The practical recalibration and regularization strategies are directly usable and could improve FAT in settings where speed is critical.
Major comments (3)
- [validation experiments] The validations on pathway division, diverse feature predictions, and universal class-distinguishable triggers (described in the abstract and the validation section) do not include control experiments that would distinguish a backdoor (weak-trigger unlearnable-task) mechanism from standard explanations of CO such as memorization of attack-specific perturbation directions. Without such falsifying tests, the central interpretive claim remains compatible with non-backdoor accounts of overfitting.
- [theoretical framework / conceptualization] No formal definition, mathematical axioms, or precise characterization of the claimed 'common theoretical framework' is supplied. The conceptualization of CO as a 'weak trigger variant of unlearnable tasks' is introduced informally, which prevents rigorous verification of the unification and makes the framework non-load-bearing for the paper's conclusions.
- [experiments / abstract] The abstract asserts that 'extensive experiments support our interpretation' yet reports no quantitative metrics, error bars, ablation details, or comparisons against existing CO mitigation baselines. This absence undermines the ability to assess whether the proposed mitigations outperform prior methods or merely reproduce known regularization effects.
Minor comments (2)
- [mitigation strategies] Notation for the weight-outlier suppression constraint should be defined explicitly (e.g., the precise form of the regularizer and its hyper-parameters) to allow reproduction.
- [conclusion] The manuscript would benefit from a dedicated limitations paragraph discussing the scope of the backdoor analogy (e.g., whether it applies only to specific attack norms or architectures).
Simulated Author's Rebuttal
We thank the referee for the constructive and insightful comments. We address each major point below, providing our responses and indicating revisions to the manuscript.
Point-by-point responses
Referee: [validation experiments] The validations on pathway division, diverse feature predictions, and universal class-distinguishable triggers (described in the abstract and the validation section) do not include control experiments that would distinguish a backdoor (weak-trigger unlearnable-task) mechanism from standard explanations of CO such as memorization of attack-specific perturbation directions. Without such falsifying tests, the central interpretive claim remains compatible with non-backdoor accounts of overfitting.
Authors: We appreciate this suggestion for strengthening the interpretive claim. Our existing validations demonstrate phenomena such as pathway division and universal class-distinguishable triggers that are consistent with a weak-trigger backdoor mechanism. We agree that dedicated control experiments are needed to differentiate from alternatives like memorization of perturbation directions. In the revision, we will add control studies, including training with non-adversarial or random perturbations and testing for trigger universality in non-CO settings, to provide falsifying evidence. revision: yes
Referee: [theoretical framework / conceptualization] No formal definition, mathematical axioms, or precise characterization of the claimed 'common theoretical framework' is supplied. The conceptualization of CO as a 'weak trigger variant of unlearnable tasks' is introduced informally, which prevents rigorous verification of the unification and makes the framework non-load-bearing for the paper's conclusions.
Authors: The unification is presented as a conceptual framework highlighting mechanistic parallels, such as weak triggers inducing overfitting to specific patterns across CO, backdoors, and unlearnable tasks. We acknowledge the benefit of greater precision. We will add a subsection with a more formal characterization, defining shared properties (e.g., trigger weakness leading to class-specific overfitting) to support verification while retaining the intuitive unification. revision: yes
Referee: [experiments / abstract] The abstract asserts that 'extensive experiments support our interpretation' yet reports no quantitative metrics, error bars, ablation details, or comparisons against existing CO mitigation baselines. This absence undermines the ability to assess whether the proposed mitigations outperform prior methods or merely reproduce known regularization effects.
Authors: The abstract is intentionally concise, while the body of the manuscript (experimental sections) reports quantitative metrics, error bars from multiple runs, ablation studies on recalibration techniques and outlier suppression, and direct comparisons to prior CO mitigation methods. To improve transparency, we will revise the abstract to briefly reference these key results and performance advantages. revision: yes
Circularity Check
No significant circularity in interpretive unification of CO with backdoor/unlearnable tasks
Full rationale
The paper's core move is an empirical interpretation: it reports observations of pathway division, diverse feature predictions, and class-distinguishable triggers within CO, then proposes to view CO as a weak-trigger variant of unlearnable tasks. This is presented as a unifying lens rather than a mathematical derivation, first-principles proof, or fitted model whose outputs are renamed as predictions. No equations appear that reduce to their own inputs by construction, no parameters are fitted on a subset and then called predictions on related quantities, and no load-bearing self-citations or uniqueness theorems are invoked. The mitigation strategies (fine-tuning, weight suppression) are motivated by the interpretation but remain independent empirical tests. The claimed common theoretical framework is therefore self-contained as an organizing perspective on existing phenomena, not a closed loop that forces the result from the inputs.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: neural networks trained with fast adversarial training can develop class-specific feature pathways that are separable from normal decision boundaries.
Invented entities (1)
- "Weak trigger variant of unlearnable tasks" (no independent evidence)