ATAC: Augmentation-Based Test-Time Adversarial Correction for CLIP
Pith reviewed 2026-05-17 20:20 UTC · model grok-4.3
The pith
ATAC corrects adversarial CLIP embeddings at test time by aligning augmentation-induced drift vectors to recover the original semantic direction.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Our method operates directly in the embedding space of CLIP, calculating augmentation-induced drift vectors to infer a semantic recovery direction and correcting the embedding based on the angular consistency of these latent drifts. Across a wide range of benchmarks, ATAC consistently achieves remarkably high robustness, surpassing that of previous state-of-the-art methods by nearly 50% on average, all while requiring minimal computational overhead.
What carries the argument
Augmentation-induced drift vectors in the CLIP embedding space, whose angular consistency determines the semantic recovery direction for correction.
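As a rough illustration of this mechanism, the rule quoted later on this page (τ = 1/n Σ cos(d_i, d̄); f* = f_x + α d̄ if τ > τ*) can be sketched in NumPy. This is a minimal sketch and not the authors' implementation: the function names, the definition d_i = f(aug_i(x)) - f(x), the use of a plain mean for d̄, and the default values of `alpha` and `tau_star` are all assumptions made for illustration.

```python
import numpy as np


def _unit(v, eps=1e-12):
    """Scale vectors along the last axis to (approximately) unit length."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)


def drift_consistency_correction(f_x, f_aug, alpha=1.0, tau_star=0.5):
    """Hypothetical sketch of a drift-consistency correction.

    f_x:   (d,)   embedding of the (possibly adversarial) input image
    f_aug: (n, d) embeddings of n augmented views of the same image
    Returns (embedding, tau); the embedding is corrected only if tau > tau_star.
    """
    f_x = np.asarray(f_x, dtype=float)
    f_aug = np.asarray(f_aug, dtype=float)

    drifts = f_aug - f_x          # d_i: assumed definition of the drift vectors
    d_bar = drifts.mean(axis=0)   # d-bar: assumed to be the mean drift

    # tau: mean cosine similarity between each drift and the mean drift
    tau = float(np.mean(_unit(drifts) @ _unit(d_bar)))

    if tau > tau_star:
        return f_x + alpha * d_bar, tau   # f* = f_x + alpha * d-bar
    return f_x, tau                       # low consistency: leave unchanged
```

The threshold gate reflects the paper's supplementary text, which states that drifts are scattered for clean samples and consistent for adversarial inputs, so a clean embedding would usually pass through unchanged.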
If this is right
- ATAC surpasses previous state-of-the-art test-time defenses by nearly 50 percent on average across benchmarks.
- The method requires only minimal computational overhead at test time.
- ATAC maintains high robustness even in unconventional and extreme settings.
- It provides nontrivial robustness against adaptive attacks.
Where Pith is reading between the lines
- The same drift-consistency idea might apply to other multimodal models that share similar embedding spaces.
- Combining this correction with natural distribution shifts could test whether angular consistency helps beyond adversarial cases.
- The approach could scale to video or other modalities if augmentations produce comparable latent drifts.
Load-bearing premise
Augmentation-induced drift vectors reliably point toward a semantic recovery direction whose angular consistency can be used to correct the embedding without introducing new errors or requiring task-specific tuning.
What would settle it
Measure, on standard adversarial image benchmarks, whether higher angular consistency among the augmentation-induced drift vectors goes with a larger accuracy gain from correction over the uncorrected perturbed embeddings.
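One simple way to run such a check is to correlate each sample's consistency score with its change in correctness. The sketch below assumes per-sample τ values and 0/1 correctness flags before and after correction are already available; the function name and the choice of Pearson correlation are illustrative assumptions, not something the paper specifies.

```python
import numpy as np


def tau_gain_correlation(tau, correct_before, correct_after):
    """Pearson correlation between tau and the per-sample correctness gain.

    tau:            per-sample consistency scores
    correct_before: 0/1 flags for the uncorrected perturbed embeddings
    correct_after:  0/1 flags for the corrected embeddings
    Returns NaN when either quantity is constant (correlation undefined).
    """
    tau = np.asarray(tau, dtype=float)
    gain = (np.asarray(correct_after, dtype=float)
            - np.asarray(correct_before, dtype=float))
    if tau.std() == 0 or gain.std() == 0:
        return float("nan")
    return float(np.corrcoef(tau, gain)[0, 1])
```

A clearly positive value would support the premise; a value near zero or below would suggest that consistency is not what drives the reported gains.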
Original abstract
Despite its remarkable success in zero-shot image-text matching, CLIP remains highly vulnerable to adversarial perturbations on images. As adversarial fine-tuning is prohibitively costly, recent works explore various test-time defense strategies; however, these approaches still exhibit limited robustness. In this work, we revisit this problem and propose a simple yet effective strategy: Augmentation-based Test-time Adversarial Correction (ATAC). Our method operates directly in the embedding space of CLIP, calculating augmentation-induced drift vectors to infer a semantic recovery direction and correcting the embedding based on the angular consistency of these latent drifts. Across a wide range of benchmarks, ATAC consistently achieves remarkably high robustness, surpassing that of previous state-of-the-art methods by nearly 50% on average, all while requiring minimal computational overhead. Furthermore, ATAC retains state-of-the-art robustness in unconventional and extreme settings and even achieves nontrivial robustness against adaptive attacks. Our results demonstrate that ATAC is an efficient method in a novel paradigm for test-time adversarial defenses in the embedding space of CLIP. Code is available at: https://github.com/kylin0421/ATAC
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces ATAC, a test-time adversarial correction method for CLIP that operates in embedding space. It computes drift vectors induced by a set of augmentations, infers a semantic recovery direction from their angular consistency, and applies a deterministic correction to the perturbed embedding. The central empirical claim is that this yields nearly 50% average robustness gains over prior state-of-the-art test-time defenses across benchmarks, with low overhead, while retaining strong performance in extreme settings and nontrivial robustness to adaptive attacks.
Significance. If the empirical claims and the underlying geometric assumption hold under rigorous testing, ATAC would constitute a meaningful advance in efficient, training-free defenses for vision-language models. The embedding-space correction paradigm and the reported gains over existing test-time methods could influence practical deployment of robust zero-shot CLIP systems.
major comments (2)
- [§4.3] §4.3 (Adaptive Attack Evaluation): The reported 'nontrivial' robustness to adaptive attacks relies on standard PGD-style or expectation-over-transformation attacks rather than an attacker that directly optimizes the input to either align the observed drift vectors in a harmful direction or to drive their angular consistency below the correction threshold. Because the correction rule is a fixed function of the drift-consistency statistic, this omission leaves the 50% average gain claim dependent on an untested assumption about the geometry of augmentation drifts under targeted adaptation.
- [§3.2] §3.2 (Drift Vector Correction Rule): The method assumes that the principal direction of augmentation-induced drifts is reliably a semantic recovery direction orthogonal to the adversarial perturbation. No ablation or geometric analysis is provided showing that this direction remains stable when the adversary knows the exact augmentation set and can craft perturbations that exploit the deterministic correction; this assumption is load-bearing for both the robustness numbers and the claim of a 'novel paradigm'.
minor comments (2)
- [Abstract and §4.1] The abstract and §4.1 would be strengthened by reporting exact benchmark names, baseline methods, absolute accuracy numbers, and statistical significance rather than the summary phrase 'nearly 50% on average'.
- [Figure 2] Figure 2 (drift vector visualization) would benefit from clearer annotation of the angular consistency threshold and the exact correction vector applied.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. The comments raise important points about the strength of our adaptive attack evaluation and the geometric assumptions underlying the correction rule. We address each concern below with clarifications based on our existing experiments and indicate where revisions will be made to strengthen the paper.
-
Referee: [§4.3] §4.3 (Adaptive Attack Evaluation): The reported 'nontrivial' robustness to adaptive attacks relies on standard PGD-style or expectation-over-transformation attacks rather than an attacker that directly optimizes the input to either align the observed drift vectors in a harmful direction or to drive their angular consistency below the correction threshold. Because the correction rule is a fixed function of the drift-consistency statistic, this omission leaves the 50% average gain claim dependent on an untested assumption about the geometry of augmentation drifts under targeted adaptation.
Authors: We appreciate the referee's observation on the nature of the adaptive attacks. Our evaluation includes both standard PGD attacks and Expectation-over-Transformation (EoT) attacks. The EoT formulation explicitly optimizes the loss while averaging over the same family of augmentations used by ATAC, which directly targets the drift vectors and their consistency. This provides a meaningful test of robustness under adaptation to the augmentation process. We acknowledge, however, that an even more specialized attack explicitly optimizing to minimize angular consistency or to invert the inferred recovery direction was not evaluated. We will revise §4.3 to clarify the relationship between EoT and the method's internal statistics and to discuss this as a limitation and avenue for future work.
revision: partial
-
Referee: [§3.2] §3.2 (Drift Vector Correction Rule): The method assumes that the principal direction of augmentation-induced drifts is reliably a semantic recovery direction orthogonal to the adversarial perturbation. No ablation or geometric analysis is provided showing that this direction remains stable when the adversary knows the exact augmentation set and can craft perturbations that exploit the deterministic correction; this assumption is load-bearing for both the robustness numbers and the claim of a 'novel paradigm'.
Authors: The correction rule in §3.2 is motivated by the empirical finding that, across clean images, the principal component of augmentation-induced drifts aligns with directions that improve semantic alignment, while adversarial perturbations produce inconsistent or opposing drifts. This is quantified through the angular consistency metric reported in the paper. While we did not include an explicit ablation in which the adversary is given oracle knowledge of the precise augmentation parameters to craft perturbations that deliberately exploit the deterministic mapping, the EoT attacks already incorporate knowledge of the augmentation distribution during optimization. We agree that a dedicated geometric analysis (e.g., drift vector visualizations or stability tests under augmentation-set-aware adversaries) would provide stronger support. We will add such analysis to the revised manuscript.
revision: yes
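The more specialized attack that both the referee and the rebuttal describe in words could be formalized roughly as follows. This is a hypothetical objective written for illustration only: the trade-off weight $\lambda$, the loss $\mathcal{L}$, and the exact dependence of the drifts on the perturbed input are assumptions, and nothing on this page indicates the paper evaluates it.

```latex
\max_{\|\delta\|_\infty \le \epsilon}\;
  \mathcal{L}\bigl(f^{*}(x+\delta),\, y\bigr)
  \;-\; \lambda\,\tau(x+\delta),
\qquad
\tau(x+\delta) \;=\; \frac{1}{n}\sum_{i=1}^{n}
  \cos\bigl(d_i(x+\delta),\, \bar{d}(x+\delta)\bigr)
```

Under the quoted rule $f^{*} = f_x + \alpha\,\bar{d}$ if $\tau > \tau^{*}$, pushing $\tau$ below $\tau^{*}$ would switch the correction off entirely, which is why the referee treats this case as separate from standard PGD or EoT.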
Circularity Check
No circularity: explicit algorithmic correction rule defined directly from augmentations
The paper defines ATAC as a direct, deterministic procedure operating in CLIP embedding space: compute augmentation-induced drift vectors, infer a semantic recovery direction from their angular consistency, and apply a correction. This is an explicit algorithmic rule with no fitted parameters tuned on target data, no self-referential definitions where the claimed recovery direction is constructed from the same quantity it predicts, and no load-bearing self-citations that reduce the central claim to unverified prior work by the authors. Robustness results are presented as empirical outcomes on benchmarks rather than derivations that collapse to the input definitions by construction. The method is self-contained against external evaluation.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Augmentations preserve enough semantic content that their induced drifts in embedding space can be aggregated via angular consistency to recover the correct direction.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel
  Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Paper passage: "calculating augmentation-induced drift vectors to infer a semantic recovery direction and correcting the embedding based on the angular consistency of these latent drifts"
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, theorem embed_strictMono_of_one_lt
  Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Paper passage: $\tau = \frac{1}{n}\sum_{i=1}^{n}\cos(d_i, \bar{d})$; $f^{*} = f_x + \alpha\,\bar{d}$ if $\tau > \tau^{*}$
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Jameel Abdul Samadh, Mohammad Hanan Gani, Noor Hussein, Muhammad Uzair Khattak, Muhammad Muzammal Naseer, Fahad Shahbaz Khan, and Salman H Khan. Align your prompts: Test-time prompting with distribution alignment for zero-shot generalization. Advances in Neural Information Processing Systems, 36:80396–80413, 2023.
- [2] Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion, and Matthias Hein. Square attack: a query-efficient black-box adversarial attack via random search. In European conference on computer vision, pages 484–501. Springer,
- [3] Anish Athalye, Logan Engstrom, Andrew Ilyas, and Kevin Kwok. Synthesizing robust adversarial examples. In International conference on machine learning, pages 284–293. PMLR, 2018.
- [4] Lukas Bossard, Matthieu Guillaumin, and Luc Van Gool. Food-101 - mining discriminative components with random forests. In European Conference on Computer Vision, 2014.
- [5] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57. IEEE, 2017.
- [6] Nicholas Carlini, Anish Athalye, Nicolas Papernot, Wieland Brendel, Jonas Rauber, Dimitris Tsipras, Ian Goodfellow, Aleksander Madry, and Alexey Kurakin. On evaluating adversarial robustness, 2019.
- [7] Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Sammy Mohamed, and Andrea Vedaldi. Describing textures in the wild. 2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 3606–3613, 2013.
- [8] Adam Coates, A. Ng, and Honglak Lee. An analysis of single-layer networks in unsupervised feature learning. In International Conference on Artificial Intelligence and Statistics, 2011.
- [9] Jeremy Cohen, Elan Rosenfeld, and Zico Kolter. Certified adversarial robustness via randomized smoothing. In International conference on machine learning, pages 1310–1320. PMLR, 2019.
- [10] Francesco Croce and Matthias Hein. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In International conference on machine learning, pages 2206–2216. PMLR, 2020.
- [11] Ashim Dahal, Saydul Akbar Murad, and Nick Rahimi. Embedding shift dissection on clip: Effects of augmentations on vlm’s representation learning. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 4814–4818,
- [12] Jia Deng, Jin Li, Zhenhua Zhao, and Shaowei Wang. Fpt-noise: Dynamic scene-aware counterattack for test-time adversarial defense in vision-language models, 2025.
- [13] Li Fei-Fei, Rob Fergus, and Pietro Perona. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28:594–611, 2006.
- [14] Chun-Mei Feng, Kai Yu, Yong Liu, Salman Khan, and Wangmeng Zuo. Diverse data augmentation with diffusions for effective test-time prompt tuning, 2023.
- [15] Peng Gao, Shijie Geng, Renrui Zhang, Teli Ma, Rongyao Fang, Yongfeng Zhang, Hongsheng Li, and Yu Qiao. Clip-adapter: Better vision-language models with feature adapters. International Journal of Computer Vision, 132(2):581–595,
- [16] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples, 2015.
- [17] Gregory Griffin, Alex Holub, Pietro Perona, et al. Caltech-256 object category dataset. Technical report, Technical Report 7694, California Institute of Technology Pasadena, 2007.
- [18] Chuan Guo, Mayank Rana, Moustapha Cisse, and Laurens van der Maaten. Countering adversarial images using input transformations, 2018.
- [19] Patrick Helber, Benjamin Bischke, Andreas R. Dengel, and Damian Borth. Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12:2217–2226, 2017.
- [20] Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, and Aleksander Madry. Adversarial examples are not bugs, they are features. Advances in neural information processing systems, 32, 2019.
- [21] Jonathan Krause, Michael Stark, Jia Deng, and Li Fei-Fei. 3d object representations for fine-grained categorization. 2013 IEEE International Conference on Computer Vision Workshops, pages 554–561, 2013.
- [22] Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.
- [23] Yann Le and Xuan Yang. Tiny imagenet visual recognition challenge. CS 231N, 7(7):3, 2015.
- [24] Lin Li, Haoyan Guan, Jianing Qiu, and Michael Spratling. One prompt word is enough to boost adversarial robustness for pre-trained vision-language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24408–24419, 2024.
- [25] Fangzhou Liao, Ming Liang, Yinpeng Dong, Tianyu Pang, Xiaolin Hu, and Jun Zhu. Defense against adversarial attacks using high-level representation guided denoiser. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1778–1787, 2018.
- [26] Blerta Lindqvist. Delving into the pixels of adversarial samples, 2021.
- [27] Jiaxiang Liu, Jiawei Du, Xiao Liu, Prayag Tiwari, and Mingkun Xu. Self-calibrated consistency can fight back for adversarial robustness in vision-language models. arXiv preprint arXiv:2510.22785, 2025.
- [28] Xingjun Ma, Yifeng Gao, Yixu Wang, Ruofan Wang, Xin Wang, Ye Sun, Yifan Ding, Hengyuan Xu, Yunhao Chen, Yunhao Zhao, et al. Safety at scale: A comprehensive survey of large model and agent safety. Foundations and Trends® in Privacy and Security, 8(3-4):254–469, 2025.
- [29] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
- [30] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks, 2019.
- [31] Subhransu Maji, Esa Rahtu, Juho Kannala, Matthew B. Blaschko, and Andrea Vedaldi. Fine-grained visual classification of aircraft. ArXiv, abs/1306.5151, 2013.
- [32] Chengzhi Mao, Scott Geng, Junfeng Yang, Xin Wang, and Carl Vondrick. Understanding zero-shot adversarial robustness for large-scale models. In The Eleventh International Conference on Learning Representations.
- [33] Chengzhi Mao, Mia Chiquier, Hao Wang, Junfeng Yang, and Carl Vondrick. Adversarial attacks are reversible with natural supervision. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 661–671, 2021.
- [34] Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, and Animashree Anandkumar. Diffusion models for adversarial purification. In International Conference on Machine Learning, pages 16805–16827. PMLR, 2022.
- [35] Maria-Elena Nilsback and Andrew Zisserman. Automated flower classification over a large number of classes. 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing, pages 722–729, 2008.
- [36] Omkar M. Parkhi, Andrea Vedaldi, Andrew Zisserman, and C. V. Jawahar. Cats and dogs. In IEEE Conference on Computer Vision and Pattern Recognition, 2012.
- [37] Juan C Pérez, Motasem Alfarra, Guillaume Jeanneret, Laura Rueda, Ali Thabet, Bernard Ghanem, and Pablo Arbeláez. Enhancing adversarial robustness via test-time transformation ensembling. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 81–91, 2021.
- [38] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, 2021.
- [39] Christian Schlarmann, Naman Deep Singh, Francesco Croce, and Matthias Hein. Robust clip: Unsupervised adversarial fine-tuning of vision embeddings for robust large vision-language models. ICML, 2024.
- [40] Lijun Sheng, Jian Liang, Zilei Wang, and Ran He. R-tpt: Improving adversarial robustness of vision-language models through test-time prompt tuning. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 29958–29967, 2025.
- [41] Manli Shu, Weili Nie, De-An Huang, Zhiding Yu, Tom Goldstein, Anima Anandkumar, and Chaowei Xiao. Test-time prompt tuning for zero-shot generalization in vision-language models. Advances in Neural Information Processing Systems, 35:14274–14289, 2022.
- [42] Baoshun Tong, Kaiyu Song, and Hanjiang Lai. Test-time alignment-enhanced adapter for vision-language models. In ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5. IEEE, 2025.
- [43] Florian Tramer, Nicholas Carlini, Wieland Brendel, and Aleksander Madry. On adaptive attacks to adversarial example defenses. Advances in neural information processing systems, 33:1633–1645, 2020.
- [44] Sibo Wang, Jie Zhang, Zheng Yuan, and Shiguang Shan. Pre-trained model guided fine-tuning for zero-shot adversarial robustness. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 24502–24511,
- [45] Xin Wang, Kai Chen, Jiaming Zhang, Jingjing Chen, and Xingjun Ma. Tapt: Test-time adversarial prompt tuning for robust inference in vision-language models. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 19910–19920, 2025.
- [46] Songlong Xing, Zhengyu Zhao, and Nicu Sebe. Clip is strong enough to fight back: Test-time counterattacks towards zero-shot adversarial robustness of clip. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 15172–15182, 2025.
- [47] Xiaohui Zeng, Chenxi Liu, Yu-Siang Wang, Weichao Qiu, Lingxi Xie, Yu-Wing Tai, Chi-Keung Tang, and Alan L Yuille. Adversarial attacks beyond the image space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4302–4311, 2019.
- [48] Mingkun Zhang, Keping Bi, Wei Chen, Jiafeng Guo, and Xueqi Cheng. Clipure: Purification in latent space via clip for adversarially robust zero-shot classification. In The Thirteenth International Conference on Learning Representations.
- [49] Kaiyang Zhou, Jingkang Yang, Chen Change Loy, and Ziwei Liu. Learning to prompt for vision-language models. International Journal of Computer Vision, 130:2337–2348, 2021.
- [50] Wanqi Zhou, Shuanghao Bai, Qibin Zhao, and Badong Chen. Revisiting the adversarial robustness of vision language models: a multimodal perspective. CoRR, 2024.
Supplementary Material (ATAC: Augmentation-Based Test-Time Adversarial Correction for CLIP)
- Results Against Other Attacks. To further validate the generality and reliability of ATAC, we extend our evaluation beyond the standard PGD setting (ϵ = 4/255) to two widely recognized and complementary benchmarks: AutoAttack [10] and the Carlini–Wagner (CW) attack [5]. We use the “plus” version of AutoAttack that integrates six attacks, including both t...
- On the Distribution of Consistency-Scores. In Sec. 4.2 we argue that the augmentation-induced latent drift vectors are scattered for clean samples and consistent for adversarial inputs. To verify our claim, we analyze the distribution of τ-scores for clean and adversarial inputs, and report the separability of the two distributions. The last column of Fig....
- Further Ablations. In Sec. 5.3, we find that the effect of α is minimal while τ∗ is crucial. In this section, we investigate the effect of different augmentation choices. To understand which aspects of augmentations contribute to performance, we construct five ablation settings. • default: the original setting used in our main experiments. • asymmetric: w...
- Adaptive Attack Algorithms. Here, we give the full pseudocodes for our attacks. The adaptive attack against our method is given in Algorithm 1, and the adaptive attack against TTC is given in Algorithm 2. In both pseudocodes, we use pred(·,·) as a shorthand for the calculation of class-wise logits (see Eq. (1)). σ denotes the sigmoid function. As in the ma...
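The separability of clean and adversarial τ-scores mentioned in the consistency-score excerpt above can be summarized with an AUROC. The sketch below is one plausible way to compute it and is not taken from the paper: the function name is hypothetical, and treating adversarial inputs as the positive class follows the excerpt's statement that their drifts are the consistent ones.

```python
import numpy as np


def tau_auroc(tau_clean, tau_adv):
    """AUROC for separating adversarial from clean inputs by tau.

    Equals the probability that a random adversarial sample has a higher
    tau than a random clean sample, with ties counted as one half.
    """
    clean = np.asarray(tau_clean, dtype=float)
    adv = np.asarray(tau_adv, dtype=float)
    # Pairwise differences: rows index adversarial, columns index clean
    diff = adv[:, None] - clean[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))
```

A value near 1.0 would mean a single threshold τ∗ separates the two populations well; a value near 0.5 would mean the threshold cannot distinguish them.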