pith. machine review for the scientific record.

arxiv: 2511.17362 · v3 · submitted 2025-11-21 · 💻 cs.CV

Recognition: 2 theorem links

· Lean Theorem

ATAC: Augmentation-Based Test-Time Adversarial Correction for CLIP

Pith reviewed 2026-05-17 20:20 UTC · model grok-4.3

classification 💻 cs.CV
keywords adversarial defense · test-time correction · CLIP embeddings · augmentation drifts · robustness · zero-shot vision-language · angular consistency

The pith

ATAC corrects adversarial CLIP embeddings at test time by aligning augmentation-induced drift vectors to recover the original semantic direction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ATAC as a test-time defense for CLIP models against adversarial perturbations on images. It calculates drift vectors from image augmentations in the embedding space and uses their angular consistency to infer and apply a correction toward the true semantic meaning. This sidesteps the high cost of adversarial fine-tuning and targets the limited robustness of previous test-time methods. A reader would care because CLIP enables zero-shot image-text tasks but is easily fooled by small image changes, and an efficient fix could make such models safer for real use.

Core claim

Our method operates directly in the embedding space of CLIP, calculating augmentation-induced drift vectors to infer a semantic recovery direction and correcting the embedding based on the angular consistency of these latent drifts. Across a wide range of benchmarks, ATAC consistently achieves remarkably high robustness, surpassing that of previous state-of-the-art methods by nearly 50% on average, all while requiring minimal computational overhead.

What carries the argument

Augmentation-induced drift vectors in the CLIP embedding space, whose angular consistency determines the semantic recovery direction for correction.
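
The mechanism can be illustrated with a small NumPy sketch. It is a reading of the description above, not the authors' implementation: the function name `correct_embedding`, the use of the mean drift as the recovery direction, the mean-cosine consistency score `tau`, and the default values of `tau_star` and `alpha` are all assumptions made for illustration. Correcting only when the drifts agree follows the supplementary excerpt, which states that drifts are scattered for clean samples and consistent for adversarial inputs.

```python
import numpy as np


def l2_normalize(v, eps=1e-12):
    """Scale vectors to unit length along the last axis."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)


def correct_embedding(z, z_aug, tau_star=0.85, alpha=1.0):
    """Hypothetical drift-consistency correction.

    z      : (d,)   embedding of the (possibly adversarial) input image
    z_aug  : (k, d) embeddings of k augmented views of the same image
    Returns (embedding, tau), where tau is the consistency score.
    """
    z = l2_normalize(np.asarray(z, dtype=float))
    z_aug = l2_normalize(np.asarray(z_aug, dtype=float))

    # Drift vectors: how each augmentation moves the embedding.
    drifts = z_aug - z
    mean_dir = l2_normalize(drifts.mean(axis=0))

    # Assumed consistency score: mean cosine between each drift
    # and the mean drift direction (1.0 means perfectly aligned).
    tau = float((l2_normalize(drifts) @ mean_dir).mean())

    # Scattered drifts: treat the input as clean and leave it alone.
    if tau < tau_star:
        return z, tau

    # Consistent drifts: step along the shared direction, renormalize.
    return l2_normalize(z + alpha * mean_dir), tau
```

In this sketch `tau_star` plays the role of the cosine-consistency threshold and `alpha` the step size; the actual statistic and update rule in the paper may differ.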

If this is right

  • ATAC surpasses previous state-of-the-art test-time defenses by nearly 50 percent on average across benchmarks.
  • The method requires only minimal computational overhead at test time.
  • ATAC maintains high robustness even in unconventional and extreme settings.
  • It provides nontrivial robustness against adaptive attacks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same drift-consistency idea might apply to other multimodal models that share similar embedding spaces.
  • Combining this correction with natural distribution shifts could test whether angular consistency helps beyond adversarial cases.
  • The approach could scale to video or other modalities if augmentations produce comparable latent drifts.

Load-bearing premise

Augmentation-induced drift vectors reliably point toward a semantic recovery direction whose angular consistency can be used to correct the embedding without introducing new errors or requiring task-specific tuning.

What would settle it

Measuring whether the angular consistency of drift vectors from augmentations correlates with higher accuracy on standard adversarial image benchmarks after correction compared to the original perturbed embeddings.
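
One way to begin such a check, in the spirit of the ROC curves of τ-scores in Figure 3, is to ask how well the consistency score separates adversarial from clean inputs. The sketch below computes a rank-based ROC AUC with NumPy. The function name `roc_auc` and the choice of adversarial inputs as the positive class are assumptions for illustration, and the scores would have to come from an actual run of the method.

```python
import numpy as np


def roc_auc(scores_pos, scores_neg):
    """Rank-based ROC AUC.

    Returns the probability that a randomly chosen positive sample
    (here: adversarial input) receives a higher score than a randomly
    chosen negative sample (here: clean input). Ties count as one half.
    """
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    wins = (pos > neg).astype(float) + 0.5 * (pos == neg)
    return float(wins.mean())
```

An AUC near 1.0 would indicate that consistency scores cleanly separate the two populations, while a value near 0.5 would indicate that the score carries little information about whether an input was attacked.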

Figures

Figures reproduced from arXiv: 2511.17362 by András Balogh, Linxiang Su.

Figure 1: Overview of the ATAC framework. We use the visual [...]
Figure 2: Ablations on the cosine-consistency threshold [...]
Figure 3: ROC curves of τ-scores of different augmentation settings on different datasets.
Original abstract

Despite its remarkable success in zero-shot image-text matching, CLIP remains highly vulnerable to adversarial perturbations on images. As adversarial fine-tuning is prohibitively costly, recent works explore various test-time defense strategies; however, these approaches still exhibit limited robustness. In this work, we revisit this problem and propose a simple yet effective strategy: Augmentation-based Test-time Adversarial Correction (ATAC). Our method operates directly in the embedding space of CLIP, calculating augmentation-induced drift vectors to infer a semantic recovery direction and correcting the embedding based on the angular consistency of these latent drifts. Across a wide range of benchmarks, ATAC consistently achieves remarkably high robustness, surpassing that of previous state-of-the-art methods by nearly 50% on average, all while requiring minimal computational overhead. Furthermore, ATAC retains state-of-the-art robustness in unconventional and extreme settings and even achieves nontrivial robustness against adaptive attacks. Our results demonstrate that ATAC is an efficient method in a novel paradigm for test-time adversarial defenses in the embedding space of CLIP. Code is available at: https://github.com/kylin0421/ATAC

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces ATAC, a test-time adversarial correction method for CLIP that operates in embedding space. It computes drift vectors induced by a set of augmentations, infers a semantic recovery direction from their angular consistency, and applies a deterministic correction to the perturbed embedding. The central empirical claim is that this yields nearly 50% average robustness gains over prior state-of-the-art test-time defenses across benchmarks, with low overhead, while retaining strong performance in extreme settings and nontrivial robustness to adaptive attacks.

Significance. If the empirical claims and the underlying geometric assumption hold under rigorous testing, ATAC would constitute a meaningful advance in efficient, training-free defenses for vision-language models. The embedding-space correction paradigm and the reported gains over existing test-time methods could influence practical deployment of robust zero-shot CLIP systems.

major comments (2)
  1. [§4.3] §4.3 (Adaptive Attack Evaluation): The reported 'nontrivial' robustness to adaptive attacks relies on standard PGD-style or expectation-over-transformation attacks rather than an attacker that directly optimizes the input to either align the observed drift vectors in a harmful direction or to drive their angular consistency below the correction threshold. Because the correction rule is a fixed function of the drift-consistency statistic, this omission leaves the 50% average gain claim dependent on an untested assumption about the geometry of augmentation drifts under targeted adaptation.
  2. [§3.2] §3.2 (Drift Vector Correction Rule): The method assumes that the principal direction of augmentation-induced drifts is reliably a semantic recovery direction orthogonal to the adversarial perturbation. No ablation or geometric analysis is provided showing that this direction remains stable when the adversary knows the exact augmentation set and can craft perturbations that exploit the deterministic correction; this assumption is load-bearing for both the robustness numbers and the claim of a 'novel paradigm'.
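
     For concreteness, one possible form of the attacker described in comment 1 is written out below. It is an illustration only, not a construction taken from the manuscript: the classifier f, the cross-entropy loss, the consistency statistic τ evaluated on the perturbed input, the trade-off weight λ ≥ 0, and the ℓ∞ budget ε are all assumed notation.

```latex
\max_{\|\delta\|_\infty \le \epsilon}\;
  \mathcal{L}_{\mathrm{CE}}\bigl(f(x+\delta),\, y\bigr)
  \;-\; \lambda\, \tau(x+\delta)
```

     The second term penalizes consistent drifts, so a successful perturbation would cause misclassification while keeping the consistency statistic below the correction threshold.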
minor comments (2)
  1. [Abstract and §4.1] The abstract and §4.1 would be strengthened by reporting exact benchmark names, baseline methods, absolute accuracy numbers, and statistical significance rather than the summary phrase 'nearly 50% on average'.
  2. [Figure 2] Figure 2 (drift vector visualization) would benefit from clearer annotation of the angular consistency threshold and the exact correction vector applied.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. The comments raise important points about the strength of our adaptive attack evaluation and the geometric assumptions underlying the correction rule. We address each concern below with clarifications based on our existing experiments and indicate where revisions will be made to strengthen the paper.

Point-by-point responses
  1. Referee: [§4.3] §4.3 (Adaptive Attack Evaluation): The reported 'nontrivial' robustness to adaptive attacks relies on standard PGD-style or expectation-over-transformation attacks rather than an attacker that directly optimizes the input to either align the observed drift vectors in a harmful direction or to drive their angular consistency below the correction threshold. Because the correction rule is a fixed function of the drift-consistency statistic, this omission leaves the 50% average gain claim dependent on an untested assumption about the geometry of augmentation drifts under targeted adaptation.

    Authors: We appreciate the referee's observation on the nature of the adaptive attacks. Our evaluation includes both standard PGD attacks and Expectation-over-Transformation (EoT) attacks. The EoT formulation explicitly optimizes the loss while averaging over the same family of augmentations used by ATAC, which directly targets the drift vectors and their consistency. This provides a meaningful test of robustness under adaptation to the augmentation process. We acknowledge, however, that an even more specialized attack explicitly optimizing to minimize angular consistency or to invert the inferred recovery direction was not evaluated. We will revise §4.3 to clarify the relationship between EoT and the method's internal statistics and to discuss this as a limitation and avenue for future work. revision: partial

  2. Referee: [§3.2] §3.2 (Drift Vector Correction Rule): The method assumes that the principal direction of augmentation-induced drifts is reliably a semantic recovery direction orthogonal to the adversarial perturbation. No ablation or geometric analysis is provided showing that this direction remains stable when the adversary knows the exact augmentation set and can craft perturbations that exploit the deterministic correction; this assumption is load-bearing for both the robustness numbers and the claim of a 'novel paradigm'.

    Authors: The correction rule in §3.2 is motivated by the empirical finding that, across clean images, the principal component of augmentation-induced drifts aligns with directions that improve semantic alignment, while adversarial perturbations produce inconsistent or opposing drifts. This is quantified through the angular consistency metric reported in the paper. While we did not include an explicit ablation in which the adversary is given oracle knowledge of the precise augmentation parameters to craft perturbations that deliberately exploit the deterministic mapping, the EoT attacks already incorporate knowledge of the augmentation distribution during optimization. We agree that a dedicated geometric analysis (e.g., drift vector visualizations or stability tests under augmentation-set-aware adversaries) would provide stronger support. We will add such analysis to the revised manuscript. revision: yes

Circularity Check

0 steps flagged

No circularity: explicit algorithmic correction rule defined directly from augmentations

full rationale

The paper defines ATAC as a direct, deterministic procedure operating in CLIP embedding space: compute augmentation-induced drift vectors, infer a semantic recovery direction from their angular consistency, and apply a correction. This is an explicit algorithmic rule with no fitted parameters tuned on target data, no self-referential definitions where the claimed recovery direction is constructed from the same quantity it predicts, and no load-bearing self-citations that reduce the central claim to unverified prior work by the authors. Robustness results are presented as empirical outcomes on benchmarks rather than derivations that collapse to the input definitions by construction. The method is self-contained against external evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The method rests on the domain assumption that standard augmentations produce drifts whose angular statistics indicate a reliable semantic correction direction; no free parameters or invented entities are mentioned in the abstract.

axioms (1)
  • domain assumption: Augmentations preserve enough semantic content that their induced drifts in embedding space can be aggregated via angular consistency to recover the correct direction.
    This premise is required for the correction step to be meaningful and is implicit in the method description.

pith-pipeline@v0.9.0 · 5493 in / 1232 out tokens · 28952 ms · 2026-05-17T20:20:08.755578+00:00 · methodology



Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages · 1 internal anchor

  1. [1]

    Jameel Abdul Samadh, Mohammad Hanan Gani, Noor Hussein, Muhammad Uzair Khattak, Muhammad Muzammal Naseer, Fahad Shahbaz Khan, and Salman H Khan. Align your prompts: Test-time prompting with distribution alignment for zero-shot generalization. Advances in Neural Information Processing Systems, 36:80396–80413, 2023.

  2. [2]

    Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion, and Matthias Hein. Square attack: a query-efficient black-box adversarial attack via random search. In European conference on computer vision, pages 484–501. Springer,

  3. [3]

    Anish Athalye, Logan Engstrom, Andrew Ilyas, and Kevin Kwok. Synthesizing robust adversarial examples. In International conference on machine learning, pages 284–293. PMLR, 2018.

  4. [4]

    Lukas Bossard, Matthieu Guillaumin, and Luc Van Gool. Food-101 - mining discriminative components with random forests. In European Conference on Computer Vision, 2014.

  5. [5]

    Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57. IEEE, 2017.

  6. [6]

    Nicholas Carlini, Anish Athalye, Nicolas Papernot, Wieland Brendel, Jonas Rauber, Dimitris Tsipras, Ian Goodfellow, Aleksander Madry, and Alexey Kurakin. On evaluating adversarial robustness, 2019.

  7. [7]

    Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Sammy Mohamed, and Andrea Vedaldi. Describing textures in the wild. 2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 3606–3613, 2013.

  8. [8]

    Adam Coates, A. Ng, and Honglak Lee. An analysis of single-layer networks in unsupervised feature learning. In International Conference on Artificial Intelligence and Statistics, 2011.

  9. [9]

    Jeremy Cohen, Elan Rosenfeld, and Zico Kolter. Certified adversarial robustness via randomized smoothing. In International conference on machine learning, pages 1310–1320. PMLR, 2019.

  10. [10]

    Francesco Croce and Matthias Hein. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In International conference on machine learning, pages 2206–2216. PMLR, 2020.

  11. [11]

    Ashim Dahal, Saydul Akbar Murad, and Nick Rahimi. Embedding shift dissection on clip: Effects of augmentations on vlm’s representation learning. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 4814–4818,

  12. [12]

    Jia Deng, Jin Li, Zhenhua Zhao, and Shaowei Wang. Fpt-noise: Dynamic scene-aware counterattack for test-time adversarial defense in vision-language models, 2025.

  13. [13]

    Li Fei-Fei, Rob Fergus, and Pietro Perona. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28:594–611, 2006.

  14. [14]

    Chun-Mei Feng, Kai Yu, Yong Liu, Salman Khan, and Wangmeng Zuo. Diverse data augmentation with diffusions for effective test-time prompt tuning, 2023.

  15. [15]

    Peng Gao, Shijie Geng, Renrui Zhang, Teli Ma, Rongyao Fang, Yongfeng Zhang, Hongsheng Li, and Yu Qiao. Clip-adapter: Better vision-language models with feature adapters. International Journal of Computer Vision, 132(2):581–595,

  16. [16]

    Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples, 2015.

  17. [17]

    Gregory Griffin, Alex Holub, Pietro Perona, et al. Caltech-256 object category dataset. Technical report, Technical Report 7694, California Institute of Technology Pasadena, 2007.

  18. [18]

    Chuan Guo, Mayank Rana, Moustapha Cisse, and Laurens van der Maaten. Countering adversarial images using input transformations, 2018.

  19. [19]

    Patrick Helber, Benjamin Bischke, Andreas R. Dengel, and Damian Borth. Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12:2217–2226, 2017.

  20. [20]

    Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, and Aleksander Madry. Adversarial examples are not bugs, they are features. Advances in neural information processing systems, 32, 2019.

  21. [21]

    Jonathan Krause, Michael Stark, Jia Deng, and Li Fei-Fei. 3d object representations for fine-grained categorization. 2013 IEEE International Conference on Computer Vision Workshops, pages 554–561, 2013.

  22. [22]

    Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.

  23. [23]

    Yann Le and Xuan Yang. Tiny imagenet visual recognition challenge. CS 231N, 7(7):3, 2015.

  24. [24]

    Lin Li, Haoyan Guan, Jianing Qiu, and Michael Spratling. One prompt word is enough to boost adversarial robustness for pre-trained vision-language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24408–24419, 2024.

  25. [25]

    Fangzhou Liao, Ming Liang, Yinpeng Dong, Tianyu Pang, Xiaolin Hu, and Jun Zhu. Defense against adversarial attacks using high-level representation guided denoiser. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1778–1787, 2018.

  26. [26]

    Blerta Lindqvist. Delving into the pixels of adversarial samples, 2021.

  27. [27]

    Jiaxiang Liu, Jiawei Du, Xiao Liu, Prayag Tiwari, and Mingkun Xu. Self-calibrated consistency can fight back for adversarial robustness in vision-language models. arXiv preprint arXiv:2510.22785, 2025.

  28. [28]

    Xingjun Ma, Yifeng Gao, Yixu Wang, Ruofan Wang, Xin Wang, Ye Sun, Yifan Ding, Hengyuan Xu, Yunhao Chen, Yunhao Zhao, et al. Safety at scale: A comprehensive survey of large model and agent safety. Foundations and Trends® in Privacy and Security, 8(3-4):254–469, 2025.

  29. [29]

    Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.

  30. [30]

    Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks, 2019.

  31. [31]

    Subhransu Maji, Esa Rahtu, Juho Kannala, Matthew B. Blaschko, and Andrea Vedaldi. Fine-grained visual classification of aircraft. ArXiv, abs/1306.5151, 2013.

  32. [32]

    Chengzhi Mao, Scott Geng, Junfeng Yang, Xin Wang, and Carl Vondrick. Understanding zero-shot adversarial robustness for large-scale models. In The Eleventh International Conference on Learning Representations.

  33. [33]

    Chengzhi Mao, Mia Chiquier, Hao Wang, Junfeng Yang, and Carl Vondrick. Adversarial attacks are reversible with natural supervision. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 661–671, 2021.

  34. [34]

    Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, and Animashree Anandkumar. Diffusion models for adversarial purification. In International Conference on Machine Learning, pages 16805–16827. PMLR, 2022.

  35. [35]

    Maria-Elena Nilsback and Andrew Zisserman. Automated flower classification over a large number of classes. 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing, pages 722–729, 2008.

  36. [36]

    Omkar M. Parkhi, Andrea Vedaldi, Andrew Zisserman, and C. V. Jawahar. Cats and dogs. In IEEE Conference on Computer Vision and Pattern Recognition, 2012.

  37. [37]

    Juan C Pérez, Motasem Alfarra, Guillaume Jeanneret, Laura Rueda, Ali Thabet, Bernard Ghanem, and Pablo Arbeláez. Enhancing adversarial robustness via test-time transformation ensembling. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 81–91, 2021.

  38. [38]

    Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, 2021.

  39. [39]

    Christian Schlarmann, Naman Deep Singh, Francesco Croce, and Matthias Hein. Robust clip: Unsupervised adversarial fine-tuning of vision embeddings for robust large vision-language models. ICML, 2024.

  40. [40]

    Lijun Sheng, Jian Liang, Zilei Wang, and Ran He. R-tpt: Improving adversarial robustness of vision-language models through test-time prompt tuning. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 29958–29967, 2025.

  41. [41]

    Manli Shu, Weili Nie, De-An Huang, Zhiding Yu, Tom Goldstein, Anima Anandkumar, and Chaowei Xiao. Test-time prompt tuning for zero-shot generalization in vision-language models. Advances in Neural Information Processing Systems, 35:14274–14289, 2022.

  42. [42]

    Baoshun Tong, Kaiyu Song, and Hanjiang Lai. Test-time alignment-enhanced adapter for vision-language models. In ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5. IEEE, 2025.

  43. [43]

    Florian Tramer, Nicholas Carlini, Wieland Brendel, and Aleksander Madry. On adaptive attacks to adversarial example defenses. Advances in neural information processing systems, 33:1633–1645, 2020.

  44. [44]

    Sibo Wang, Jie Zhang, Zheng Yuan, and Shiguang Shan. Pre-trained model guided fine-tuning for zero-shot adversarial robustness. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 24502–24511,

  45. [45]

    Xin Wang, Kai Chen, Jiaming Zhang, Jingjing Chen, and Xingjun Ma. Tapt: Test-time adversarial prompt tuning for robust inference in vision-language models. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 19910–19920, 2025.

  46. [46]

    Songlong Xing, Zhengyu Zhao, and Nicu Sebe. Clip is strong enough to fight back: Test-time counterattacks towards zero-shot adversarial robustness of clip. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 15172–15182, 2025.

  47. [47]

    Xiaohui Zeng, Chenxi Liu, Yu-Siang Wang, Weichao Qiu, Lingxi Xie, Yu-Wing Tai, Chi-Keung Tang, and Alan L Yuille. Adversarial attacks beyond the image space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4302–4311, 2019.

  48. [48]

    Mingkun Zhang, Keping Bi, Wei Chen, Jiafeng Guo, and Xueqi Cheng. Clipure: Purification in latent space via clip for adversarially robust zero-shot classification. In The Thirteenth International Conference on Learning Representations.

  49. [49]

    Kaiyang Zhou, Jingkang Yang, Chen Change Loy, and Ziwei Liu. Learning to prompt for vision-language models. International Journal of Computer Vision, 130:2337–2348, 2021.

  50. [50]

    Wanqi Zhou, Shuanghao Bai, Qibin Zhao, and Badong Chen. Revisiting the adversarial robustness of vision language models: a multimodal perspective. CoRR, 2024.

    ATAC: Augmentation-Based Test-Time Adversarial Correction for CLIP. Supplementary Material

  51. [51]

    Results Against Other Attacks. To further validate the generality and reliability of ATAC, we extend our evaluation beyond the standard PGD setting (ϵ = 4/255) to two widely recognized and complementary benchmarks: AutoAttack [10] and the Carlini–Wagner (CW) attack [5]. We use the “plus” version of AutoAttack that integrates six attacks, including both t...

  52. [52]

    On the Distribution of Consistency-Scores. In Sec. 4.2 we argue that the augmentation-induced latent drift vectors are scattered for clean samples and consistent for adversarial inputs. To verify our claim, we analyze the distribution of τ-scores for clean and adversarial inputs, and report the separability of the two distributions. The last column of Fig....

  53. [53]

    Further Ablations. In Sec. 5.3, we find that the effect of α is minimal while τ* is crucial. In this section, we investigate the effect of different augmentation choices. To understand which aspects of augmentations contribute to performance, we construct five ablation settings. • default: the original setting used in our main experiments. • asymmetric: w...

  54. [54]

    Adaptive Attack Algorithms. Here, we give the full pseudocodes for our attacks. The adaptive attack against our method is given in Algorithm 1, and the adaptive attack against TTC is given in Algorithm 2. In both pseudocodes, we use pred(·,·) as a shorthand for the calculation of class-wise logits (see Eq. (1)). σ denotes the sigmoid function. As in the ma...