RaPA: Enhancing Transferable Targeted Attacks via Random Parameter Pruning

Qingbin Li; Shengyu Zhu; Tongrui Su; Wei Chen; Xueqi Cheng

arxiv: 2504.18594 · v3 · submitted 2025-04-24 · 💻 cs.LG · cs.AI

Pith reviewed 2026-05-22 18:36 UTC · model grok-4.3

keywords: adversarial attacks · transferability · targeted attacks · parameter pruning · surrogate models · black-box attacks · CNN · Transformer

The pith

Randomly pruning surrogate model parameters at each attack step boosts targeted transfer success rates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that targeted adversarial examples from current methods depend too much on a narrow slice of parameters inside the surrogate model, which hurts how well they work on new target models. RaPA fixes this by randomly dropping parameters during each step of the optimization, creating several different but still useful versions of the surrogate. The authors argue this step is mathematically the same as adding a regularizer that balances parameter importance, so the attack no longer overfits to any small group of weights. Tests on both CNN and transformer models confirm the change raises attack success rates, most noticeably when the surrogate is a CNN and the target is a transformer. Readers care because better transfer makes black-box targeted attacks more reliable when the defender’s exact model is unknown.

Core claim

Adversarial examples generated by existing methods rely heavily on a small subset of surrogate model parameters, which limits their transferability to unseen target models. RaPA introduces parameter-level randomization during the attack process: at each optimization step it randomly prunes model parameters to generate diverse yet semantically consistent surrogate variants. This randomization is equivalent to adding an importance-equalization regularizer, thereby alleviating the over-reliance issue and raising transfer attack success rates.

What carries the argument

Random Parameter Pruning (RaPA): at each optimization step, randomly prunes a subset of surrogate model parameters to produce varied but semantically equivalent attack trajectories; the pruning acts as an importance-equalization regularizer.
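
To make the per-step pruning concrete, here is a minimal sketch on a toy linear softmax surrogate. It is not the authors' implementation: the function names (`targeted_grad`, `rapa_style_attack`), the sign-gradient update, the L-infinity budget `eps`, and the element-wise Bernoulli mask with probability `prune_prob` are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                      # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum()

def targeted_grad(W, b, x, target):
    """Gradient of cross-entropy(target) w.r.t. x for logits = W @ x + b."""
    probs = softmax(W @ x + b)
    onehot = np.zeros_like(probs)
    onehot[target] = 1.0
    return W.T @ (probs - onehot)

def rapa_style_attack(W, b, x, target, eps=0.5, step=0.05,
                      n_steps=50, prune_prob=0.1, seed=0):
    """Hypothetical RaPA-style loop: a fresh random mask at every step."""
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    for _ in range(n_steps):
        # Each weight is pruned independently with probability prune_prob.
        keep = rng.random(W.shape) >= prune_prob
        W_pruned = W * keep              # W itself is left untouched
        g = targeted_grad(W_pruned, b, x_adv, target)
        x_adv = x_adv - step * np.sign(g)          # descend the targeted loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # stay inside the eps-ball
    return x_adv
```

Resampling the mask inside the loop is the design point: every step sees a different surrogate variant, so no fixed subset of weights can dominate the whole trajectory.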

Load-bearing premise

Existing attack methods over-rely on a small subset of surrogate parameters, and randomly pruning parameters at each step yields diverse yet still effective surrogate variants without lowering attack quality.

What would settle it

A controlled experiment that applies RaPA-style random pruning but measures no gain (or a drop) in average transfer attack success rate compared with the same baseline method run without pruning.
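
The comparison described above can be sketched as follows. The helper names and the tiny label arrays are hypothetical and only illustrate the bookkeeping: targeted ASR is taken to be the percentage of adversarial examples that a target model classifies as the attacker's chosen label, and the gain is averaged over target models.

```python
import numpy as np

def targeted_asr(predicted_labels, target_labels):
    """Percentage of examples classified as the attacker's target label."""
    predicted = np.asarray(predicted_labels)
    target = np.asarray(target_labels)
    return float(np.mean(predicted == target)) * 100.0

def average_asr_gain(preds_baseline, preds_pruned, target_labels):
    """Per-model and mean ASR difference (pruned minus baseline).

    preds_baseline / preds_pruned: dict mapping a target-model name to the
    labels it predicts on the adversarial examples from each attack variant.
    """
    gains = {
        name: targeted_asr(preds_pruned[name], target_labels)
        - targeted_asr(preds_baseline[name], target_labels)
        for name in preds_baseline
    }
    return gains, float(np.mean(list(gains.values())))

# Made-up predictions for two hypothetical target models.
targets = [1, 2, 3, 4]
baseline = {"model_a": [1, 0, 0, 0], "model_b": [0, 0, 3, 4]}
pruned = {"model_a": [1, 2, 0, 0], "model_b": [0, 0, 3, 4]}
gains, mean_gain = average_asr_gain(baseline, pruned, targets)
print(gains, mean_gain)  # {'model_a': 25.0, 'model_b': 0.0} 12.5
```

A mean gain at or below zero under otherwise identical settings would be the outcome that counts against the paper's claim.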

Figures

Figures reproduced from arXiv: 2504.18594 by Qingbin Li, Shengyu Zhu, Tongrui Su, Wei Chen, Xueqi Cheng.

**Figure 2.** Average ASRs along optimization iterations. Here …

**Figure 3.** Average ASRs with different iterations ( …

**Figure 4.** Attention heatmap of adversarial example. The true label is ‘black swan’ and the target label is ‘weasel’. The intensity of red …

**Figure 5.** Attention heatmap of adversarial example. The true label is ‘cinema’ and the target label is ‘croquet ball’. The intensity of red …

**Figure 6.** ASRs (%) averaged over 16 models with varying DropConnect probabilities, using ResNet-50 as surrogate.

**Figure 7.** Diversity and Utility with increasing mask ratio.

Original abstract

Compared to untargeted attacks, targeted transfer-based attack is still suffering from much lower Attack Success Rates (ASRs), although significant improvements have been achieved by kinds of methods, such as diversifying input, stabilizing the gradient, and re-training surrogate models. In this paper, we find that adversarial examples generated by existing methods rely heavily on a small subset of surrogate model parameters, which in turn limits their transferability to unseen target models. Inspired by this, we propose the Random Parameter Pruning Attack (RaPA), which introduces parameter-level randomization during the attack process. At each optimization step, RaPA randomly prunes model parameters to generate diverse yet semantically consistent surrogate variants. We show this parameter-level randomization is equivalent to adding an importance-equalization regularizer, thereby alleviating the over-reliance issue. Extensive experiments across both CNN and Transformer architectures demonstrate that RaPA substantially enhances transferability. In the challenging case of transferring from CNN-based to Transformer-based models, RaPA achieves up to 11.7% higher average ASRs than state-of-the-art baselines(with 33.3% ASRs), while being training-free, cross-architecture efficient, and easily integrated into existing attack frameworks. Code is available in https://github.com/molarsu/RaPA.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

RaPA shows clear transfer gains from random parameter pruning in targeted attacks, but the regularization equivalence needs a tighter derivation to pin down the mechanism.

The main takeaway is that adding random parameter pruning inside the surrogate during targeted attack optimization lifts transfer success rates, with the biggest reported edge in CNN-to-Transformer cases: up to 11.7% higher average ASR than recent baselines. The method stays training-free and slots into existing frameworks, which keeps the barrier low. Experiments run across several architectures and report consistent improvements, and the authors release code, so the numbers are at least checkable in principle. That combination of a simple randomization trick and measurable cross-architecture gains is the concrete advance here. The paper also tries to explain the effect by linking the pruning to an importance-equalization regularizer, which is a reasonable direction if it holds up.

The empirical side looks solid enough on the surface: held-out target models, multiple source-target pairs, and direct comparison to prior diversification and stabilization methods. If the full experiments control for pruning ratio and step count without hidden tuning, the lift is worth noting for anyone stress-testing transfer robustness.

The softer part is the claimed equivalence. The abstract states that random pruning produces the same effect as an additive regularizer that equalizes parameter importance, but without seeing the exact expectation over binary masks or how it modifies the attack loss, it is difficult to rule out that the gains simply come from extra gradient noise or implicit ensembling. If that link is only approximate or breaks at higher pruning rates in Transformer layers, the paper might over-attribute its gain over the 33.3% baseline ASR to the regularizer. A minor additional concern is whether the pruning probability was chosen after seeing results on the test transfers; that would weaken the story.

Overall this is a narrow but practical increment inside adversarial robustness work. Readers who run attack evaluations or build defenses against transferable targeted examples will get the most out of it. The empirical pattern is clear enough and the idea is easy to reproduce, so it deserves a serious referee rather than a desk reject. The derivation gap can be fixed in revision.

Referee Report

2 major / 2 minor

Summary. The paper claims that targeted transfer-based adversarial attacks over-rely on a small subset of surrogate-model parameters, limiting cross-model transferability. It proposes RaPA, which performs per-step random parameter pruning on the surrogate to produce diverse yet semantically consistent variants; the authors assert that this randomization is mathematically equivalent to adding an importance-equalization regularizer to the attack loss. Experiments across CNN and Transformer architectures report consistent ASR gains, with up to 11.7% higher average ASR than state-of-the-art baselines (which reach 33.3% ASR) when transferring from CNN surrogates to Transformer targets, while remaining training-free and compatible with existing frameworks.

Significance. If the claimed equivalence holds and the observed gains are attributable to importance equalization rather than generic stochastic effects, RaPA would supply a lightweight, architecture-agnostic technique for improving transferable targeted attacks. The reported cross-architecture lift and the provision of public code are concrete strengths that would be of immediate practical value to the adversarial robustness community.

major comments (2)

[§3.2] (Method): The central claim that random parameter pruning is equivalent to an importance-equalization regularizer is asserted without an explicit derivation. The manuscript must show how the expectation over binary pruning masks produces an additive term that equalizes parameter importance in the attack objective (rather than merely injecting gradient noise or performing implicit ensembling). Without this step, it remains unclear whether the 11.7% ASR improvement in CNN-to-Transformer transfer is mechanistically explained by the regularizer or by unaccounted stochastic effects.
[§4.3] (Experiments, CNN-to-Transformer table): The reported 11.7% average ASR gain is load-bearing for the cross-architecture claim. The paper should report per-target-model standard deviations over multiple random seeds and confirm that RaPA's gain remains statistically significant relative to a control in which the random-pruning schedule is replaced by simple gradient noise of matched variance.

minor comments (2)

[Abstract] The parenthetical “(with 33.3% ASRs)” is missing a space before the parenthesis.
[§2] (Related Work): The discussion of prior gradient-stabilization and input-diversification methods should explicitly contrast their regularization effects with the parameter-level randomization introduced here.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for the constructive feedback on our manuscript. We address each major comment below and outline the revisions we plan to make to strengthen the paper.

Referee: [§3.2] (Method): The central claim that random parameter pruning is equivalent to an importance-equalization regularizer is asserted without an explicit derivation. The manuscript must show how the expectation over binary pruning masks produces an additive term that equalizes parameter importance in the attack objective (rather than merely injecting gradient noise or performing implicit ensembling). Without this step, it remains unclear whether the 11.7% ASR improvement in CNN-to-Transformer transfer is mechanistically explained by the regularizer or by unaccounted stochastic effects.

Authors: We thank the referee for this insightful comment. We acknowledge that an explicit derivation would strengthen the presentation of our method. In the revised manuscript, we will add a detailed derivation in §3.2 demonstrating how the expectation over the binary pruning masks M leads to an additive term in the loss that equalizes the importance of parameters. This will be shown by expanding E_M[L(f(x; θ ⊙ M))], where the resulting regularizer discourages over-reliance on any subset of parameters. We will also discuss why this differs from simple gradient noise or ensembling effects. This revision will clarify the mechanistic basis for the observed transferability gains.
revision: yes
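
For orientation, the expansion promised above can be sketched with a second-order Taylor argument. This is an illustrative derivation, not the paper's own: it assumes each mask entry is drawn independently as M_i ~ Bernoulli(1 − p) for a pruning probability p, and it expands around the mean pruned parameters θ̄ = (1 − p)θ, which is itself an assumption about how the zeroth-order term should be centred.

```latex
% Assumptions (not taken from the paper): M_i ~ Bernoulli(1-p) i.i.d.,
% so E[M_i] = 1-p and Var(M_i) = p(1-p); \bar\theta = (1-p)\theta.
\begin{align*}
\delta_i &:= \bigl(M_i - (1-p)\bigr)\,\theta_i,
  \qquad \mathbb{E}[\delta_i] = 0,
  \qquad \mathbb{E}[\delta_i \delta_j] = p(1-p)\,\theta_i^2\,\mathbf{1}[i = j], \\
L\bigl(f(x_{adv}; M \odot \theta)\bigr)
  &\approx L\bigl(f(x_{adv}; \bar\theta)\bigr)
   + \sum_i \frac{\partial L}{\partial \theta_i}\bigg|_{\bar\theta} \delta_i
   + \frac{1}{2} \sum_{i,j} \frac{\partial^2 L}{\partial \theta_i \, \partial \theta_j}\bigg|_{\bar\theta} \delta_i \delta_j, \\
\mathbb{E}_M\bigl[L\bigl(f(x_{adv}; M \odot \theta)\bigr)\bigr]
  &\approx L\bigl(f(x_{adv}; \bar\theta)\bigr)
   + \frac{p(1-p)}{2} \sum_i \frac{\partial^2 L}{\partial \theta_i^2}\bigg|_{\bar\theta} \theta_i^2 .
\end{align*}
```

The first-order term vanishes because the perturbation has zero mean, and the off-diagonal Hessian terms vanish because the mask entries are independent. The surviving term has the same p(1 − p)/2 coefficient as the approximation quoted further down this page, but that quote writes the leading term as L(f(x_adv; θ)); the two agree only if the shift from θ to θ̄ is negligible or absorbed by a rescaling, which this sketch does not establish.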
Referee: [§4.3] (Experiments, CNN-to-Transformer table): The reported 11.7% average ASR gain is load-bearing for the cross-architecture claim. The paper should report per-target-model standard deviations over multiple random seeds and confirm that RaPA's gain remains statistically significant relative to a control in which the random-pruning schedule is replaced by simple gradient noise of matched variance.

Authors: We agree with the referee that additional statistical analysis would bolster the experimental claims. In the revised manuscript, we will report per-target-model standard deviations for the ASR results over multiple random seeds. Additionally, we will perform and include a comparison experiment where random parameter pruning is replaced by injecting gradient noise with matched variance, and verify whether RaPA's ASR improvement over that noise control remains statistically significant. This will help attribute the gains specifically to the importance-equalization regularizer rather than generic stochastic effects.
revision: yes
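
One way the promised analysis could be organised is sketched below. The helper names and the per-seed ASR values are hypothetical, and the exact two-sided permutation test is a stand-in chosen because it needs no distributional assumptions; the authors may well use a different test.

```python
from itertools import combinations
import numpy as np

def mean_and_std(asr_per_seed):
    """Mean and sample standard deviation (ddof=1) of per-seed ASRs."""
    values = np.asarray(asr_per_seed, dtype=float)
    return float(values.mean()), float(values.std(ddof=1))

def exact_permutation_pvalue(sample_a, sample_b):
    """Two-sided exact permutation test on the difference of means."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    pooled = np.concatenate([a, b])
    observed = abs(a.mean() - b.mean())
    n, k = pooled.size, a.size
    total = pooled.sum()
    extreme, n_splits = 0, 0
    # Enumerate every way of relabelling k of the n runs as "group a".
    for idx in combinations(range(n), k):
        sum_a = pooled[list(idx)].sum()
        diff = abs(sum_a / k - (total - sum_a) / (n - k))
        if diff >= observed - 1e-12:
            extreme += 1
        n_splits += 1
    return extreme / n_splits

# Made-up per-seed ASRs (%) for one target model, five seeds each.
asr_pruning = [40.0, 41.0, 42.0, 43.0, 44.0]        # hypothetical RaPA runs
asr_noise_control = [30.0, 31.0, 32.0, 33.0, 34.0]  # hypothetical noise runs
print(mean_and_std(asr_pruning))
print(exact_permutation_pvalue(asr_pruning, asr_noise_control))
```

With five seeds per arm there are only 252 relabellings, so the exact enumeration is cheap; with many more seeds a random subsample of relabellings would be the usual substitute.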

Circularity Check

0 steps flagged

No significant circularity; equivalence claim and transfer gains rest on external validation rather than self-referential reduction

The paper asserts an equivalence between per-step random parameter pruning and an importance-equalization regularizer, then validates the resulting transfer gains on held-out target models (including CNN-to-Transformer cases) that are independent of the surrogate training data and fitted parameters. No load-bearing step reduces by construction to a fitted quantity or prior self-citation; the central mechanism is presented as a derived property of the expectation over binary masks, and the reported ASR improvements are measured against external baselines on unseen architectures. This satisfies the criteria for a self-contained derivation against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The method rests on standard assumptions about model behavior under pruning and the existence of a small influential parameter subset; no new physical entities or ad-hoc constants are introduced beyond typical attack hyperparameters.

free parameters (1)

pruning probability or ratio
The fraction or probability of parameters pruned at each step is a tunable hyperparameter required for the randomization procedure.

axioms (1)

domain assumption: Randomly pruned surrogate variants remain semantically consistent for the purpose of generating transferable adversarial examples.
Invoked when stating that pruning produces diverse yet valid attack directions.
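
A rough way to probe this assumption is to measure how often a pruned model still agrees with the unpruned one as the pruning probability grows. The sketch below does this for a toy linear classifier; the function name `top1_agreement`, the use of top-1 agreement as a proxy for "semantic consistency", and the random weights and inputs are all assumptions made for illustration.

```python
import numpy as np

def top1_agreement(W, X, prune_prob, n_masks=20, seed=0):
    """Mean fraction of inputs whose top-1 class survives random pruning."""
    rng = np.random.default_rng(seed)
    full_pred = (X @ W.T).argmax(axis=1)       # predictions of the full model
    rates = []
    for _ in range(n_masks):
        # Prune each weight independently with probability prune_prob.
        keep = rng.random(W.shape) >= prune_prob
        pruned_pred = (X @ (W * keep).T).argmax(axis=1)
        rates.append(np.mean(pruned_pred == full_pred))
    return float(np.mean(rates))

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 64))    # toy 10-class linear classifier
X = rng.normal(size=(200, 64))   # toy inputs
for p in (0.0, 0.1, 0.3, 0.5):
    print(p, top1_agreement(W, X, prune_prob=p))
```

If agreement collapses at the pruning probability an attack actually uses, the pruned variants are no longer close stand-ins for the original surrogate, and the assumption would need separate justification.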

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem.

We show that this parameter-level randomization is equivalent to adding an importance-equalization regularizer... $\mathbb{E}_M[L(f(x_{adv}; M \odot \theta))] \approx L(f(x_{adv}; \theta)) + \frac{p(1-p)}{2} \sum_i \frac{\partial^2 L}{\partial \theta_i^2}\,\theta_i^2$

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · absolute_floor_iff_bare_distinguishability · unclear

Relation between the paper passage and the cited Recognition theorem.

RaPA randomly prunes model parameters to generate diverse yet semantically consistent surrogate variants

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages

[1]

Improving the transferabil- ity of targeted adversarial examples through object-based di- verse input

Junyoung Byun, Seungju Cho, Myung-Joon Kwon, Hee- Seon Kim, and Changick Kim. Improving the transferabil- ity of targeted adversarial examples through object-based di- verse input. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022. 1, 3, 6, 12

work page 2022
[2]

Introducing competition to boost the transferability of targeted adversarial examples through clean feature mixup

Junyoung Byun, Myung-Joon Kwon, Seungju Cho, Yoonji Kim, and Changick Kim. Introducing competition to boost the transferability of targeted adversarial examples through clean feature mixup. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023. 1, 3, 5, 6, 12

work page 2023
[3]

Rethinking model ensemble in transfer-based adversarial at- tacks

Huanran Chen, Yichi Zhang, Yinpeng Dong, and Junyi Zhu. Rethinking model ensemble in transfer-based adversarial at- tacks. InInternational Conference on Learning Representa- tions, 2024. 5, 12

work page 2024
[4]

Xception: Deep learning with depthwise separable convolutions

Franc ¸ois Chollet. Xception: Deep learning with depthwise separable convolutions. InIEEE/CVF Conference on Com- puter Vision and Pattern Recognition, 2017. 5

work page 2017
[5]

Twins: Revisiting the design of spatial attention in vision transformers

Xiangxiang Chu, Zhi Tian, Yuqing Wang, Bo Zhang, Haib- ing Ren, Xiaolin Wei, Huaxia Xia, and Chunhua Shen. Twins: Revisiting the design of spatial attention in vision transformers. InAdvances in Neural Information Processing Systems, 2021. 6

work page 2021
[6]

Imagenet: A large-scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2009. 6

work page 2009
[7]

Boosting adversarial at- tacks with momentum

Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, and Jianguo Li. Boosting adversarial at- tacks with momentum. InIEEE/CVF Conference on Com- puter Vision and Pattern Recognition, 2018. 1, 3, 5, 6, 12

work page 2018
[8]

Evading defenses to transferable adversarial examples by translation-invariant attacks

Yinpeng Dong, Tianyu Pang, Hang Su, and Jun Zhu. Evading defenses to transferable adversarial examples by translation-invariant attacks. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. 3, 6, 12

work page 2019
[9]

An image is worth 16x16 words: Transformers for image recognition at scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, G Heigold, S Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. InInternational Conference on Learning Representations, 2020. 1, 5

work page 2020
[10]

Convit: Improving vision transformers with soft convolutional inductive biases

St ´ephane d’Ascoli, Hugo Touvron, Matthew L Leavitt, Ari S Morcos, Giulio Biroli, and Levent Sagun. Convit: Improving vision transformers with soft convolutional inductive biases. InInternational Conference on Machine Learning, 2021. 6

work page 2021
[11]

Depgraph: Towards any structural pruning

Gongfan Fang, Xinyin Ma, Mingli Song, Michael Bi Mi, and Xinchao Wang. Depgraph: Towards any structural pruning. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023. 13

work page 2023
[12]

Explaining and harnessing adversarial examples

Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2015. 1, 2, 12

work page 2015
[13]

Levit: a vision transformer in convnet’s clothing for faster inference

Benjamin Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Herv ´e J ´egou, and Matthijs Douze. Levit: a vision transformer in convnet’s clothing for faster inference. InInternational Conference on Computer Vision, 2021. 6

work page 2021
[14]

A survey on transferability of adversar- ial examples across deep neural networks.Transactions on Machine Learning Research, 2024

Jindong Gu, Xiaojun Jia, Pau de Jorge, Wenqian Yu, Xin- wei Liu, Avery Ma, Yuan Xun, Anjun Hu, Ashkan Khakzar, Zhijiang Li, et al. A survey on transferability of adversar- ial examples across deep neural networks.Transactions on Machine Learning Research, 2024. 1

work page 2024
[15]

Countering adversarial images using input transformations

Chuan Guo, Mayank Rana, Moustapha Cisse, and Laurens Van Der Maaten. Countering adversarial images using input transformations. InInternational Conference on Learning Representations, 2018. 8

work page 2018
[16]

Mambavision: A hybrid mamba-transformer vision backbone

Ali Hatamizadeh and Jan Kautz. Mambavision: A hybrid mamba-transformer vision backbone. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025. 15

work page 2025
[17]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InIEEE/CVF Conference on Computer Vision and Pattern Recognition,

work page
[18]

Rethinking spatial dimensions of vision transformers

Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, and Seong Joon Oh. Rethinking spatial dimensions of vision transformers. InInternational Confer- ence on Computer Vision, 2021. 6

work page 2021
[19]

Densely connected convolutional net- works

Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kil- ian Q Weinberger. Densely connected convolutional net- works. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2017. 1, 2, 5

work page 2017
[20]

T-sea: Transfer-based self-ensemble attack on object detection

Hao Huang, Ziyan Chen, Huanran Chen, Yongtao Wang, and Kevin Zhang. T-sea: Transfer-based self-ensemble attack on object detection. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023. 1, 3, 5, 12

work page 2023
[21]

Adversarial attacks and de- fences competition

Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, et al. Adversarial attacks and de- fences competition. InThe NIPS’17 Competition: Building Intelligent Systems, 2018. 8

work page 2018
[22]

Adversarial attacks and de- fences competition

Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, et al. Adversarial attacks and de- fences competition. InThe NIPS’17 Competition: Building Intelligent Systems, pages 195–231. Springer, 2018. 5

work page 2018
[23]

Ad- versarial examples in the physical world

Alexey Kurakin, Ian J Goodfellow, and Samy Bengio. Ad- versarial examples in the physical world. InArtificial In- telligence Safety and Security, pages 99–112. Chapman and Hall/CRC, 2018. 2, 12

work page 2018
[24]

Optimal brain damage.Advances in neural information processing systems, 2, 1989

Yann LeCun, John Denker, and Sara Solla. Optimal brain damage.Advances in neural information processing systems, 2, 1989. 3

work page 1989
[25]

Towards transferable targeted attack

Maosen Li, Cheng Deng, Tengjiao Li, Junchi Yan, Xinbo Gao, and Heng Huang. Towards transferable targeted attack. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. 12

work page 2020
[26]

Learning transferable adversarial examples via ghost networks

Yingwei Li, Song Bai, Yuyin Zhou, Cihang Xie, Zhishuai Zhang, and Alan Yuille. Learning transferable adversarial examples via ghost networks. InAAAI Conference on Artifi- cial Intelligence, 2020. 1, 2, 3, 5, 12, 15

work page 2020
[27]

Improving transferable targeted attacks with fea- ture tuning mixup

Kaisheng Liang, Xuelong Dai, Yanjie Li, Dong Wang, and Bin Xiao. Improving transferable targeted attacks with fea- ture tuning mixup. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 25802–25811, 2025. 1, 3, 12

work page 2025
[28]

Defense against adversarial at- tacks using high-level representation guided denoiser

Fangzhou Liao, Ming Liang, Yinpeng Dong, Tianyu Pang, Xiaolin Hu, and Jun Zhu. Defense against adversarial at- tacks using high-level representation guided denoiser. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. 8

work page 2018
[29]

Hopcroft

Jiadong Lin, Chuanbiao Song, Kun He, Liwei Wang, and John E. Hopcroft. Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks. InInternational Confer- ence on Learning Representations, 2020. 3, 6, 12

work page 2020
[30]

Scaling laws for black box adversarial attacks

Chuan Liu, Huanran Chen, Yichi Zhang, Yinpeng Dong, and Jun Zhu. Scaling laws for black box adversarial attacks. arXiv preprint arXiv:2411.16782, 2024. 1, 5, 12

work page arXiv 2024
[31]

Delving into transferable adversarial examples and black-box attacks

Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Xiaodong Song. Delving into transferable adversarial examples and black-box attacks. InInternational Conference on Learning Representations, 2017. 1, 5, 12

work page 2017
[32]

On the robustness of vision transformers to adversarial ex- amples

Kaleel Mahmood, Rigel Mahmood, and Marten Van Dijk. On the robustness of vision transformers to adversarial ex- amples. InInternational Conference on Computer Vision,

work page
[33]

Importance estimation for neural network pruning

Pavlo Molchanov, Arun Mallya, Stephen Tyree, Iuri Frosio, and Jan Kautz. Importance estimation for neural network pruning. InIEEE/CVF conference on computer vision and pattern recognition, 2019. 3, 13

work page 2019
[34]

On improving ad- versarial transferability of vision transformers

Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Fahad Shahbaz Khan, and Fatih Porikli. On improving ad- versarial transferability of vision transformers. InInterna- tional Conference on Learning Representations, 2022. 1, 3, 5, 6, 12

work page 2022
[35]

Diffusion models for ad- versarial purification

Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, and Anima Anandkumar. Diffusion models for ad- versarial purification. InInternational Conference on Ma- chine Learning (ICML), 2022. 8

work page 2022
[36]

Maxime Oquab, Timoth ´ee Darcet, Theo Moutakanni, Huy V . V o, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Rus- sell Howes, Po-Yao Huang, Hu Xu, Vasu Sharma, Shang- Wen Li, Wojciech Galuba, Mike Rabbat, Mido Assran, Nico- las Ballas, Gabriel Synnaeve, Ishan Misra, Herve Jegou, Julien Mairal, Patri...

work page 2023
[37]

Learn- ing transferable visual models from natural language super- vision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learn- ing transferable visual models from natural language super- vision. InInternational Conference on Machine Learning,

work page
[38]

Do imagenet classifiers generalize to im- agenet? InInternational conference on machine learning, pages 5389–5400

Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, and Vaishaal Shankar. Do imagenet classifiers generalize to im- agenet? InInternational conference on machine learning, pages 5389–5400. PMLR, 2019. 15

work page 2019
[39]

Do adversarially robust im- agenet models transfer better? InAdvances in Neural Infor- mation Processing Systems, 2020

Hadi Salman, Andrew Ilyas, Logan Engstrom, Ashish Kapoor, and Aleksander Madry. Do adversarially robust im- agenet models transfer better? InAdvances in Neural Infor- mation Processing Systems, 2020. 8

work page 2020
[40]

Mobilenetv2: Inverted residuals and linear bottlenecks

Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zh- moginov, and Liang-Chieh Chen. Mobilenetv2: Inverted residuals and linear bottlenecks. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. 5

work page 2018
[41]

Very deep convo- lutional networks for large-scale image recognition

Karen Simonyan and Andrew Zisserman. Very deep convo- lutional networks for large-scale image recognition. InIn- ternational Conference on Learning Representations, 2015. 5

work page 2015
[42]

In- triguing properties of neural networks

Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. In- triguing properties of neural networks. InInternational Con- ference on Learning Representations, 2014. 1, 2

work page 2014
[43]

Rethinking the inception ar- chitecture for computer vision

Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. Rethinking the inception ar- chitecture for computer vision. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2016. 5

work page 2016
[44]

Inception-v4, inception-resnet and the im- pact of residual connections on learning

Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, and Alexander Alemi. Inception-v4, inception-resnet and the im- pact of residual connections on learning. InAAAI Conference on Artificial Intelligence, 2017. 5

work page 2017
[45]

Efficientnet: Rethinking model scaling for convolutional neural networks

Mingxing Tan and Quoc Le. Efficientnet: Rethinking model scaling for convolutional neural networks. InInternational Conference on Machine Learning, 2019. 5

work page 2019
[46]

Attention is all you need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszko- reit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. InAdvances in Neural Information Processing Systems, 2017. 4

work page 2017
[47]

Regularization of neural networks using drop- connect

Li Wan, Matthew Zeiler, Sixin Zhang, Yann Le Cun, and Rob Fergus. Regularization of neural networks using drop- connect. InInternational Conference on Machine Learning,

work page
[48]

Enhanc- ing targeted attack transferability via diversified weight prun- ing

Hung-Jui Wang, Yu-Yu Wu, and Shang-Tse Chen. Enhanc- ing targeted attack transferability via diversified weight prun- ing. InProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition Workshops, 2024. 3

work page 2024
[49]

Boosting adversarial transferability by block shuffle and rotation

Kunyu Wang, Xuanran He, Wenxuan Wang, and Xiaosen Wang. Boosting adversarial transferability by block shuffle and rotation. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024. 1, 3, 6, 12

work page 2024
[50]

Enhancing the transferability of adversarial attacks through variance tuning

Xiaosen Wang and Kun He. Enhancing the transferability of adversarial attacks through variance tuning. InIEEE/CVF Conference on Computer Vision and Pattern Recognition,

work page
[51]

Admix: Enhancing the transferability of adversarial attacks

Xiaosen Wang, Xuanran He, Jingdong Wang, and Kun He. Admix: Enhancing the transferability of adversarial attacks. InInternational Conference on Computer Vision, 2021. 1, 3, 6, 12

work page 2021
[52]

Struc- ture invariant transformation for better adversarial transfer- ability

Xiaosen Wang, Zeliang Zhang, and Jianping Zhang. Struc- ture invariant transformation for better adversarial transfer- ability. InProceedings of the IEEE/CVF International Con- ference on Computer Vision, 2023. 1, 3, 6

[53] Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, and Saining Xie. Convnext v2: Co-designing and scaling convnets with masked autoencoders. arXiv preprint arXiv:2301.00808, 2023.

[54] Han Wu, Guanyan Ou, Weibin Wu, and Zibin Zheng. Improving transferable targeted adversarial attacks with model self-enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024.

[55] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Zhou Ren, and Alan Yuille. Mitigating adversarial effects through randomization. In International Conference on Learning Representations, 2018.

[56] Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jianyu Wang, Zhou Ren, and Alan L Yuille. Improving transferability of adversarial examples with input diversity. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.

[57] Dingcheng Yang, Zihao Xiao, and Wenjian Yu. Boosting the adversarial transferability of surrogate models with dark knowledge. In 2023 IEEE 35th International Conference on Tools with Artificial Intelligence, 2023.

[58] Dingcheng Yang, Wenjian Yu, Zihao Xiao, and Jiaqi Luo. Generating adversarial examples with better transferability via masking unimportant parameters of surrogate model. In International Joint Conference on Neural Networks, 2023.

[59] Zhuolin Yang, Linyi Li, Xiaojun Xu, Shiliang Zuo, Qian Chen, Benjamin Rubinstein, Ce Zhang, and Bo Li. Trs: Transferability reduced ensemble via encouraging gradient diversity and model smoothness. In Advances in Neural Information Processing Systems, 2021.

[60] Zhengyu Zhao, Zhuoran Liu, and Martha Larson. On success and simplicity: A second look at transferable targeted attacks. In Advances in Neural Information Processing Systems, 2021.

[61] Junhua Zou, Zhisong Pan, Junyang Qiu, Xin Liu, Ting Rui, and Wei Li. Improving the transferability of adversarial examples with resized-diverse-inputs, diversity-ensemble and region fitting. In European Conference on Computer Vision,

Supplementary Material

A. More Related Work

One of the most fundamental attack methods is the Fast Gradient Sign Method (FGSM) [12], which uses the direction of the gradient to craft adversarial examples. Iterative-FGSM (I-FGSM) [23] extends FGSM into an iterative framework to enhance attack performance. However, while the obtained adversarial exa...

specifically considers a vision Transformer as the surrogate model and is denoted as SE-ViT in this paper. Ghost Network [26] perturbs the surrogate model to create a set of new models and then samples one model from the set at each iteration. Masking Unimportant Parameters (MUP) [58] drops out unimportant parameters according to a predefined Taylor expansion-based...

