pith. machine review for the scientific record.

arxiv: 2604.23105 · v1 · submitted 2026-04-25 · 💻 cs.CV

Recognition: unknown

Transferable Physical-World Adversarial Patches Against Object Detection in Autonomous Driving

Authors on Pith · no claims yet

Pith reviewed 2026-05-08 08:28 UTC · model grok-4.3

classification 💻 cs.CV
keywords adversarial patches · object detection · autonomous driving · transferable attacks · physical adversarial examples · multi-model optimization

The pith

Joint optimization over multiple detectors creates physical patches that transfer to unseen object detectors in driving scenes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents AdvAD, a method for generating physical adversarial patches that attack object detectors in autonomous driving systems. Rather than crafting patches for one model at a time, it optimizes them across several detectors simultaneously to exploit shared weaknesses. The process includes adaptive weighting of each model, plus data augmentation and geometric transforms to simulate real-world variations like viewpoint and lighting changes. Experiments show these patches achieve higher success rates and better transferability than prior attacks when tested both digitally and on physical objects in real environments.
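To make that machinery concrete, here is a minimal sketch of the ensemble-attack pattern the review describes, in PyTorch. The detector wrappers, the paste_patch compositing helper, and every hyperparameter are illustrative assumptions; AdvAD's actual losses and update rules are not given in the abstract.

```python
# Minimal sketch of joint multi-detector patch optimization. Assumes each
# entry of `detectors` maps an image batch to a scalar detection-confidence
# loss for the target object (hypothetical wrappers, not the paper's code).
import torch

def paste_patch(images: torch.Tensor, patch: torch.Tensor) -> torch.Tensor:
    # Composite the patch at the image center; a real attack would align it
    # with the target object's bounding box.
    out = images.clone()
    ph, pw = patch.shape[-2:]
    top = (images.shape[-2] - ph) // 2
    left = (images.shape[-1] - pw) // 2
    out[:, :, top:top + ph, left:left + pw] = patch
    return out

def optimize_patch(detectors, images, steps=500, lr=0.01, size=64):
    theta = torch.zeros(3, size, size, requires_grad=True)  # patch parameters
    opt = torch.optim.Adam([theta], lr=lr)
    n = len(detectors)
    weights = torch.full((n,), 1.0 / n)  # uniform here; AdvAD adapts these
    for _ in range(steps):
        patch = torch.sigmoid(theta)         # keep pixels printable in [0, 1]
        adv = paste_patch(images, patch)
        losses = torch.stack([det(adv) for det in detectors])
        opt.zero_grad()
        (weights * losses).sum().backward()  # one patch, gradients from all models
        opt.step()
    return torch.sigmoid(theta).detach()
```

The point of the pattern is that a single set of patch parameters receives gradients from every detector at once, so the optimizer is pushed toward perturbations that no single model's idiosyncrasies fully explain.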

Core claim

AdvAD optimizes adversarial patches over multiple detection models in a unified framework, adaptively balances model contributions, and enforces robustness to physical variations through data augmentation and geometric transformations, producing patches that outperform prior methods in attack performance and transferability to unseen detectors in both digital and real-world settings.

What carries the argument

A unified multi-model optimization process that jointly learns perturbations while incorporating physical robustness constraints via augmentation and transforms.
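The physical-robustness half is typically implemented in the style of Expectation over Transformation: the attack loss is averaged over randomly sampled viewpoint, scale, and lighting changes, so the patch must survive all of them. A minimal sketch, assuming torchvision's stock transforms stand in for whatever augmentation set the paper actually uses:

```python
# EOT-style robustness sketch. The specific transforms and their ranges are
# assumptions; only the averaging pattern is the point.
import torch
import torchvision.transforms as T

physical_sim = T.Compose([
    T.RandomPerspective(distortion_scale=0.3, p=1.0),  # viewpoint change
    T.RandomAffine(degrees=15, scale=(0.7, 1.3)),      # rotation and distance
    T.ColorJitter(brightness=0.4, contrast=0.3),       # lighting variation
])

def eot_loss(detector, patched_images: torch.Tensor, n_samples: int = 8):
    # Average the attack loss over sampled transforms so the patch has to
    # work across many simulated viewing conditions, not just one.
    return torch.stack(
        [detector(physical_sim(patched_images)) for _ in range(n_samples)]
    ).mean()
```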

If this is right

  • Patches remain effective across different detector architectures without retraining for each one.
  • Effectiveness persists under changes in distance, angle, and lighting that occur in actual driving.
  • The attack success rate exceeds that of existing single-model or non-augmented methods in both simulated and physical tests.
  • The same patch can disrupt perception in multiple autonomous driving systems simultaneously.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Developers of autonomous driving systems may need to test detectors against patches optimized on ensembles rather than isolated models.
  • The multi-model approach could extend to other perception tasks such as lane detection or segmentation if similar joint optimization is used.
  • Physical patch attacks might serve as a practical stress test for deployed vehicles before they enter public roads.

Load-bearing premise

That shared vulnerabilities exist across different detector architectures and that the chosen augmentations sufficiently capture real physical variations so the patch remains effective on new models.

What would settle it

Apply the generated patches to a new object detector architecture never used in training and measure whether the attack success rate in real-world driving tests falls to the level of single-model baselines.
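A minimal sketch of that settling experiment, assuming a hypothetical detect(model, image) that returns the boxes found for the target class; none of these names come from the paper:

```python
# Held-out transfer test: attack success for an evasion patch means the
# target object is no longer detected at all.
def attack_success_rate(model, patched_images):
    misses = sum(1 for img in patched_images if len(detect(model, img)) == 0)
    return misses / len(patched_images)

# held_out_detector was never in the training ensemble. If the ensemble
# patch's rate falls to the single-model patch's rate here, the
# transferability claim fails:
#   attack_success_rate(held_out_detector, ensemble_patched_images)
#   attack_success_rate(held_out_detector, single_model_patched_images)
```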

Figures

Figures reproduced from arXiv: 2604.23105 by Lulu Xue, Minghui Li, Shengshan Hu, Yichen Wang, Zihui Zhu, Ziqi Zhou.

Figure 1. An illustration of attacking object detection models in autonomous …
Figure 2. Overall framework of the proposed AdvAD method. Given an input image, tailored data augmentation is first applied to simulate diverse real-world …
Figure 3. Visualization results in the digital world. The adversarial patch is aligned with the target object region during attack generation. Compared with …
Figure 4. Visualization results in simulated environments. Notably, since the …
Figure 5. Visualization results in the physical world. (a) Physical-world validation under controlled conditions, where adversarial patches are printed and captured …
read the original abstract

Deep learning drives major advances in autonomous driving (AD), where object detectors are central to perception. However, adversarial attacks pose significant threats to the reliability and safety of these systems, with physical adversarial patches representing a particularly potent form of attack. Physical adversarial patch attacks pose severe risks but are usually crafted for a single model, yielding poor transferability to unseen detectors. We propose AdvAD, a transfer-based physical attack against object detection in autonomous driving. Instead of targeting a specific detector, AdvAD optimizes adversarial patches over multiple detection models in a unified framework, encouraging the learned perturbations to capture shared vulnerabilities across architectures. The optimization process adaptively balances model contributions and enforces robustness to physical variations. It further employs data augmentation and geometric transformations to maintain patch effectiveness under diverse physical conditions. Experiments in both digital and real-world settings show that AdvAD consistently outperforms state-of-the-art (SOTA) attacks in performance and transferability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes AdvAD, a transfer-based physical adversarial attack on object detectors for autonomous driving. It jointly optimizes adversarial patches over multiple detection models using an adaptive balancing mechanism for model contributions, combined with data augmentation and geometric transformations to promote robustness under physical variations. The central claim is that this approach yields patches with superior attack performance and transferability to unseen detectors compared to state-of-the-art methods, as demonstrated in both digital and real-world experiments.

Significance. If the experimental results hold and are reproducible, the work would be moderately significant for the computer vision and autonomous driving security communities. It directly tackles the known limitation of poor transferability in single-model physical patch attacks by using multi-model optimization, which is a reasonable empirical strategy. Successful real-world validation could highlight practical risks to AD perception systems and motivate stronger defenses, though the approach builds on established techniques rather than introducing fundamentally new theoretical insights.

major comments (1)
  1. [Abstract] The abstract asserts that 'Experiments in both digital and real-world settings show that AdvAD consistently outperforms state-of-the-art (SOTA) attacks in performance and transferability,' yet provides no quantitative metrics (e.g., attack success rates, mAP drops, or transfer rates), named baselines, model architectures, or dataset details. This makes the central empirical claim difficult to evaluate from the summary alone; the experimental section must supply these specifics with clear comparisons to substantiate outperformance.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We agree that enhancing the abstract with concrete quantitative details will improve clarity and have revised it accordingly. Our point-by-point response to the major comment is provided below.

read point-by-point responses
  1. Referee: [Abstract] The abstract asserts that 'Experiments in both digital and real-world settings show that AdvAD consistently outperforms state-of-the-art (SOTA) attacks in performance and transferability,' yet provides no quantitative metrics (e.g., attack success rates, mAP drops, or transfer rates), named baselines, model architectures, or dataset details. This makes the central empirical claim difficult to evaluate from the summary alone; the experimental section must supply these specifics with clear comparisons to substantiate outperformance.

    Authors: We acknowledge the validity of this observation. The original abstract was intentionally concise, but we agree that including key quantitative highlights would better support the central claim. In the revised manuscript, we have updated the abstract to reference specific metrics drawn from the experimental results (e.g., attack success rates and mAP reductions relative to baselines), the primary SOTA methods compared, the detector architectures used for multi-model optimization, and the datasets involved. The experimental section already contains the full tables, figures, and statistical comparisons demonstrating outperformance in both digital and physical settings; these are now more explicitly tied back to the abstract for easier evaluation. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper describes an empirical optimization procedure for generating physical adversarial patches by jointly training over multiple object detectors, with adaptive loss balancing and standard data augmentations for physical robustness. No equations, derivations, or uniqueness theorems are presented that reduce any claimed prediction or result to a fitted parameter or self-citation by construction. The central claims of improved transferability rest entirely on experimental validation rather than internal self-referential structure, rendering the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on standard assumptions from adversarial machine learning about the existence of transferable perturbations across models; the method introduces no new mathematical entities or constants beyond typical optimization hyperparameters.

free parameters (1)
  • adaptive balancing weights
    The optimization adaptively balances contributions from different detection models, implying parameters that are learned or tuned during the process; one plausible update rule is sketched after this ledger.
axioms (1)
  • domain assumption: Multiple object detectors share exploitable vulnerabilities that can be captured in a single patch perturbation.
    This is invoked to justify the multi-model optimization framework for encouraging shared vulnerabilities.
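One plausible form for those balancing weights, sketched under the assumption that the joint loop exposes per-detector losses; the paper does not disclose its actual rule. A softmax over current losses shifts optimization effort toward detectors the patch is not yet fooling:

```python
import torch

def adaptive_weights(losses: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    # Higher loss = detector still resists the patch = larger weight.
    # detach(): the weights steer the objective but receive no gradient.
    return torch.softmax(losses.detach() / temperature, dim=0)
```

The temperature is exactly the kind of tuning knob this ledger counts as a free parameter.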

pith-pipeline@v0.9.0 · 5467 in / 1367 out tokens · 58020 ms · 2026-05-08T08:28:31.129457+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

59 extracted references · 7 canonical work pages · 2 internal anchors

  1. [1] Towards Reliable Forgetting: A Survey on Machine Unlearning Verification
     L. Xue, S. Hu, W. Lu, Y. Shen, D. Li, P. Guo, Z. Zhou, M. Li, Y. Zhang, and L. Y. Zhang, "Towards reliable forgetting: A survey on machine unlearning verification," arXiv preprint arXiv:2506.15115, 2025.

  2. [2] Ufvideo: Towards unified fine-grained video cooperative understanding with large language models
     H. Pan, C. Wei, D. Liang, Z. Huang, P. Gao, Z. Zhou, L. Xue, P. Yan, X. Wei, M. Li et al., "Ufvideo: Towards unified fine-grained video cooperative understanding with large language models," in Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR'26), 2026.

  3. [3] Tattoo: Training-free aesthetic-aware outfit recommendation
     Y. Wu, X. Hu, Z. Zhou, and H. Lu, "Tattoo: Training-free aesthetic-aware outfit recommendation," arXiv preprint arXiv:2509.23242, 2025.

  4. [4] Deep reinforcement learning framework for autonomous driving
     A. E. Sallab, M. Abdou, E. Perot, and S. K. Yogamani, "Deep reinforcement learning framework for autonomous driving," in Autonomous Vehicles and Machines, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:12064877

  5. [5] Object detection and recognition for assistive robots: Experimentation and implementation
     E. Martinez-Martin and A. P. Del Pobil, "Object detection and recognition for assistive robots: Experimentation and implementation," IEEE Robotics & Automation Magazine, vol. 24, no. 3, pp. 123–138, 2017.

  6. [6] Darkhash: A data-free backdoor attack against deep hashing
     Z. Zhou, M. Deng, Y. Song, H. Zhang, W. Wan, S. Hu, M. Li, L. Y. Zhang, and D. Yao, "Darkhash: A data-free backdoor attack against deep hashing," IEEE Transactions on Information Forensics and Security, 2025.

  7. [7] Badhash: Invisible backdoor attacks against deep hashing with clean label
     S. Hu, Z. Zhou, Y. Zhang, L. Y. Zhang, Y. Zheng, Y. He, and H. Jin, "Badhash: Invisible backdoor attacks against deep hashing with clean label," in Proceedings of the 30th ACM International Conference on Multimedia (ACM MM'22), 2022, pp. 678–686.

  8. [8] Mars: A malignity-aware backdoor defense in federated learning
     W. Wan, Y. Ning, Z. Huang, C. Hong, S. Hu, Z. Zhou, Y. Zhang, T. Zhu, W. Zhou, and L. Y. Zhang, "Mars: A malignity-aware backdoor defense in federated learning," in Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS'25), 2025.

  9. [9] Detector collapse: Backdooring object detection to catastrophic overload or blindness
     H. Zhang, S. Hu, Y. Wang, L. Y. Zhang, Z. Zhou, X. Wang, Y. Zhang, and C. Chen, "Detector collapse: Backdooring object detection to catastrophic overload or blindness," in Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI'24), 2024.

  10. [10] Test-time backdoor detection for object detection models
      H. Zhang, Y. Wang, S. Yan, C. Zhu, Z. Zhou, L. Hou, S. Hu, M. Li, Y. Zhang, and L. Y. Zhang, "Test-time backdoor detection for object detection models," in Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR'25), 2025, pp. 24377–24386.

  11. [11] Trojanrobot: Backdoor attacks against robotic manipulation in the physical world
      X. Wang, H. Pan, H. Zhang, M. Li, S. Hu, Z. Zhou, L. Xue, P. Guo, Y. Wang, W. Wan et al., "Trojanrobot: Backdoor attacks against robotic manipulation in the physical world," arXiv e-prints, arXiv:2411, 2024.

  12. [12] Detecting and corrupting convolution-based unlearnable examples
      M. Li, X. Wang, Z. Yu, S. Hu, Z. Zhou, L. Zhang, and L. Y. Zhang, "Detecting and corrupting convolution-based unlearnable examples," in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI'25), vol. 39, no. 17, 2025, pp. 18403–18411.

  13. [13] Eclipse: Expunging clean-label indiscriminate poisons via sparse diffusion purification
      X. Wang, S. Hu, Y. Zhang, Z. Zhou, L. Y. Zhang, P. Xu, W. Wan, and H. Jin, "Eclipse: Expunging clean-label indiscriminate poisons via sparse diffusion purification," in European Symposium on Research in Computer Security. Springer, 2024, pp. 146–166.

  14. [14] Spa-vlm: Stealthy poisoning attacks on rag-based vlm
      L. Yu, Y. Zhang, Z. Zhou, Y. Wu, W. Wan, M. Li, S. Hu, P. Xiaobing, and J. Wang, "Spa-vlm: Stealthy poisoning attacks on rag-based vlm," arXiv preprint arXiv:2505.23828, 2025.

  15. [15] Unlearnable 3d point clouds: Class-wise transformation is all you need
      X. Wang, M. Li, W. Liu, H. Zhang, S. Hu, Y. Zhang, Z. Zhou, and H. Jin, "Unlearnable 3d point clouds: Class-wise transformation is all you need," in Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS'24), vol. 37, 2024, pp. 99404–99432.

  16. [16] Securely fine-tuning pre-trained encoders against adversarial examples
      Z. Zhou, M. Li, W. Liu, S. Hu, Y. Zhang, W. Wan, L. Xue, L. Y. Zhang, D. Yao, and H. Jin, "Securely fine-tuning pre-trained encoders against adversarial examples," in Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP'24), 2024.

  17. [17] Advclip: Downstream-agnostic adversarial examples in multimodal contrastive learning
      Z. Zhou, S. Hu, M. Li, H. Zhang, Y. Zhang, and H. Jin, "Advclip: Downstream-agnostic adversarial examples in multimodal contrastive learning," in Proceedings of the 31st ACM International Conference on Multimedia (ACM MM'23), 2023, pp. 6311–6320.

  18. [18] Downstream-agnostic adversarial examples
      Z. Zhou, S. Hu, R. Zhao, Q. Wang, L. Y. Zhang, J. Hou, and H. Jin, "Downstream-agnostic adversarial examples," in Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV'23), 2023, pp. 4345–4355.

  19. [19] Numbod: A spatial-frequency fusion attack against object detectors
      Z. Zhou, B. Li, Y. Song, S. Hu, W. Wan, L. Y. Zhang, D. Yao, and H. Jin, "Numbod: A spatial-frequency fusion attack against object detectors," in Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence (AAAI'25), 2025.

  20. [20] Darksam: Fooling segment anything model to segment nothing
      Z. Zhou, Y. Song, M. Li, S. Hu, X. Wang, L. Y. Zhang, D. Yao, and H. Jin, "Darksam: Fooling segment anything model to segment nothing," in Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS'24), 2024.

  21. [21] advpattern: Physical-world attacks on deep person re-identification via adversarially transformable patterns
      Z. Wang, S. Zheng, M. Song, Q. Wang, A. Rahimpour, and H. Qi, "advpattern: Physical-world attacks on deep person re-identification via adversarially transformable patterns," in IEEE/CVF ICCV 2019, 2019, pp. 8340–8349. [Online]. Available: https://api.semanticscholar.org/CorpusID:201650275

  22. [22] Out-of-bounding-box triggers: A stealthy approach to cheat object detectors
      T. Lin, L. Yu, G. Jin, R. Li, P. Wu, and L. Zhang, "Out-of-bounding-box triggers: A stealthy approach to cheat object detectors," ArXiv, vol. abs/2410.10091, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:273345346

  23. [23] T-sea: Transfer-based self-ensemble attack on object detection
      H. Huang, Z. Chen, H. Chen, Y. Wang, and K. A. Zhang, "T-sea: Transfer-based self-ensemble attack on object detection," in IEEE/CVF CVPR 2023, 2023, pp. 20514–20523.

  24. [24] Ensemble-based blackbox attacks on dense prediction
      Z. Cai, Y. Tan, and M. S. Asif, "Ensemble-based blackbox attacks on dense prediction," in IEEE/CVF CVPR 2023, 2023, pp. 4045–4055. [Online]. Available: https://api.semanticscholar.org/CorpusID:257766952

  25. [25] Doepatch: Dynamically optimized ensemble model for adversarial patches generation
      W. Tan, Y. Li, C. Zhao, Z. Liu, and Q. Pan, "Doepatch: Dynamically optimized ensemble model for adversarial patches generation," IEEE Transactions on Information Forensics and Security, 2024.

  26. [26] L-hawk: A controllable physical adversarial patch against a long-distance target
      T. Liu, Y. Liu, Z. Ma, T. Yang, X. Liu, T. Li, and J. Ma, "L-hawk: A controllable physical adversarial patch against a long-distance target," in NDSS, 2025.

  27. [27] Blackboxbench: A comprehensive benchmark of black-box adversarial attacks
      M. Zheng, X. Yan, Z. Zhu, H. Chen, and B. Wu, "Blackboxbench: A comprehensive benchmark of black-box adversarial attacks," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.

  28. [28] Histograms of oriented gradients for human detection
      N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in IEEE CVPR 2005, vol. 1. IEEE, 2005, pp. 886–893.

  29. [29] Object recognition from local scale-invariant features
      D. G. Lowe, "Object recognition from local scale-invariant features," in Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2. IEEE, 1999, pp. 1150–1157.

  30. [30] A decision-theoretic generalization of on-line learning and an application to boosting
      Y. Freund and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119–139, 1997.

  31. [31] Spatial pyramid pooling in deep convolutional networks for visual recognition
      K. He, X. Zhang, S. Ren, and J. Sun, "Spatial pyramid pooling in deep convolutional networks for visual recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1904–1916, 2015.

  32. [32] Faster r-cnn: Towards real-time object detection with region proposal networks
      S. Ren, K. He, R. Girshick, and J. Sun, "Faster r-cnn: Towards real-time object detection with region proposal networks," in Advances in Neural Information Processing Systems, vol. 28, 2015.

  33. [33] Cascade r-cnn: Delving into high quality object detection
      Z. Cai and N. Vasconcelos, "Cascade r-cnn: Delving into high quality object detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6154–6162.

  34. [34] YOLOv11: An overview of the key architectural enhancements
      R. Khanam and M. Hussain, "Yolov11: An overview of the key architectural enhancements," arXiv preprint arXiv:2410.17725, 2024.

  35. [35] Ssd: Single shot multibox detector
      W. Liu, D. Anguelov, and D. Erhan, "Ssd: Single shot multibox detector," in European Conference on Computer Vision. Springer, 2016, pp. 21–37.

  36. [36] Focal loss for dense object detection
      T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2980–2988.

  37. [37] End-to-end object detection with transformers
      N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, "End-to-end object detection with transformers," in European Conference on Computer Vision. Springer, 2020, pp. 213–229.

  38. [38] Vanish into thin air: Cross-prompt universal adversarial attacks for sam2
      Z. Zhou, Y. Hu, Y. Song, Z. Li, S. Hu, L. Y. Zhang, D. Yao, L. Zheng, and H. Jin, "Vanish into thin air: Cross-prompt universal adversarial attacks for sam2," in Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS'25), 2025.

  39. [39] Advedm: Fine-grained adversarial attack against vlm-based embodied agents
      Y. Wang, H. Zhang, P. Hewen, Z. Zhou, X. Wang, P. Guo, L. Xue, S. Hu, M. Li, L. Y. Zhang, and D. Yao, "Advedm: Fine-grained adversarial attack against vlm-based embodied agents," in Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS'25), 2025.

  40. [40] Breaking barriers in physical-world adversarial examples: Improving robustness and transferability via robust feature
      Y. Wang, Y. Chou, Z. Zhou, H. Zhang, W. Wan, S. Hu, and M. Li, "Breaking barriers in physical-world adversarial examples: Improving robustness and transferability via robust feature," in Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence (AAAI'25), 2025.

  41. [41] Pb-uap: Hybrid universal adversarial attack for image segmentation
      Y. Song, Z. Zhou, M. Li, X. Wang, M. Deng, W. Wan, S. Hu, and L. Y. Zhang, "Pb-uap: Hybrid universal adversarial attack for image segmentation," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'25), 2025.

  42. [42] Transferable adversarial facial images for privacy protection
      M. Li, J. Wang, H. Zhang, Z. Zhou, S. Hu, and X. Pei, "Transferable adversarial facial images for privacy protection," in Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM'24), 2024, pp. 10649–10658.

  43. [43] Segtrans: Transferable adversarial examples for segmentation models
      Y. Song, Z. Zhou, Q. Lu, H. Zhang, Y. Hu, L. Xue, S. Hu, M. Li, and L. Y. Zhang, "Segtrans: Transferable adversarial examples for segmentation models," IEEE Transactions on Multimedia, 2025.

  44. [44] Erosion attack for adversarial training to enhance semantic segmentation robustness
      Y. Song, Z. Zhou, M. Deng, Y. Hu, S. Hu, M. Li, and L. Y. Zhang, "Erosion attack for adversarial training to enhance semantic segmentation robustness," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'26), 2026.

  45. [45] Towards evaluating the robustness of neural networks
      N. Carlini and D. A. Wagner, "Towards evaluating the robustness of neural networks," in 2017 IEEE Symposium on Security and Privacy (SP), 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:2893830

  46. [46] Adversarial patch
      T. B. Brown, D. Mané, A. Roy, M. Abadi, and J. Gilmer, "Adversarial patch," ArXiv, vol. abs/1712.09665, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:13198654

  47. [47] Anti-adversarially manipulated attributions for weakly and semi-supervised semantic segmentation
      J. Lee, E. Kim, and S. Yoon, "Anti-adversarially manipulated attributions for weakly and semi-supervised semantic segmentation," in 2021 IEEE/CVF CVPR, 2021, pp. 4070–4078. [Online]. Available: https://api.semanticscholar.org/CorpusID:232240142

  48. [48] Patchsegdet: Attack-agnostic detection of physical adversarial patches in face recognition systems
      Z. Shen, Q. Li, X. Zhang, Y. Cai, X. Chen, P. An, H. Weng, and Y. Liu, "Patchsegdet: Attack-agnostic detection of physical adversarial patches in face recognition systems," in 2025 IEEE International Conference on Multimedia and Expo (ICME), 2025, pp. 1–6.

  49. [49] Two-stage optimized unified adversarial patch for attacking visible-infrared cross-modal detectors in the physical world
      C. Hu, W. Shi, W. Yao, T. Jiang, L. Tian, and W. Li, "Two-stage optimized unified adversarial patch for attacking visible-infrared cross-modal detectors in the physical world," Applied Soft Computing, vol. 171, p. 112818, 2025.

  50. [50] Adversarial universal stickers: Universal perturbation attacks on traffic sign using stickers
      A. Etim and J. Szefer, "Adversarial universal stickers: Universal perturbation attacks on traffic sign using stickers," arXiv preprint arXiv:2502.18724, 2025.

  51. [51] Active: Towards highly transferable 3d physical camouflage for universal and robust vehicle evasion
      N. Suryanto, Y. Kim, and H. T. Larasati, "Active: Towards highly transferable 3d physical camouflage for universal and robust vehicle evasion," in 2023 IEEE/CVF ICCV, 2023, pp. 4282–4291. [Online]. Available: https://api.semanticscholar.org/CorpusID:260887446

  52. [52] A unified framework for adversarial patch attacks against visual 3d object detection in autonomous driving
      J. Wang, F. Li, and L. He, "A unified framework for adversarial patch attacks against visual 3d object detection in autonomous driving," IEEE Transactions on Circuits and Systems for Video Technology, 2025.

  53. [53] Adversarial translucent patch: a robust physical attack technique against object detectors
      K. Tiliwalidi, C. Hu, W. Shi, G. Lu, and H. Wu, "Adversarial translucent patch: a robust physical attack technique against object detectors," Pattern Analysis and Applications, vol. 28, no. 4, p. 157, 2025.

  54. [54] Fooling automated surveillance cameras: adversarial patches to attack person detection
      S. Thys, W. Van Ranst, and T. Goedemé, "Fooling automated surveillance cameras: adversarial patches to attack person detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.

  55. [55] Naturalistic physical adversarial patch for object detectors
      Y.-C.-T. Hu, B.-H. Kung, D. S. Tan, J.-C. Chen, K.-L. Hua, and W.-H. Cheng, "Naturalistic physical adversarial patch for object detectors," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 7848–7857.

  56. [56] Napguard: Towards detecting naturalistic adversarial patches
      S. Wu, J. Wang, J. Zhao, Y. Wang, and X. Liu, "Napguard: Towards detecting naturalistic adversarial patches," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 24367–24376.

  57. [57] Microsoft coco: Common objects in context
      T.-Y. Lin, M. Maire, and S. J. Belongie, "Microsoft coco: Common objects in context," in European Conference on Computer Vision, 2014. [Online]. Available: https://api.semanticscholar.org/CorpusID:14113767

  58. [58] The pascal visual object classes (voc) challenge
      M. Everingham and L. V. Gool, "The pascal visual object classes (voc) challenge," International Journal of Computer Vision, vol. 88, pp. 303–338, 2010. [Online]. Available: https://api.semanticscholar.org/CorpusID:4246903

  59. [59] Carla: An open urban driving simulator
      A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, "Carla: An open urban driving simulator," in Conference on Robot Learning. PMLR, 2017, pp. 1–16.