pith. machine review for the scientific record.

arxiv: 2604.23105 · v1 · submitted 2026-04-25 · 💻 cs.CV

Recognition: unknown

Transferable Physical-World Adversarial Patches Against Object Detection in Autonomous Driving

Authors on Pith · no claims yet

Pith reviewed 2026-05-08 08:28 UTC · model grok-4.3

classification 💻 cs.CV
keywords adversarial patches · object detection · autonomous driving · transferable attacks · physical adversarial examples · multi-model optimization

The pith

Joint optimization over multiple detectors creates physical patches that transfer to unseen object detectors in driving scenes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents AdvAD, a method for generating physical adversarial patches that attack object detectors in autonomous driving systems. Rather than crafting patches for one model at a time, it optimizes them across several detectors simultaneously to exploit shared weaknesses. The process includes adaptive weighting of each model, plus data augmentation and geometric transforms to simulate real-world variations like viewpoint and lighting changes. Experiments show these patches achieve higher success rates and better transferability than prior attacks when tested both digitally and on physical objects in real environments.
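To make that machinery concrete, here is a minimal sketch of the ensemble-attack pattern the review describes, in PyTorch. The detector wrappers, the paste_patch compositing helper, and every hyperparameter are illustrative assumptions; AdvAD's actual losses and update rules are not given in the abstract.

```python
# Minimal sketch of joint multi-detector patch optimization. Assumes each
# entry of `detectors` maps an image batch to a scalar detection-confidence
# loss for the target object (hypothetical wrappers, not the paper's code).
import torch

def paste_patch(images: torch.Tensor, patch: torch.Tensor) -> torch.Tensor:
    # Composite the patch at the image center; a real attack would align it
    # with the target object's bounding box.
    out = images.clone()
    ph, pw = patch.shape[-2:]
    top = (images.shape[-2] - ph) // 2
    left = (images.shape[-1] - pw) // 2
    out[:, :, top:top + ph, left:left + pw] = patch
    return out

def optimize_patch(detectors, images, steps=500, lr=0.01, size=64):
    theta = torch.zeros(3, size, size, requires_grad=True)  # patch parameters
    opt = torch.optim.Adam([theta], lr=lr)
    n = len(detectors)
    weights = torch.full((n,), 1.0 / n)  # uniform here; AdvAD adapts these
    for _ in range(steps):
        patch = torch.sigmoid(theta)         # keep pixels printable in [0, 1]
        adv = paste_patch(images, patch)
        losses = torch.stack([det(adv) for det in detectors])
        opt.zero_grad()
        (weights * losses).sum().backward()  # one patch, gradients from all models
        opt.step()
    return torch.sigmoid(theta).detach()
```

The point of the pattern is that a single set of patch parameters receives gradients from every detector at once, so the optimizer is pushed toward perturbations that no single model's idiosyncrasies fully explain.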

Core claim

AdvAD optimizes adversarial patches over multiple detection models in a unified framework, adaptively balances model contributions, and enforces robustness to physical variations through data augmentation and geometric transformations, producing patches that outperform prior methods in attack performance and transferability to unseen detectors in both digital and real-world settings.

What carries the argument

A unified multi-model optimization process that jointly learns perturbations while incorporating physical robustness constraints via augmentation and transforms.
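The physical-robustness half is typically implemented in the style of Expectation over Transformation: the attack loss is averaged over randomly sampled viewpoint, scale, and lighting changes, so the patch must survive all of them. A minimal sketch, assuming torchvision's stock transforms stand in for whatever augmentation set the paper actually uses:

```python
# EOT-style robustness sketch. The specific transforms and their ranges are
# assumptions; only the averaging pattern is the point.
import torch
import torchvision.transforms as T

physical_sim = T.Compose([
    T.RandomPerspective(distortion_scale=0.3, p=1.0),  # viewpoint change
    T.RandomAffine(degrees=15, scale=(0.7, 1.3)),      # rotation and distance
    T.ColorJitter(brightness=0.4, contrast=0.3),       # lighting variation
])

def eot_loss(detector, patched_images: torch.Tensor, n_samples: int = 8):
    # Average the attack loss over sampled transforms so the patch has to
    # work across many simulated viewing conditions, not just one.
    return torch.stack(
        [detector(physical_sim(patched_images)) for _ in range(n_samples)]
    ).mean()
```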

If this is right

  • Patches remain effective across different detector architectures without retraining for each one.
  • Effectiveness persists under changes in distance, angle, and lighting that occur in actual driving.
  • The attack success rate exceeds that of existing single-model or non-augmented methods in both simulated and physical tests.
  • The same patch can disrupt perception in multiple autonomous driving systems simultaneously.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Developers of autonomous driving systems may need to test detectors against patches optimized on ensembles rather than isolated models.
  • The multi-model approach could extend to other perception tasks such as lane detection or segmentation if similar joint optimization is used.
  • Physical patch attacks might serve as a practical stress test for deployed vehicles before they enter public roads.

Load-bearing premise

That shared vulnerabilities exist across different detector architectures and that the chosen augmentations sufficiently capture real physical variations so the patch remains effective on new models.

What would settle it

Apply the generated patches to a new object detector architecture never used in training and measure whether the attack success rate in real-world driving tests falls to the level of single-model baselines.
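A minimal sketch of that settling experiment, assuming a hypothetical detect(model, image) that returns the boxes found for the target class; none of these names come from the paper:

```python
# Held-out transfer test: attack success for an evasion patch means the
# target object is no longer detected at all.
def attack_success_rate(model, patched_images):
    misses = sum(1 for img in patched_images if len(detect(model, img)) == 0)
    return misses / len(patched_images)

# held_out_detector was never in the training ensemble. If the ensemble
# patch's rate falls to the single-model patch's rate here, the
# transferability claim fails:
#   attack_success_rate(held_out_detector, ensemble_patched_images)
#   attack_success_rate(held_out_detector, single_model_patched_images)
```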

Figures

Figures reproduced from arXiv: 2604.23105 by Lulu Xue, Minghui Li, Shengshan Hu, Yichen Wang, Zihui Zhu, Ziqi Zhou.

Figure 1. An illustration of attacking object detection models in autonomous …
Figure 2. Overall framework of the proposed AdvAD method. Given an input image, tailored data augmentation is first applied to simulate diverse real-world …
Figure 3. Visualization results in the digital world. The adversarial patch is aligned with the target object region during attack generation. Compared with …
Figure 4. Visualization results in simulated environments. Notably, since the …
Figure 5. Visualization results in the physical world. (a) Physical-world validation under controlled conditions, where adversarial patches are printed and captured …
read the original abstract

Deep learning drives major advances in autonomous driving (AD), where object detectors are central to perception. However, adversarial attacks pose significant threats to the reliability and safety of these systems, with physical adversarial patches representing a particularly potent form of attack. Physical adversarial patch attacks pose severe risks but are usually crafted for a single model, yielding poor transferability to unseen detectors. We propose AdvAD, a transfer-based physical attack against object detection in autonomous driving. Instead of targeting a specific detector, AdvAD optimizes adversarial patches over multiple detection models in a unified framework, encouraging the learned perturbations to capture shared vulnerabilities across architectures. The optimization process adaptively balances model contributions and enforces robustness to physical variations. It further employs data augmentation and geometric transformations to maintain patch effectiveness under diverse physical conditions. Experiments in both digital and real-world settings show that AdvAD consistently outperforms state-of-the-art (SOTA) attacks in performance and transferability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes AdvAD, a transfer-based physical adversarial attack on object detectors for autonomous driving. It jointly optimizes adversarial patches over multiple detection models using an adaptive balancing mechanism for model contributions, combined with data augmentation and geometric transformations to promote robustness under physical variations. The central claim is that this approach yields patches with superior attack performance and transferability to unseen detectors compared to state-of-the-art methods, as demonstrated in both digital and real-world experiments.

Significance. If the experimental results hold and are reproducible, the work would be moderately significant for the computer vision and autonomous driving security communities. It directly tackles the known limitation of poor transferability in single-model physical patch attacks by using multi-model optimization, which is a reasonable empirical strategy. Successful real-world validation could highlight practical risks to AD perception systems and motivate stronger defenses, though the approach builds on established techniques rather than introducing fundamentally new theoretical insights.

major comments (1)
  1. [Abstract] The abstract asserts that 'Experiments in both digital and real-world settings show that AdvAD consistently outperforms state-of-the-art (SOTA) attacks in performance and transferability,' yet provides no quantitative metrics (e.g., attack success rates, mAP drops, or transfer rates), named baselines, model architectures, or dataset details. This makes the central empirical claim difficult to evaluate from the summary alone; the experimental section must supply these specifics with clear comparisons to substantiate outperformance.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We agree that enhancing the abstract with concrete quantitative details will improve clarity and have revised it accordingly. Our point-by-point response to the major comment is provided below.

read point-by-point responses
  1. Referee: [Abstract] The abstract asserts that 'Experiments in both digital and real-world settings show that AdvAD consistently outperforms state-of-the-art (SOTA) attacks in performance and transferability,' yet provides no quantitative metrics (e.g., attack success rates, mAP drops, or transfer rates), named baselines, model architectures, or dataset details. This makes the central empirical claim difficult to evaluate from the summary alone; the experimental section must supply these specifics with clear comparisons to substantiate outperformance.

    Authors: We acknowledge the validity of this observation. The original abstract was intentionally concise, but we agree that including key quantitative highlights would better support the central claim. In the revised manuscript, we have updated the abstract to reference specific metrics drawn from the experimental results (e.g., attack success rates and mAP reductions relative to baselines), the primary SOTA methods compared, the detector architectures used for multi-model optimization, and the datasets involved. The experimental section already contains the full tables, figures, and statistical comparisons demonstrating outperformance in both digital and physical settings; these are now more explicitly tied back to the abstract for easier evaluation. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper describes an empirical optimization procedure for generating physical adversarial patches by jointly training over multiple object detectors, with adaptive loss balancing and standard data augmentations for physical robustness. No equations, derivations, or uniqueness theorems are presented that reduce any claimed prediction or result to a fitted parameter or self-citation by construction. The central claims of improved transferability rest entirely on experimental validation rather than internal self-referential structure, rendering the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on standard assumptions from adversarial machine learning about the existence of transferable perturbations across models; the method introduces no new mathematical entities or constants beyond typical optimization hyperparameters.

free parameters (1)
  • adaptive balancing weights
    The optimization adaptively balances contributions from different detection models, implying parameters that are learned or tuned during the process; one plausible update rule is sketched after this ledger.
axioms (1)
  • domain assumption: Multiple object detectors share exploitable vulnerabilities that can be captured in a single patch perturbation.
    This is invoked to justify the multi-model optimization framework for encouraging shared vulnerabilities.
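One plausible form for those balancing weights, sketched under the assumption that the joint loop exposes per-detector losses; the paper does not disclose its actual rule. A softmax over current losses shifts optimization effort toward detectors the patch is not yet fooling:

```python
import torch

def adaptive_weights(losses: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    # Higher loss = detector still resists the patch = larger weight.
    # detach(): the weights steer the objective but receive no gradient.
    return torch.softmax(losses.detach() / temperature, dim=0)
```

The temperature is exactly the kind of tuning knob this ledger counts as a free parameter.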

pith-pipeline@v0.9.0 · 5467 in / 1367 out tokens · 58020 ms · 2026-05-08T08:28:31.129457+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

59 extracted references · 7 canonical work pages · 2 internal anchors

  1. [1] Towards Reliable Forgetting: A Survey on Machine Unlearning Verification
     L. Xue, S. Hu, W. Lu, Y. Shen, D. Li, P. Guo, Z. Zhou, M. Li, Y. Zhang, and L. Y. Zhang, "Towards reliable forgetting: A survey on machine unlearning verification," arXiv preprint arXiv:2506.15115, 2025.

  2. [2] Ufvideo: Towards unified fine-grained video cooperative understanding with large language models
     H. Pan, C. Wei, D. Liang, Z. Huang, P. Gao, Z. Zhou, L. Xue, P. Yan, X. Wei, M. Li et al., "Ufvideo: Towards unified fine-grained video cooperative understanding with large language models," in Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR'26), 2026.

  3. [3] Tattoo: Training-free aesthetic-aware outfit recommendation
     Y. Wu, X. Hu, Z. Zhou, and H. Lu, "Tattoo: Training-free aesthetic-aware outfit recommendation," arXiv preprint arXiv:2509.23242, 2025.

  4. [4] Deep reinforcement learning framework for autonomous driving
     A. E. Sallab, M. Abdou, E. Perot, and S. K. Yogamani, "Deep reinforcement learning framework for autonomous driving," in Autonomous Vehicles and Machines, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:12064877

  5. [5] Object detection and recognition for assistive robots: Experimentation and implementation
     E. Martinez-Martin and A. P. Del Pobil, "Object detection and recognition for assistive robots: Experimentation and implementation," IEEE Robotics & Automation Magazine, vol. 24, no. 3, pp. 123–138, 2017.

  6. [6] Darkhash: A data-free backdoor attack against deep hashing
     Z. Zhou, M. Deng, Y. Song, H. Zhang, W. Wan, S. Hu, M. Li, L. Y. Zhang, and D. Yao, "Darkhash: A data-free backdoor attack against deep hashing," IEEE Transactions on Information Forensics and Security, 2025.

  7. [7] Badhash: Invisible backdoor attacks against deep hashing with clean label
     S. Hu, Z. Zhou, Y. Zhang, L. Y. Zhang, Y. Zheng, Y. He, and H. Jin, "Badhash: Invisible backdoor attacks against deep hashing with clean label," in Proceedings of the 30th ACM International Conference on Multimedia (ACM MM'22), 2022, pp. 678–686.

  8. [8] Mars: A malignity-aware backdoor defense in federated learning
     W. Wan, Y. Ning, Z. Huang, C. Hong, S. Hu, Z. Zhou, Y. Zhang, T. Zhu, W. Zhou, and L. Y. Zhang, "Mars: A malignity-aware backdoor defense in federated learning," in Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS'25), 2025.

  9. [9] Detector collapse: Backdooring object detection to catastrophic overload or blindness
     H. Zhang, S. Hu, Y. Wang, L. Y. Zhang, Z. Zhou, X. Wang, Y. Zhang, and C. Chen, "Detector collapse: Backdooring object detection to catastrophic overload or blindness," in Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI'24), 2024.

  10. [10] Test-time backdoor detection for object detection models
      H. Zhang, Y. Wang, S. Yan, C. Zhu, Z. Zhou, L. Hou, S. Hu, M. Li, Y. Zhang, and L. Y. Zhang, "Test-time backdoor detection for object detection models," in Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR'25), 2025, pp. 24377–24386.

  11. [11] Trojanrobot: Backdoor attacks against robotic manipulation in the physical world
      X. Wang, H. Pan, H. Zhang, M. Li, S. Hu, Z. Zhou, L. Xue, P. Guo, Y. Wang, W. Wan et al., "Trojanrobot: Backdoor attacks against robotic manipulation in the physical world," arXiv e-prints, arXiv:2411, 2024.

  12. [12] Detecting and corrupting convolution-based unlearnable examples
      M. Li, X. Wang, Z. Yu, S. Hu, Z. Zhou, L. Zhang, and L. Y. Zhang, "Detecting and corrupting convolution-based unlearnable examples," in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI'25), vol. 39, no. 17, 2025, pp. 18403–18411.

  13. [13] Eclipse: Expunging clean-label indiscriminate poisons via sparse diffusion purification
      X. Wang, S. Hu, Y. Zhang, Z. Zhou, L. Y. Zhang, P. Xu, W. Wan, and H. Jin, "Eclipse: Expunging clean-label indiscriminate poisons via sparse diffusion purification," in European Symposium on Research in Computer Security. Springer, 2024, pp. 146–166.

  14. [14] Spa-vlm: Stealthy poisoning attacks on rag-based vlm
      L. Yu, Y. Zhang, Z. Zhou, Y. Wu, W. Wan, M. Li, S. Hu, P. Xiaobing, and J. Wang, "Spa-vlm: Stealthy poisoning attacks on rag-based vlm," arXiv preprint arXiv:2505.23828, 2025.

  15. [15] Unlearnable 3d point clouds: Class-wise transformation is all you need
      X. Wang, M. Li, W. Liu, H. Zhang, S. Hu, Y. Zhang, Z. Zhou, and H. Jin, "Unlearnable 3d point clouds: Class-wise transformation is all you need," in Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS'24), vol. 37, 2024, pp. 99404–99432.

  16. [16] Securely fine-tuning pre-trained encoders against adversarial examples
      Z. Zhou, M. Li, W. Liu, S. Hu, Y. Zhang, W. Wan, L. Xue, L. Y. Zhang, D. Yao, and H. Jin, "Securely fine-tuning pre-trained encoders against adversarial examples," in Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP'24), 2024.

  17. [17] Advclip: Downstream-agnostic adversarial examples in multimodal contrastive learning
      Z. Zhou, S. Hu, M. Li, H. Zhang, Y. Zhang, and H. Jin, "Advclip: Downstream-agnostic adversarial examples in multimodal contrastive learning," in Proceedings of the 31st ACM International Conference on Multimedia (ACM MM'23), 2023, pp. 6311–6320.

  18. [18] Downstream-agnostic adversarial examples
      Z. Zhou, S. Hu, R. Zhao, Q. Wang, L. Y. Zhang, J. Hou, and H. Jin, "Downstream-agnostic adversarial examples," in Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV'23), 2023, pp. 4345–4355.

  19. [19] Numbod: A spatial-frequency fusion attack against object detectors
      Z. Zhou, B. Li, Y. Song, S. Hu, W. Wan, L. Y. Zhang, D. Yao, and H. Jin, "Numbod: A spatial-frequency fusion attack against object detectors," in Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence (AAAI'25), 2025.

  20. [20] Darksam: Fooling segment anything model to segment nothing
      Z. Zhou, Y. Song, M. Li, S. Hu, X. Wang, L. Y. Zhang, D. Yao, and H. Jin, "Darksam: Fooling segment anything model to segment nothing," in Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS'24), 2024.

  21. [21] advpattern: Physical-world attacks on deep person re-identification via adversarially transformable patterns
      Z. Wang, S. Zheng, M. Song, Q. Wang, A. Rahimpour, and H. Qi, "advpattern: Physical-world attacks on deep person re-identification via adversarially transformable patterns," in IEEE/CVF ICCV 2019, 2019, pp. 8340–8349. [Online]. Available: https://api.semanticscholar.org/CorpusID:201650275

  22. [22] Out-of-bounding-box triggers: A stealthy approach to cheat object detectors
      T. Lin, L. Yu, G. Jin, R. Li, P. Wu, and L. Zhang, "Out-of-bounding-box triggers: A stealthy approach to cheat object detectors," ArXiv, vol. abs/2410.10091, 2024. [Online]. Available: https://api.semanticscholar.org/CorpusID:273345346

  23. [23] T-sea: Transfer-based self-ensemble attack on object detection
      H. Huang, Z. Chen, H. Chen, Y. Wang, and K. A. Zhang, "T-sea: Transfer-based self-ensemble attack on object detection," in IEEE/CVF CVPR 2023, 2023, pp. 20514–20523.

  24. [24] Ensemble-based blackbox attacks on dense prediction
      Z. Cai, Y. Tan, and M. S. Asif, "Ensemble-based blackbox attacks on dense prediction," in IEEE/CVF CVPR 2023, 2023, pp. 4045–4055. [Online]. Available: https://api.semanticscholar.org/CorpusID:257766952

  25. [25] Doepatch: Dynamically optimized ensemble model for adversarial patches generation
      W. Tan, Y. Li, C. Zhao, Z. Liu, and Q. Pan, "Doepatch: Dynamically optimized ensemble model for adversarial patches generation," IEEE Transactions on Information Forensics and Security, 2024.

  26. [26] L-hawk: A controllable physical adversarial patch against a long-distance target
      T. Liu, Y. Liu, Z. Ma, T. Yang, X. Liu, T. Li, and J. Ma, "L-hawk: A controllable physical adversarial patch against a long-distance target," in NDSS, 2025.

  27. [27] Blackboxbench: A comprehensive benchmark of black-box adversarial attacks
      M. Zheng, X. Yan, Z. Zhu, H. Chen, and B. Wu, "Blackboxbench: A comprehensive benchmark of black-box adversarial attacks," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.

  28. [28] Histograms of oriented gradients for human detection
      N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in IEEE CVPR 2005, vol. 1. IEEE, 2005, pp. 886–893.

  29. [29] Object recognition from local scale-invariant features
      D. G. Lowe, "Object recognition from local scale-invariant features," in Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2. IEEE, 1999, pp. 1150–1157.

  30. [30] A decision-theoretic generalization of on-line learning and an application to boosting
      Y. Freund and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119–139, 1997.

  31. [31] Spatial pyramid pooling in deep convolutional networks for visual recognition
      K. He, X. Zhang, S. Ren, and J. Sun, "Spatial pyramid pooling in deep convolutional networks for visual recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1904–1916, 2015.

  32. [32] Faster r-cnn: Towards real-time object detection with region proposal networks
      S. Ren, K. He, R. Girshick, and J. Sun, "Faster r-cnn: Towards real-time object detection with region proposal networks," in Advances in Neural Information Processing Systems, vol. 28, 2015.

  33. [33] Cascade r-cnn: Delving into high quality object detection
      Z. Cai and N. Vasconcelos, "Cascade r-cnn: Delving into high quality object detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6154–6162.

  34. [34] YOLOv11: An overview of the key architectural enhancements
      R. Khanam and M. Hussain, "Yolov11: An overview of the key architectural enhancements," arXiv preprint arXiv:2410.17725, 2024.

  35. [35] Ssd: Single shot multibox detector
      W. Liu, D. Anguelov, and D. Erhan, "Ssd: Single shot multibox detector," in European Conference on Computer Vision. Springer, 2016, pp. 21–37.

  36. [36] Focal loss for dense object detection
      T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2980–2988.

  37. [37] End-to-end object detection with transformers
      N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, "End-to-end object detection with transformers," in European Conference on Computer Vision. Springer, 2020, pp. 213–229.

  38. [38] Vanish into thin air: Cross-prompt universal adversarial attacks for sam2
      Z. Zhou, Y. Hu, Y. Song, Z. Li, S. Hu, L. Y. Zhang, D. Yao, L. Zheng, and H. Jin, "Vanish into thin air: Cross-prompt universal adversarial attacks for sam2," in Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS'25), 2025.

  39. [39] Advedm: Fine-grained adversarial attack against vlm-based embodied agents
      Y. Wang, H. Zhang, P. Hewen, Z. Zhou, X. Wang, P. Guo, L. Xue, S. Hu, M. Li, L. Y. Zhang, and D. Yao, "Advedm: Fine-grained adversarial attack against vlm-based embodied agents," in Proceedings of the 39th Annual Conference on Neural Information Processing Systems (NeurIPS'25), 2025.

  40. [40] Breaking barriers in physical-world adversarial examples: Improving robustness and transferability via robust feature
      Y. Wang, Y. Chou, Z. Zhou, H. Zhang, W. Wan, S. Hu, and M. Li, "Breaking barriers in physical-world adversarial examples: Improving robustness and transferability via robust feature," in Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence (AAAI'25), 2025.

  41. [41] Pb-uap: Hybrid universal adversarial attack for image segmentation
      Y. Song, Z. Zhou, M. Li, X. Wang, M. Deng, W. Wan, S. Hu, and L. Y. Zhang, "Pb-uap: Hybrid universal adversarial attack for image segmentation," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'25), 2025.

  42. [42] Transferable adversarial facial images for privacy protection
      M. Li, J. Wang, H. Zhang, Z. Zhou, S. Hu, and X. Pei, "Transferable adversarial facial images for privacy protection," in Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM'24), 2024, pp. 10649–10658.

  43. [43] Segtrans: Transferable adversarial examples for segmentation models
      Y. Song, Z. Zhou, Q. Lu, H. Zhang, Y. Hu, L. Xue, S. Hu, M. Li, and L. Y. Zhang, "Segtrans: Transferable adversarial examples for segmentation models," IEEE Transactions on Multimedia, 2025.

  44. [44] Erosion attack for adversarial training to enhance semantic segmentation robustness
      Y. Song, Z. Zhou, M. Deng, Y. Hu, S. Hu, M. Li, and L. Y. Zhang, "Erosion attack for adversarial training to enhance semantic segmentation robustness," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'26), 2026.

  45. [45] Towards evaluating the robustness of neural networks
      N. Carlini and D. A. Wagner, "Towards evaluating the robustness of neural networks," in 2017 IEEE Symposium on Security and Privacy (SP), 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:2893830

  46. [46] Adversarial patch
      T. B. Brown, D. Mané, A. Roy, M. Abadi, and J. Gilmer, "Adversarial patch," ArXiv, vol. abs/1712.09665, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:13198654

  47. [47] Anti-adversarially manipulated attributions for weakly and semi-supervised semantic segmentation
      J. Lee, E. Kim, and S. Yoon, "Anti-adversarially manipulated attributions for weakly and semi-supervised semantic segmentation," in 2021 IEEE/CVF CVPR, 2021, pp. 4070–4078. [Online]. Available: https://api.semanticscholar.org/CorpusID:232240142

  48. [48] Patchsegdet: Attack-agnostic detection of physical adversarial patches in face recognition systems
      Z. Shen, Q. Li, X. Zhang, Y. Cai, X. Chen, P. An, H. Weng, and Y. Liu, "Patchsegdet: Attack-agnostic detection of physical adversarial patches in face recognition systems," in 2025 IEEE International Conference on Multimedia and Expo (ICME), 2025, pp. 1–6.

  49. [49] Two-stage optimized unified adversarial patch for attacking visible-infrared cross-modal detectors in the physical world
      C. Hu, W. Shi, W. Yao, T. Jiang, L. Tian, and W. Li, "Two-stage optimized unified adversarial patch for attacking visible-infrared cross-modal detectors in the physical world," Applied Soft Computing, vol. 171, p. 112818, 2025.

  50. [50] Adversarial universal stickers: Universal perturbation attacks on traffic sign using stickers
      A. Etim and J. Szefer, "Adversarial universal stickers: Universal perturbation attacks on traffic sign using stickers," arXiv preprint arXiv:2502.18724, 2025.

  51. [51] Active: Towards highly transferable 3d physical camouflage for universal and robust vehicle evasion
      N. Suryanto, Y. Kim, and H. T. Larasati, "Active: Towards highly transferable 3d physical camouflage for universal and robust vehicle evasion," in 2023 IEEE/CVF ICCV, 2023, pp. 4282–4291. [Online]. Available: https://api.semanticscholar.org/CorpusID:260887446

  52. [52] A unified framework for adversarial patch attacks against visual 3d object detection in autonomous driving
      J. Wang, F. Li, and L. He, "A unified framework for adversarial patch attacks against visual 3d object detection in autonomous driving," IEEE Transactions on Circuits and Systems for Video Technology, 2025.

  53. [53] Adversarial translucent patch: a robust physical attack technique against object detectors
      K. Tiliwalidi, C. Hu, W. Shi, G. Lu, and H. Wu, "Adversarial translucent patch: a robust physical attack technique against object detectors," Pattern Analysis and Applications, vol. 28, no. 4, p. 157, 2025.

  54. [54] Fooling automated surveillance cameras: adversarial patches to attack person detection
      S. Thys, W. Van Ranst, and T. Goedemé, "Fooling automated surveillance cameras: adversarial patches to attack person detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.

  55. [55] Naturalistic physical adversarial patch for object detectors
      Y.-C.-T. Hu, B.-H. Kung, D. S. Tan, J.-C. Chen, K.-L. Hua, and W.-H. Cheng, "Naturalistic physical adversarial patch for object detectors," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 7848–7857.

  56. [56] Napguard: Towards detecting naturalistic adversarial patches
      S. Wu, J. Wang, J. Zhao, Y. Wang, and X. Liu, "Napguard: Towards detecting naturalistic adversarial patches," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 24367–24376.

  57. [57] Microsoft coco: Common objects in context
      T.-Y. Lin, M. Maire, and S. J. Belongie, "Microsoft coco: Common objects in context," in European Conference on Computer Vision, 2014. [Online]. Available: https://api.semanticscholar.org/CorpusID:14113767

  58. [58] The pascal visual object classes (voc) challenge
      M. Everingham and L. V. Gool, "The pascal visual object classes (voc) challenge," International Journal of Computer Vision, vol. 88, pp. 303–338, 2010. [Online]. Available: https://api.semanticscholar.org/CorpusID:4246903

  59. [59] Carla: An open urban driving simulator
      A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, "Carla: An open urban driving simulator," in Conference on Robot Learning. PMLR, 2017, pp. 1–16.