Recognition: no theorem link
Physical Adversarial Attacks on AI Surveillance Systems: Detection, Tracking, and Visible–Infrared Evasion
Pith reviewed 2026-05-10 19:12 UTC · model grok-4.3
The pith
Surveillance system robustness to physical attacks must be evaluated over time and across sensors rather than in isolated frames.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper organizes prior work on physical attacks using a four-part taxonomy of temporal persistence, sensing modality, carrier realism, and system-level objective. It concludes that surveillance robustness cannot be judged reliably from isolated per-frame benchmarks alone and must instead be examined as a system problem unfolding over time, across sensors, and under realistic physical deployment constraints.
What carries the argument
A four-part taxonomy of temporal persistence, sensing modality, carrier realism, and system-level objective that re-frames the literature around surveillance-specific questions rather than isolated image benchmarks.
If this is right
- An attack that suppresses a detector in one frame may lose effect if multi-object tracking recovers the person's identity across subsequent frames.
- Results obtained only on visible-light images may not transfer to systems that fuse visible and infrared inputs at night.
- Conspicuous printed patches imply a different threat model than wearable or selectively activated carriers that blend into normal clothing.
- Evaluation protocols need to incorporate distance robustness, camera-pipeline variations, and identity-level metrics instead of per-frame accuracy alone.
- Activation-aware testing becomes necessary when carriers can be turned on or off during operation.
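The gap between per-frame and identity-level evaluation in the first bullet can be made concrete with a toy model. The sketch below is hypothetical: the detection sequence, the `max_gap` tolerance, and the tracker model (identity survives unless detections are missed for more than `max_gap` consecutive frames) are illustrative assumptions, not a method from the paper.

```python
# Hedged sketch: why a per-frame fooling rate can overstate evasion once a
# tracker re-associates identity. All data and parameters are hypothetical.

def per_frame_fooling_rate(detected):
    """Fraction of frames in which the attack suppressed the detector."""
    return detected.count(False) / len(detected)

def identity_recovered(detected, max_gap):
    """Toy tracker model: identity survives unless detections are missed
    for more than max_gap consecutive frames."""
    gap = 0
    for d in detected:
        gap = 0 if d else gap + 1
        if gap > max_gap:
            return False
    return True

# The attack suppresses detection in 6 of 10 frames, but never for more
# than 2 frames in a row, so a tracker with a 3-frame tolerance recovers.
frames = [False, True, False, False, True, False, True, False, False, True]
print(per_frame_fooling_rate(frames))          # 0.6
print(identity_recovered(frames, max_gap=3))   # True
```

A 60% per-frame fooling rate looks strong, yet under this tracker model the person's identity is never lost — the discrepancy the review highlights.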
Where Pith is reading between the lines
- Future attack generation methods could prioritize maintaining evasion consistency across frame sequences rather than optimizing single-frame fooling rates.
- Surveillance system designers might add explicit cross-modal consistency checks to reduce the impact of dual-visible-infrared attacks.
- The taxonomy could guide standardized test suites that include realistic carrier motion and selective activation patterns.
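The cross-modal consistency check speculated about above could be as simple as flagging detections that appear in one modality but have no counterpart in the other. This is a minimal sketch, not a defense from the paper; the box format, IoU threshold, and detections are all hypothetical.

```python
# Hedged sketch of a cross-modal consistency check: if the visible and
# infrared channels disagree about a person's presence in the same region,
# flag the frame for review. All inputs here are invented.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def modal_disagreements(vis_boxes, ir_boxes, thresh=0.3):
    """Boxes seen in one modality with no counterpart in the other."""
    unmatched_vis = [v for v in vis_boxes
                     if not any(iou(v, r) >= thresh for r in ir_boxes)]
    unmatched_ir = [r for r in ir_boxes
                    if not any(iou(v, r) >= thresh for v in vis_boxes)]
    return unmatched_vis + unmatched_ir

# An attack that hides a person only in the visible channel leaves an
# orphan infrared detection, which the check surfaces.
vis = []                        # visible detector fooled
ir = [(100, 50, 160, 200)]      # thermal detector still fires
print(modal_disagreements(vis, ir))  # [(100, 50, 160, 200)]
```

A dual-modal attack like those the review discusses would have to suppress both channels simultaneously to pass such a check, which is exactly why the taxonomy treats sensing modality as a separate axis.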
Load-bearing premise
The four-part taxonomy and the selected prior work adequately represent the central technical questions in deployed surveillance settings.
What would settle it
A controlled deployment study in which single-frame attack success rates predict end-to-end tracking or multi-modal evasion performance as accurately as or better than full time-series, cross-sensor evaluations.
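One way such a study could be scored: rank a set of attacks by single-frame success and by end-to-end track-level evasion, then check whether the rankings agree. The sketch below uses a plain Spearman rank correlation (no tie handling) on invented numbers; the attack scores are hypothetical illustrations only.

```python
# Hedged sketch of the settling test: across several hypothetical attacks,
# does single-frame success rate rank attacks the same way as end-to-end
# track-level evasion? All numbers are invented for illustration.

def rank(values):
    """0-based ranks of values, ties broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

def spearman(x, y):
    """Spearman rank correlation (no tie correction)."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Per-frame fooling rate vs. fraction of tracks whose identity was lost,
# for five hypothetical attacks. A low correlation would support the
# review's claim that per-frame benchmarks mispredict system-level evasion.
per_frame = [0.9, 0.8, 0.7, 0.6, 0.5]
track_evasion = [0.2, 0.6, 0.1, 0.5, 0.3]
print(round(spearman(per_frame, track_evasion), 2))  # -0.1
```

If single-frame success predicted system-level evasion well, the correlation would be near 1; a value near zero, as in this invented example, is the pattern that would favor full time-series, cross-sensor evaluation.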
Original abstract
Physical adversarial attacks are increasingly studied in settings that resemble deployed surveillance systems rather than isolated image benchmarks. In these settings, person detection, multi-object tracking, visible–infrared sensing, and the practical form of the attack carrier all matter at once. This changes how the literature should be read. A perturbation that suppresses a detector in one frame may have limited practical effect if identity is recovered over time; an RGB-only result may say little about night-time systems that rely on visible and thermal inputs together; and a conspicuous patch can imply a different threat model from a wearable or selectively activated carrier. This paper reviews physical attacks from that surveillance-oriented viewpoint. Rather than attempting a complete catalogue of all physical attacks in computer vision, we focus on the technical questions that become central in surveillance: temporal persistence, sensing modality, carrier realism, and system-level objective. We organize prior work through a four-part taxonomy and discuss how recent results on multi-object tracking, dual-modal visible–infrared evasion, and controllable clothing reflect a broader change in the field. We also summarize evaluation practices and unresolved gaps, including distance robustness, camera-pipeline variation, identity-level metrics, and activation-aware testing. The resulting picture is that surveillance robustness cannot be judged reliably from isolated per-frame benchmarks alone; it has to be examined as a system problem unfolding over time, across sensors, and under realistic physical deployment constraints.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper reviews physical adversarial attacks on AI surveillance systems, arguing that robustness must be assessed as a system-level problem involving temporal persistence, multi-sensor modalities, carrier realism, and overall objectives rather than isolated per-frame benchmarks. It organizes prior work via a four-part taxonomy, discusses recent results on multi-object tracking, visible-infrared dual-modal evasion, and controllable clothing, and identifies gaps in evaluation practices such as distance robustness, camera-pipeline variation, identity-level metrics, and activation-aware testing.
Significance. If the taxonomy is comprehensive and the selected literature accurately illustrates the shift toward realistic surveillance constraints, the review could usefully reframe how the community evaluates physical attacks, encouraging more system-oriented benchmarks and defenses that account for tracking recovery, cross-modal sensing, and practical deployment factors.
Minor comments (2)
- [Introduction] The abstract and introduction motivate the taxonomy well, but the manuscript should include an explicit statement of literature selection criteria (e.g., search terms, time window, inclusion of only peer-reviewed works) to allow readers to assess completeness.
- [Evaluation Practices] In the discussion of evaluation practices, the claim that identity-level metrics are underused would be strengthened by citing at least two concrete examples from the reviewed tracking papers where per-frame mAP was reported but track-level identity preservation was not.
Simulated Author's Rebuttal
We thank the referee for their accurate summary of the manuscript and for recommending minor revision. The description correctly reflects our focus on reframing physical adversarial attacks through a surveillance-oriented lens that prioritizes temporal persistence, multi-modal sensing, carrier realism, and system-level objectives over isolated per-frame metrics.
Circularity Check
No significant circularity
full rationale
The paper is a literature review that proposes a four-part taxonomy (temporal persistence, sensing modality, carrier realism, system-level objective) to organize prior work on physical adversarial attacks in surveillance contexts. No mathematical derivations, equations, fitted parameters, or predictions appear; the central claim that isolated per-frame benchmarks are insufficient follows directly from the taxonomy's framing and external citations rather than any internal construction or self-referential reduction. No self-citation load-bearing steps, ansatz smuggling, or renaming of known results are present.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The four-part taxonomy (temporal persistence, sensing modality, carrier realism, system-level objective) captures the key technical questions that become central in surveillance settings.
Reference graph
Works this paper leans on
- [1] Naveed Akhtar and Ajmal Mian. Threat of adversarial attacks on deep learning in computer vision: A survey. IEEE Access, 6:14410–14430, 2018.
- [2] Naveed Akhtar, Ajmal Mian, Navid Kardan, and Mubarak Shah. Advances in adversarial attacks and defenses in computer vision: A survey. IEEE Access, 9:155161–155196, 2021.
- [3] Anish Athalye, Logan Engstrom, Andrew Ilyas, and Kevin Kwok. Synthesizing robust adversarial examples. In International Conference on Machine Learning, pages 284–293, 2018.
- [4] Tom B. Brown, Dandelion Mané, Aurko Roy, Martín Abadi, and Justin Gilmer. Adversarial patch. arXiv preprint arXiv:1712.09665, 2017.
- [5] Shang-Tse Chen, Cory Cornelius, Jason Martin, and Duen Horng Chau. ShapeShifter: Robust physical adversarial attack on Faster R-CNN object detector. arXiv preprint arXiv:1804.05810, 2018.
- [6] Xuesong Chen, Canmiao Fu, Feng Zheng, Yong Zhao, Hongsheng Li, Ping Luo, and Guo-Jun Qi. A unified multi-scenario attacking network for visual object tracking. Proceedings of the AAAI Conference on Artificial Intelligence, 35(2):1097–1104, 2021.
- [7] Zhi Cheng, Zhanhao Hu, Yuqiu Liu, Jianmin Li, Hang Su, and Xiaolin Hu. Full-distance evasion of pedestrian detectors. 2024.
- [8] Li Ding, Yongwei Wang, Kaiwen Yuan, Minyang Jiang, Ping Wang, Hua Huang, and Z. Jane Wang. Towards universal physical attacks on single object tracking. Proceedings of the AAAI Conference on Artificial Intelligence, 35(2):1236–1245, 2021.
- [9] Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. Robust physical-world attacks on deep learning visual classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1625–1634, 2018.
- [10] Junbin Fang, You Jiang, Canjian Jiang, Zoe L. Jiang, Chuanyi Liu, and Siu-Ming Yiu. State-of-the-art optical-based physical adversarial attacks for deep learning computer vision systems. Expert Systems with Applications, 252:123761, 2024.
- [11] Amira Guesmi, Ruitian Ding, Muhammad Abdullah Hanif, Ihsen Alouani, and Muhammad Shafique. DAP: A dynamic adversarial patch for evading person detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24595–24604, 2024.
- [12] Nabeel Hingun, Chawin Sitawarin, Jerry Li, and David Wagner. REAP: A large-scale realistic adversarial patch benchmark. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7860–7871, 2023.
- [13] Chengyin Hu, Weiwen Shi, Tingsong Jiang, Wen Yao, Ling Tian, Xiaoqian Chen, Jingzhi Zhou, and Wen Li. Adversarial infrared blocks: A multi-view black-box attack to thermal infrared detectors in physical world. Neural Networks, 175:106310, 2024.
- [14] Chengyin Hu, Weiwen Shi, Wen Yao, Tingsong Jiang, Ling Tian, Xiaoqian Chen, and Wen Li. Adversarial infrared curves: An attack on infrared pedestrian detectors in the physical world. Neural Networks, 178:106459, 2024.
- [15] Yu-Chih-Tuan Hu, Bo-Han Kung, Daniel Stanley Tan, Jun-Cheng Chen, Kai-Lung Hua, and Wen-Huang Cheng. Naturalistic physical adversarial patch for object detectors. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7848–7857, 2021.
- [16] Zhanhao Hu, Siyuan Huang, Xiaopei Zhu, Fuchun Sun, Bo Zhang, and Xiaolin Hu. Adversarial texture for fooling person detectors in the physical world. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13307–13316, 2022.
- [17] Shuai Jia, Yibing Song, Chao Ma, and Xiaokang Yang. IoU attack: Towards temporally coherent black-box adversarial attack for visual object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6709–6718, 2021.
- [18] Xinyu Jia, Chuang Zhu, Minzhen Li, Wenqi Tang, and Wenli Zhou. LLVIP: A visible-infrared paired dataset for low-light vision. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3496–3504, 2021.
- [19] Delv Lin, Qi Chen, Chengyu Zhou, and Kun He. Tracklet-switch adversarial attack against pedestrian multi-object tracking trackers. arXiv preprint arXiv:2111.08954, 2021.
- [20] Xin Liu, Huanrui Yang, Ziwei Liu, Linghao Song, Hai Li, and Yiran Chen. DPatch: An adversarial patch attack on object detectors. arXiv preprint arXiv:1806.02299, 2018.
- [21] Jiahuan Long, Tingsong Jiang, Wen Yao, Shuai Jia, Weijia Zhang, Weien Zhou, Chao Ma, and Xiaoqian Chen. PapMOT: Exploring adversarial patch attack against multiple object tracking. In European Conference on Computer Vision, pages 128–144, 2024.
- [22] Jiahuan Long, Wen Yao, Tingsong Jiang, Jiacheng Hou, Shuai Jia, Junqi Wu, Xiaoya Zhang, Xiaohu Zheng, and Chao Ma. CDUPatch: Color-driven universal adversarial patch attack for dual-modal visible-infrared detectors. In Proceedings of the 33rd ACM International Conference on Multimedia, pages 1462–1470, 2025.
- [23] Jiahuan Long, Tingsong Jiang, Hanqing Liu, Chao Ma, and Wen Yao. Thermally activated dual-modal adversarial clothing against AI surveillance systems. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026.
- [24] Wenhan Luo, Junliang Xing, Anton Milan, Xiaoqin Zhang, Wei Liu, and Tae-Kyun Kim. Multiple object tracking: A literature review. Artificial Intelligence, 293:103448, 2021.
- [25] Kien Nguyen, Tharindu Fernando, Clinton Fookes, and Sridha Sridharan. Physical adversarial attacks for surveillance: A survey. arXiv preprint arXiv:2305.01074, 2023.
- [26] Haibo Pang, Rongqi Ma, Jie Su, Cheng-Ming Liu, Yufei Gao, and Qun Jin. Blinding and blurring the multi-object tracker with adversarial perturbations. Neural Networks, 178:106408, 2024.
- [27] Jia Tan, Nan Ji, Haidong Xie, and Xueshuang Xiang. Legitimate adversarial patches: Evading human eyes and detection models in the physical world. In Proceedings of the 29th ACM International Conference on Multimedia, pages 5307–5315, 2021.
- [28] Simen Thys, Wiebe Van Ranst, and Toon Goedemé. Fooling automated surveillance cameras: Adversarial patches to attack person detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 49–55, 2019.
- [29] Derui Wang, Chaoran Li, Sheng Wen, Qing-Long Han, Surya Nepal, Xiangyu Zhang, and Yang Xiang. Daedalus: Breaking non-maximum suppression in object detection via adversarial examples. arXiv preprint arXiv:1902.02067, 2019.
- [30] Donghua Wang, Wen Yao, Tingsong Jiang, Guijian Tang, and Xiaoqian Chen. A survey on physical adversarial attack in computer vision. arXiv preprint arXiv:2209.14262, 2022.
- [31] Jiakai Wang, Xianglong Liu, Jin Hu, Donghua Wang, Siyang Wu, Tingsong Jiang, Yuanfang Guo, Aishan Liu, and Jiantao Zhou. Adversarial examples in the physical world: A survey. arXiv preprint arXiv:2311.01473, 2023.
- [32] Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin'ichi Satoh, Luc Van Gool, and Zheng Wang. Physical adversarial attack meets computer vision: A decade survey. arXiv preprint arXiv:2209.15179, 2022.
- [33] Hui Wei, Zhixiang Wang, Xuemei Jia, Yinqiang Zheng, Hao Tang, Shin'ichi Satoh, and Zheng Wang. HotCold Block: Fooling thermal infrared detectors with a novel wearable design. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 2907–2915, 2023.
- [34] Hui Wei, Zhixiang Wang, Kewei Zhang, Jiaqi Hou, Yuanwei Liu, Hao Tang, and Zheng Wang. Revisiting adversarial patches for designing camera-agnostic attacks against person detection. Advances in Neural Information Processing Systems, 37, 2024.
- [35] Xingxing Wei, Yao Huang, Yitong Sun, and Jie Yu. Unified adversarial patch for visible-infrared cross-modal attacks in the physical world. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
- [36] Xingxing Wei, Jie Yu, and Yao Huang. Infrared adversarial patches with learnable shapes and locations in the physical world. International Journal of Computer Vision, 132(6):1928–1944, 2024.
- [37] Rey Reza Wiyatno and Anqi Xu. Physical adversarial textures that fool visual object tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4824–4833, 2019.
- [38] Nicolai Wojke, Alex Bewley, and Dietrich Paulus. Simple online and realtime tracking with a deep association metric. In IEEE International Conference on Image Processing, pages 3645–3649, 2017.
- [39] Zuxuan Wu, Ser-Nam Lim, Larry S. Davis, and Tom Goldstein. Making an invisibility cloak: Real world adversarial attacks on object detectors. In European Conference on Computer Vision, pages 1–17, 2020.
- [40] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, and Alan Yuille. Adversarial examples for semantic segmentation and object detection. In Proceedings of the IEEE International Conference on Computer Vision, 2017.
- [41] Kaidi Xu, Gaoyuan Zhang, Sijia Liu, Quanfu Fan, Mengshu Sun, Hongge Chen, Pin-Yu Chen, Yanzhi Wang, and Xue Lin. Adversarial T-shirt! Evading person detectors in a physical world. In European Conference on Computer Vision, pages 665–681, 2020.
- [42] Yue Zhao, Hong Zhu, Ruigang Liang, Qintao Shen, Shengzhi Zhang, and Kai Chen. Seeing isn't believing: Practical adversarial attack against object detectors. arXiv preprint arXiv:1812.10217, 2018.
- [43] Tao Zhou, Qi Ye, Wenhan Luo, Kaihao Zhang, Zhiguo Shi, and Jiming Chen. F&F attack: Adversarial attack against multiple object trackers by inducing false negatives and false positives. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4573–4583, 2023.
- [44] Wenjun Zhu, Xiaoyu Ji, Yushi Cheng, Shibo Zhang, and Wenyuan Xu. TPatch: A triggered physical adversarial patch. In 32nd USENIX Security Symposium, pages 3035–3052, 2023.
- [45] Xiaopei Zhu, Xiao Li, Jianmin Li, Zheyao Wang, and Xiaolin Hu. Fooling thermal infrared pedestrian detectors in real world using small bulbs. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 3616–3624, 2021.
- [46] Xiaopei Zhu, Zhanhao Hu, Siyuan Huang, Jianmin Li, and Xiaolin Hu. Infrared invisible clothing: Hiding from infrared detectors at multiple angles in real world. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13317–13326, 2022.
- [47] Xiaopei Zhu, Zhanhao Hu, Siyuan Huang, Jianmin Li, Xiaolin Hu, and Zheyao Wang. Hiding from infrared detectors in real world with adversarial clothes. Applied Intelligence, 53(23):29537–29555, 2023.
- [48] Xiaopei Zhu, Xiao Li, Jianmin Li, Zheyao Wang, and Xiaolin Hu. Hiding from thermal imaging pedestrian detectors in the physical world. Neurocomputing, 564:126923, 2024.
- [49] Xiaopei Zhu, Yuqiu Liu, Zhanhao Hu, Jianmin Li, and Xiaolin Hu. Infrared adversarial car stickers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24284–24293, 2024.
- [50] Alon Zolfi, Moshe Kravchik, Yuval Elovici, and Asaf Shabtai. The translucent patch: A physical and universal attack on object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15232–15241, 2021.