pith. machine review for the scientific record. sign in

arxiv: 2605.04675 · v1 · submitted 2026-05-06 · 💻 cs.CV

Recognition: 2 theorem links

· Lean Theorem

Physical Adversarial Clothing Evades Visible-Thermal Detectors via Non-Overlapping RGB-T Pattern

Authors on Pith no claims yet

Pith reviewed 2026-05-08 17:46 UTC · model grok-4.3

classification 💻 cs.CV
keywords physical adversarial attacksRGB-T object detectionnon-overlapping patternsadversarial clothingmultimodal fusionvisible-thermal detectorstransferable attacks
0
0 comments X

The pith

Adversarial clothing with non-overlapping visible and thermal patterns evades RGB-T detectors in both digital and physical settings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that clothing can be printed with separate visible and thermal adversarial patterns that do not overlap on the fabric. This design avoids the light reduction that occurs when patterns overlap and is optimized on 3D models of a person to cover all viewing angles. The approach uses a spatial discrete-continuous optimization process to generate the patterns and achieves high success rates against multiple RGB-T detectors that fuse the two modalities differently. A fusion-stage ensemble further allows the same clothing to transfer effectively to detectors it was not trained on. Readers would care because RGB-T detection supports safety-critical systems such as autonomous driving under low light.

Core claim

Non-overlapping RGB-T patterns on adversarial clothing, generated via spatial discrete-continuous optimization on full-view 3D models, produce high attack success rates on visible-thermal detectors across different fusion architectures in both digital and physical worlds, while a fusion-stage ensemble improves transferability to unseen detectors.

What carries the argument

The non-overlapping RGB-T pattern (NORP) that places distinct visible and thermal adversarial materials on separate regions of the clothing, optimized by spatial discrete-continuous optimization (SDCO) on 3D human and clothing models to enable full 360-degree attacks.

Load-bearing premise

The 3D RGB-T models and material simulations accurately capture real-world lighting, thermal emission, and sensor responses across all viewing angles.

What would settle it

A controlled physical test in which the printed adversarial clothing is worn by a moving person under varied outdoor lighting and angles, then the attack success rate is measured against the simulated rates on the same detectors.

Figures

Figures reproduced from arXiv: 2605.04675 by Guanning Zeng, Jun Zhu, Xiaolin Hu, Xiaopei Zhu, Zhanhao Hu.

Figure 1
Figure 1. Figure 1: Demonstration of physical attacks against RGB-T de view at source ↗
Figure 2
Figure 2. Figure 2: The overall pipeline of the proposed method. We jointly optimizes visible and thermal patterns on 3D RGB-T clothing models view at source ↗
Figure 3
Figure 3. Figure 3: Illustration of the SDCO method. In SRD, black pixels view at source ↗
Figure 4
Figure 4. Figure 4: ASRs for different RGB-T detectors at various (a) dis view at source ↗
Figure 5
Figure 5. Figure 5: Visualization of physical RGB-T attacks across diverse scenes. Top row: indoor scenarios. Bottom row: outdoor scenarios. O: view at source ↗
read the original abstract

Visible-thermal (RGB-T) object detection is a crucial technology for applications such as autonomous driving, where multimodal fusion enhances performance in challenging conditions like low light. However, the security of RGB-T detectors, particularly in the physical world, has been largely overlooked. This paper proposes a novel approach to RGB-T physical attacks using adversarial clothing with a non-overlapping RGB-T pattern (NORP). To simulate full-view (0$^{\circ}$--360$^{\circ}$) RGB-T attacks, we construct 3D RGB-T models for human and adversarial clothing. NORP is a new adversarial pattern design using distinct visible and thermal materials without overlap, avoiding the light reduction in overlapping RGB-T patterns (ORP). To optimize the NORP on adversarial clothing, we propose a spatial discrete-continuous optimization (SDCO) method. We systematically evaluated our method on RGB-T detectors with different fusion architectures, demonstrating high attack success rates both in the digital and physical worlds. Additionally, we introduce a fusion-stage ensemble method that enhances the transferability of adversarial attacks across unseen RGB-T detectors with different fusion architectures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The paper proposes non-overlapping RGB-T adversarial patterns (NORP) printed on clothing to evade visible-thermal (RGB-T) object detectors. It constructs 3D RGB-T models of humans and clothing to enable full 0°–360° view simulation, introduces a spatial discrete-continuous optimization (SDCO) procedure to generate the patterns, reports high attack success rates (ASR) against RGB-T detectors with varied fusion architectures in both digital and physical settings, and adds a fusion-stage ensemble to boost transferability to unseen detectors.

Significance. If the physical-world transfer results hold under rigorous validation, the work would be significant for highlighting practical vulnerabilities in multimodal RGB-T detectors used in safety-critical settings such as autonomous driving. The NORP design directly addresses light-reduction problems of overlapping patterns, the 3D full-view modeling is a reasonable attempt to handle viewpoint variation, and the ensemble technique targets a known weakness in adversarial transfer. These elements could inform future defense research if supported by stronger empirical grounding.

major comments (3)
  1. [Physical evaluation] Physical-world evaluation section: high ASR is claimed for the fabricated NORP clothing across viewing angles and fusion architectures, yet no quantitative sim-to-real validation metrics (temperature prediction error, emissivity calibration error, or RGB-T image similarity scores between rendered and real captures) are supplied. Without these, it is impossible to determine whether the reported physical success stems from accurate modeling or from unmodeled factors, directly undermining the central transfer claim.
  2. [Method] SDCO optimization and 3D model construction: the method optimizes patterns on simulated 3D RGB-T meshes, but the manuscript provides no ablation on the impact of material property assumptions (e.g., thermal emissivity values or non-overlapping layer interactions) or on how sensor response functions are approximated. These modeling choices are load-bearing for the assertion that NORP outperforms ORP and generalizes across architectures.
  3. [Evaluation] Transferability experiments: the fusion-stage ensemble is presented as improving ASR on unseen detectors, but the evaluation lacks explicit baseline comparisons (e.g., single-model attacks or standard ensemble methods) and reports no statistical significance or variance across the tested fusion architectures, weakening the transferability conclusion.
minor comments (3)
  1. [Figures] Figure captions for the physical clothing results should explicitly list the detector fusion types, viewing angles, and environmental conditions under which each image was captured.
  2. [Method] The distinction between NORP and ORP would benefit from a short equation or pseudocode block defining the non-overlap constraint and the resulting radiance model.
  3. [Related Work] A small number of recent references on thermal adversarial attacks and multimodal fusion defenses are missing from the related-work section.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We sincerely thank the referee for the constructive and detailed feedback. We address each major comment point by point below, providing clarifications and committing to revisions that strengthen the empirical grounding of our claims without altering the core contributions.

read point-by-point responses
  1. Referee: [Physical evaluation] Physical-world evaluation section: high ASR is claimed for the fabricated NORP clothing across viewing angles and fusion architectures, yet no quantitative sim-to-real validation metrics (temperature prediction error, emissivity calibration error, or RGB-T image similarity scores between rendered and real captures) are supplied. Without these, it is impossible to determine whether the reported physical success stems from accurate modeling or from unmodeled factors, directly undermining the central transfer claim.

    Authors: We thank the referee for this important point. Our physical evaluations used real captures of the printed NORP clothing under controlled indoor and outdoor conditions matching the simulation viewpoints, yielding the reported ASRs. However, we did not include explicit quantitative sim-to-real metrics in the original manuscript. In the revised version we will add: (i) temperature prediction errors computed by comparing simulated thermal maps against contactless thermometer measurements on the fabric surface, (ii) details of emissivity calibration using known reference materials, and (iii) RGB-T image similarity scores (SSIM and LPIPS) between rendered and captured pairs. These additions will directly address the concern and support the modeling fidelity. revision: yes

  2. Referee: [Method] SDCO optimization and 3D model construction: the method optimizes patterns on simulated 3D RGB-T meshes, but the manuscript provides no ablation on the impact of material property assumptions (e.g., thermal emissivity values or non-overlapping layer interactions) or on how sensor response functions are approximated. These modeling choices are load-bearing for the assertion that NORP outperforms ORP and generalizes across architectures.

    Authors: We appreciate the referee drawing attention to the modeling assumptions. The 3D RGB-T meshes use literature-standard emissivity values (0.85 for clothing, 0.95 for skin) and approximate sensor responses via typical RGB and LWIR spectral sensitivity curves; non-overlapping layers are modeled by independent material assignment without cross-layer thermal interaction. While these choices are justified by prior work, we agree an ablation would be valuable. The revised manuscript will include a new ablation subsection (and supplementary figures) varying emissivity by ±0.1, testing alternative sensor response approximations, and measuring resulting ASR changes for both NORP and ORP. This will confirm robustness and strengthen the generalization claims. revision: yes

  3. Referee: [Evaluation] Transferability experiments: the fusion-stage ensemble is presented as improving ASR on unseen detectors, but the evaluation lacks explicit baseline comparisons (e.g., single-model attacks or standard ensemble methods) and reports no statistical significance or variance across the tested fusion architectures, weakening the transferability conclusion.

    Authors: We agree that the transferability evaluation can be made more rigorous. The current results show the fusion-stage ensemble achieving higher ASR on held-out detectors than the individual models used for optimization. To address the gaps, the revision will add: explicit comparisons against single-model attacks and both input-stage and decision-stage ensemble baselines; mean ASR with standard deviation across five independent optimization runs; and statistical significance testing (paired t-tests with p-values) across the different fusion architectures. These updates will provide clearer evidence for the ensemble's benefit. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation uses standard adversarial optimization and empirical evaluation on constructed models

full rationale

The paper's chain consists of constructing 3D RGB-T models, defining NORP as a non-overlapping pattern design, proposing SDCO for optimization, and reporting attack success rates on various fusion architectures in digital and physical settings. None of these steps reduce to self-definition, fitted inputs renamed as predictions, or load-bearing self-citations. The central claims rest on explicit simulation choices and experimental measurements rather than any quantity being equivalent to its inputs by construction. This is the expected non-finding for an empirical attack paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the work relies on standard 3D rendering and optimization techniques from prior adversarial ML literature.

pith-pipeline@v0.9.0 · 5507 in / 1021 out tokens · 30838 ms · 2026-05-08T17:46:56.091485+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

51 extracted references · 8 canonical work pages · 4 internal anchors

  1. [1]

    Cognitive data augmentation for adversarial defense via pixel masking.Pattern Recognition Letters, 146:244– 251, 2021

    Akshay Agarwal, Mayank Vatsa, Richa Singh, and Nalini Ratha. Cognitive data augmentation for adversarial defense via pixel masking.Pattern Recognition Letters, 146:244– 251, 2021. 8

  2. [2]

    Synthesizing robust adversarial examples

    Anish Athalye, Logan Engstrom, Andrew Ilyas, and Kevin Kwok. Synthesizing robust adversarial examples. InPro- ceedings of the 35th International Conference on Machine Learning, ICML, 2018. 4

  3. [3]

    Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation

    Yoshua Bengio, Nicholas L ´eonard, and Aaron Courville. Estimating or propagating gradients through stochastic neurons for conditional computation.arXiv preprint arXiv:1308.3432, 2013. 7

  4. [4]

    Multimodal object detection via probabilistic ensembling

    Yi-Ting Chen, Jinghao Shi, Zelin Ye, Christoph Mertz, Deva Ramanan, and Shu Kong. Multimodal object detection via probabilistic ensembling. InEuropean Conference on Com- puter Vision, pages 139–158. Springer, 2022. 2, 6, 8

  5. [5]

    Spatio-contextual deep network-based multimodal pedestrian detection for au- tonomous driving.IEEE transactions on intelligent trans- portation systems, 23(9):15940–15950, 2022

    Kinjal Dasgupta, Arindam Das, Sudip Das, Ujjwal Bhat- tacharya, and Senthil Yogamani. Spatio-contextual deep network-based multimodal pedestrian detection for au- tonomous driving.IEEE transactions on intelligent trans- portation systems, 23(9):15940–15950, 2022. 1

  6. [6]

    Fusion-mamba for cross-modality object detection.arXiv preprint arXiv:2404.09146, 2024

    Wenhao Dong, Haodong Zhu, Shaohui Lin, Xiaoyan Luo, Yunhang Shen, Xuhui Liu, Juan Zhang, Guodong Guo, and Baochang Zhang. Fusion-mamba for cross-modality object detection.arXiv preprint arXiv:2404.09146, 2024. 2

  7. [7]

    Kai Qin, and Yun Yang

    Ranjie Duan, Xingjun Ma, Yisen Wang, James Bailey, A. Kai Qin, and Yun Yang. Adversarial camouflage: Hiding physical-world attacks with natural styles. InCVPR, pages 997–1005, 2020. 2

  8. [8]

    Enhanced thermal-rgb fusion for robust object detection

    Wassim El Ahmar, Yahya Massoud, Dhanvin Kolhatkar, Hamzah AlGhamdi, Mohammad Alja’Afreh, Riad Ham- moud, and Robert Laganiere. Enhanced thermal-rgb fusion for robust object detection. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 365–374, 2023. 2

  9. [9]

    Explaining and Harnessing Adversarial Examples

    Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples.arXiv preprint arXiv:1412.6572, 2014. 8

  10. [10]

    Countering Adversarial Images using Input Transformations

    Chuan Guo, Mayank Rana, Moustapha Cisse, and Laurens Van Der Maaten. Countering adversarial images using input transformations.arXiv preprint arXiv:1711.00117, 2017. 8

  11. [11]

    Damsdet: Dynamic adaptive multispectral detec- tion transformer with competitive query selection and adap- tive feature fusion

    Junjie Guo, Chenqiang Gao, Fangcen Liu, Deyu Meng, and Xinbo Gao. Damsdet: Dynamic adaptive multispectral detec- tion transformer with competitive query selection and adap- tive feature fusion. InECCV 2024, pages 464–481. Springer,

  12. [12]

    Dmffnet: Dual-mode multi-scale fea- ture fusion-based pedestrian detection method.IEEE Access,

    Ruizhe Hu, Ting Rui, Yan Ouyang, Jinkang Wang, Qunyan Jiang, and Yinan Du. Dmffnet: Dual-mode multi-scale fea- ture fusion-based pedestrian detection method.IEEE Access,

  13. [13]

    Nat- uralistic physical adversarial patch for object detectors

    Yu-Chih-Tuan Hu, Bo-Han Kung, Daniel Stanley Tan, Jun- Cheng Chen, Kai-Lung Hua, and Wen-Huang Cheng. Nat- uralistic physical adversarial patch for object detectors. In ICCV, 2021. 2

  14. [14]

    Adversarial texture for fooling per- son detectors in the physical world

    Zhanhao Hu, Siyuan Huang, Xiaopei Zhu, Fuchun Sun, Bo Zhang, and Xiaolin Hu. Adversarial texture for fooling per- son detectors in the physical world. InProceedings of the IEEE/CVF conference on computer vision and pattern recog- nition, pages 13307–13316, 2022

  15. [15]

    Physically realizable natural-looking clothing textures evade person detectors via 3d modeling

    Zhanhao Hu, Wenda Chu, Xiaopei Zhu, Hui Zhang, Bo Zhang, and Xiaolin Hu. Physically realizable natural-looking clothing textures evade person detectors via 3d modeling. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition (CVPR), pages 16975–16984,

  16. [16]

    Categorical repa- rameterization with gumbel-softmax

    Eric Jang, Shixiang Gu, and Ben Poole. Categorical repa- rameterization with gumbel-softmax. InICLR, 2017. 7

  17. [17]

    Pad: Patch-agnostic defense against adversarial patch at- tacks

    Lihua Jing, Rui Wang, Wenqi Ren, Xin Dong, and Cong Zou. Pad: Patch-agnostic defense against adversarial patch at- tacks. InProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, pages 24472–24481,

  18. [18]

    YOLOv11: An Overview of the Key Architectural Enhancements

    Rahima Khanam and Muhammad Hussain. Yolov11: An overview of the key architectural enhancements.arXiv preprint arXiv:2410.17725, 2024. 2, 6, 8

  19. [19]

    Map: Multispectral adversarial patch to attack person detection

    Taeheon Kim, Hong Joo Lee, and Yong Man Ro. Map: Multispectral adversarial patch to attack person detection. InICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4853–4857. IEEE, 2022. 2, 3, 6

  20. [20]

    Multispec- tral invisible coating: laminated visible-thermal physical at- tack against multispectral object detectors using transparent low-e films

    Taeheon Kim, Youngjoon Yu, and Yong Man Ro. Multispec- tral invisible coating: laminated visible-thermal physical at- tack against multispectral object detectors using transparent low-e films. InProceedings of the AAAI Conference on Arti- ficial Intelligence, pages 1151–1159, 2023. 2, 3, 6

  21. [21]

    Regseg: An end-to-end network for multi- modal rgb-thermal registration and semantic segmentation

    Wenjie Lai, Fanyu Zeng, Xiao Hu, Shaowei He, Ziji Liu, and Yadong Jiang. Regseg: An end-to-end network for multi- modal rgb-thermal registration and semantic segmentation. IEEE Transactions on Image Processing, 2024. 1

  22. [22]

    Multispectral deep neural networks for pedestrian detection.arXiv preprint arXiv:1611.02644, 2016

    Jingjing Liu, Shaoting Zhang, Shu Wang, and Dimitris N Metaxas. Multispectral deep neural networks for pedestrian detection.arXiv preprint arXiv:1611.02644, 2016. 2, 6, 8

  23. [23]

    Target-aware dual ad- versarial learning and a multi-scenario multi-modality bench- mark to fuse infrared and visible for object detection

    Jinyuan Liu, Xin Fan, Zhanbo Huang, Guanyao Wu, Risheng Liu, Wei Zhong, and Zhongxuan Luo. Target-aware dual ad- versarial learning and a multi-scenario multi-modality bench- mark to fuse infrared and visible for object detection. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5802–5811, 2022. 2

  24. [24]

    Improving autonomous vehicle cognitive robustness in extreme weather with deep learning and thermal camera fusion.IEEE Open Journal of Vehicular Technology, 6:426– 441, 2025

    Mehmood Nawaz, Sheheryar Khan, Muhammad Daud, Muhammad Asim, Ghazanfar Ali Anwar, Ali Raza Shahid, Ho Pui Aaron HO, Tom Chan, Daniel Pak Kong, and Wu Yuan. Improving autonomous vehicle cognitive robustness in extreme weather with deep learning and thermal camera fusion.IEEE Open Journal of Vehicular Technology, 6:426– 441, 2025. 1

  25. [25]

    Modality-independent regression and training for improving multispectral pedestrian detection

    Han Ni, Wenna Wang, Shuai Yun, Zixu Zhao, and Xiuwei Zhang. Modality-independent regression and training for improving multispectral pedestrian detection. In2022 7th International Conference on Image, Vision and Computing (ICIVC), pages 75–80. IEEE, 2022. 2

  26. [26]

    Icafusion: Iterative cross-attention guided feature fusion for multispectral object detection.Pattern Recognition, 145:109913, 2024

    Jifeng Shen, Yifei Chen, Yue Liu, Xin Zuo, Heng Fan, and Wankou Yang. Icafusion: Iterative cross-attention guided feature fusion for multispectral object detection.Pattern Recognition, 145:109913, 2024. 2

  27. [27]

    Jedi: Entropy-based localization and removal of adversarial patches

    Bilel Tarchoun, Anouar Ben Khalifa, Mohamed Ali Mahjoub, Nael Abu-Ghazaleh, and Ihsen Alouani. Jedi: Entropy-based localization and removal of adversarial patches. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4087–4095,

  28. [28]

    Fool- ing automated surveillance cameras: Adversarial patches to attack person detection

    Simen Thys, Wiebe Van Ranst, and Toon Goedem ´e. Fool- ing automated surveillance cameras: Adversarial patches to attack person detection. InIEEE Conference on Computer Vi- sion and Pattern Recognition Workshops, CVPR Workshops,

  29. [29]

    Multispectral pedestrian detection using deep fusion convolutional neural networks

    J ¨org Wagner, V olker Fischer, Michael Herman, Sven Behnke, et al. Multispectral pedestrian detection using deep fusion convolutional neural networks. InESANN, pages 509–514,

  30. [30]

    Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

    Chien-Yao Wang, Alexey Bochkovskiy, and Hong- Yuan Mark Liao. Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7464–7475, 2023. 2

  31. [31]

    Yolov9: Learning what you want to learn using pro- grammable gradient information

    Chien-Yao Wang, I-Hau Yeh, and Hong-Yuan Mark Liao. Yolov9: Learning what you want to learn using pro- grammable gradient information. InEuropean conference on computer vision, pages 1–21. Springer, 2024. 2

  32. [32]

    Hotcold block: Fooling thermal infrared detectors with a novel wearable de- sign

    Hui Wei, Zhixiang Wang, Xuemei Jia, Yinqiang Zheng, Hao Tang, Shin’ichi Satoh, and Zheng Wang. Hotcold block: Fooling thermal infrared detectors with a novel wearable de- sign. InProceedings of the AAAI conference on artificial intelligence, pages 15233–15241, 2023. 2

  33. [33]

    Physical adversarial attack meets computer vision: A decade survey.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024

    Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin’ichi Satoh, Luc Van Gool, and Zheng Wang. Physical adversarial attack meets computer vision: A decade survey.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024. 2

  34. [34]

    Adversarial sticker: A stealthy attack method in the physical world.IEEE Trans- actions on Pattern Analysis and Machine Intelligence, 45(3): 2711–2725, 2022

    Xingxing Wei, Ying Guo, and Jie Yu. Adversarial sticker: A stealthy attack method in the physical world.IEEE Trans- actions on Pattern Analysis and Machine Intelligence, 45(3): 2711–2725, 2022. 2, 6

  35. [35]

    Unified adversarial patch for visible-infrared cross-modal attacks in the physical world.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023

    Xingxing Wei, Yao Huang, Yitong Sun, and Jie Yu. Unified adversarial patch for visible-infrared cross-modal attacks in the physical world.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023. 2, 3, 6, 7

  36. [36]

    Physically adversar- ial infrared patches with learnable shapes and locations

    Xingxing Wei, Jie Yu, and Yao Huang. Physically adversar- ial infrared patches with learnable shapes and locations. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 12334–12342, 2023. 2

  37. [37]

    Napguard: Towards detecting naturalistic adver- sarial patches

    Siyang Wu, Jiakai Wang, Jiejie Zhao, Yazhe Wang, and Xian- glong Liu. Napguard: Towards detecting naturalistic adver- sarial patches. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24367– 24376, 2024. 8

  38. [38]

    Making an invisibility cloak: Real world adversarial attacks on object detectors

    Zuxuan Wu, Ser-Nam Lim, Larry S Davis, and Tom Gold- stein. Making an invisibility cloak: Real world adversarial attacks on object detectors. InEuropean Conference on Com- puter Vision, 2020. 2

  39. [39]

    Rgb-thermal based pedestrian detection with single-modal augmentation and roi pooling multiscale fusion

    Jiajun Xiang, Shuiping Gou, Ruimin Li, and Zhihui Zheng. Rgb-thermal based pedestrian detection with single-modal augmentation and roi pooling multiscale fusion. InIGARSS 2022-2022 IEEE International Geoscience and Remote Sens- ing Symposium, pages 3532–3535. IEEE, 2022. 2

  40. [40]

    Ms-detr: Multispectral pedestrian detection transformer with loosely coupled fusion and modality-balanced optimization.IEEE Transactions on Intelligent Transportation Systems, 2024

    Yinghui Xing, Shuo Yang, Song Wang, Shizhou Zhang, Guo- qiang Liang, Xiuwei Zhang, and Yanning Zhang. Ms-detr: Multispectral pedestrian detection transformer with loosely coupled fusion and modality-balanced optimization.IEEE Transactions on Intelligent Transportation Systems, 2024. 2

  41. [41]

    Adversarial t-shirt! evading person detectors in a physi- cal world

    Kaidi Xu, Gaoyuan Zhang, Sijia Liu, Quanfu Fan, Mengshu Sun, Hongge Chen, Pin-Yu Chen, Yanzhi Wang, and Xue Lin. Adversarial t-shirt! evading person detectors in a physi- cal world. InECCV, 2020. 2

  42. [42]

    Feature squeez- ing: Detecting adversarial examples in deep neural networks

    Weilin Xu, David Evans, and Yanjun Qi. Feature squeez- ing: Detecting adversarial examples in deep neural networks. arXiv preprint arXiv:1704.01155, 2017. 8

  43. [43]

    Multispectral fusion for object detection with cyclic fuse-and-refine blocks

    Heng Zhang, Elisa Fromont, S ´ebastien Lefevre, and Bruno Avignon. Multispectral fusion for object detection with cyclic fuse-and-refine blocks. In2020 IEEE International conference on image processing (ICIP), pages 276–280. IEEE, 2020. 5

  44. [44]

    E2e-mfd: Towards end-to-end synchronous multimodal fusion detec- tion.Advances in Neural Information Processing Systems, 37:52296–52322, 2024

    Jiaqing Zhang, Mingxiang Cao, Weiying Xie, Jie Lei, Daixun Li, Wenbo Huang, Yunsong Li, and Xue Yang. E2e-mfd: Towards end-to-end synchronous multimodal fusion detec- tion.Advances in Neural Information Processing Systems, 37:52296–52322, 2024. 2

  45. [45]

    Weakly aligned cross-modal learn- ing for multispectral pedestrian detection

    Lu Zhang, Xiangyu Zhu, Xiangyu Chen, Xu Yang, Zhen Lei, and Zhiyong Liu. Weakly aligned cross-modal learn- ing for multispectral pedestrian detection. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019. 2, 6, 8

  46. [46]

    Metafusion: Infrared and visible image fusion via meta- feature embedding from object detection

    Wenda Zhao, Shigeng Xie, Fan Zhao, You He, and Huchuan Lu. Metafusion: Infrared and visible image fusion via meta- feature embedding from object detection. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13955–13965, 2023. 2

  47. [47]

    Deformable DETR: Deformable Transformers for End-to-End Object Detection

    Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, and Jifeng Dai. Deformable detr: Deformable trans- formers for end-to-end object detection.arXiv preprint arXiv:2010.04159, 2020. 2, 6, 8

  48. [48]

    Fooling thermal infrared pedestrian detectors in real world using small bulbs

    Xiaopei Zhu, Xiao Li, Jianmin Li, Zheyao Wang, and Xiaolin Hu. Fooling thermal infrared pedestrian detectors in real world using small bulbs. InProceedings of the AAAI Con- ference on Artificial Intelligence, pages 3616–3624, 2021. 2, 3, 6

  49. [49]

    Infrared invisible clothing: Hiding from infrared detectors at multiple angles in realworld

    Xiaopei Zhu, Zhanhao Hu, Siyuan Huang, Jianmin Li, and Xiaolin Hu. Infrared invisible clothing: Hiding from infrared detectors at multiple angles in realworld. InCVPR, 2022. 2, 6

  50. [50]

    Hiding from thermal imaging pedestrian detectors in the physical world.Neurocomputing, 564:126923, 2024

    Xiaopei Zhu, Xiao Li, Jianmin Li, Zheyao Wang, and Xiaolin Hu. Hiding from thermal imaging pedestrian detectors in the physical world.Neurocomputing, 564:126923, 2024. 2, 3

  51. [51]

    Infrared adversarial car stickers

    Xiaopei Zhu, Yuqiu Liu, Zhanhao Hu, Jianmin Li, and Xi- aolin Hu. Infrared adversarial car stickers. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24284–24293, 2024. 2, 3, 6