Fast and Lightweight Backdoor Detection via Head Random Probing
Pith reviewed 2026-05-20 12:35 UTC · model grok-4.3
The pith
Backdoored neural networks concentrate responses on the target class when random latent probes are sent directly into the prediction head.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HTell generates architecture-aware random latent probes, feeds them directly into the model head, and detects backdoors by analyzing class-wise response statistics; backdoored models exhibit abnormal response concentration on the target class under these probes.
What carries the argument
Head random probing: random latent inputs fed only to the prediction head followed by class-wise response concentration analysis.
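As a rough illustration of the probing idea, the sketch below sends random latent vectors into a single linear head and reports the share of probes won by the most frequently predicted class. It is not the authors' implementation. The linear head, the standard-normal probe distribution, the top-1 share statistic, and the names `concentration_score` and `num_probes` are assumptions made for this example. The "biased" head is a toy built by adding a constant to one class bias, not a real backdoor attack.

```python
import numpy as np

def concentration_score(head_weight, head_bias, num_probes=2048, seed=0):
    """Return (share of probes won by the most-predicted class, that class index).

    head_weight: array of shape (num_classes, latent_dim)
    head_bias:   array of shape (num_classes,)
    """
    rng = np.random.default_rng(seed)
    num_classes, latent_dim = head_weight.shape
    # Assumption: standard-normal probes. The paper's "architecture-aware"
    # probe generator is not described in this page.
    probes = rng.standard_normal((num_probes, latent_dim))
    logits = probes @ head_weight.T + head_bias
    predictions = logits.argmax(axis=1)
    # Class-wise response counts over all probes.
    counts = np.bincount(predictions, minlength=num_classes)
    top_class = int(counts.argmax())
    return float(counts[top_class] / num_probes), top_class

# Toy comparison: a random head versus the same head with one inflated bias.
rng = np.random.default_rng(1)
clean_weight = rng.standard_normal((10, 64))
clean_bias = np.zeros(10)
toy_bias = clean_bias.copy()
toy_bias[3] += 50.0  # hypothetical stand-in for a target-class bias

print("random head:", concentration_score(clean_weight, clean_bias))
print("biased head:", concentration_score(clean_weight, toy_bias))
```

Only a forward pass through the head is needed here, with no data, gradients, or optimization, which is consistent with the low latency the paper reports. Whether real attacks produce a comparable shift is the empirical question the paper addresses.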
Load-bearing premise
Backdoored models exhibit abnormal response concentration on the target class under random latent probes to the prediction head.
What would settle it
A backdoored model that produces evenly distributed class responses instead of target-class concentration when the prediction head receives random latent probes would invalidate the detection rule.
Original abstract
Deep neural networks (DNNs) remain critically vulnerable to backdoor attacks. Existing post-training detectors often require clean or surrogate data, gradients, or iterative trigger reconstruction, leading to high computational costs and limited robustness under practical model-auditing scenarios. In this paper, we propose HTell, a fast and lightweight data-free backdoor detector based on head random probing. Instead of reconstructing diverse trigger patterns, HTell inspects their unified manifestation in the prediction head: backdoored models tend to exhibit abnormal response concentration on the target class under random latent probes. HTell generates architecture-aware random latent probes, feeds them directly into the model head, and detects backdoors by analyzing class-wise response statistics, without accessing real or surrogate data, model gradients, or parameter optimization. We evaluate HTell on a large-scale benchmark containing more than 6,000 backdoored models and over 700 clean models, covering 4 datasets, 14 architectures, and 21 types of backdoor attacks. HTell achieves 99.03% true positive rate and 2.11% false positive rate with only 12.69 ms/model detection latency, reducing the time cost by over 30,000$\times$ compared with representative gradient-based detectors. These results demonstrate that head random probing provides an accurate, robust, and efficient solution for large-scale data-free backdoor model auditing.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes HTell, a data-free backdoor detector that generates architecture-aware random latent probes, feeds them directly into the prediction head, and detects backdoors via class-wise response concentration statistics on the target label. It reports 99.03% true positive rate and 2.11% false positive rate on a benchmark of more than 6,000 backdoored models and over 700 clean models spanning 4 datasets, 14 architectures, and 21 attack types, with 12.69 ms/model latency (over 30,000× faster than gradient-based detectors).
Significance. If the central empirical observation holds—that backdoored models reliably exhibit detectable response concentration on the target class under random head probes across the evaluated attacks and architectures—the approach would represent a substantial practical advance for scalable, data-free model auditing in security-critical settings. The scale of the benchmark and the extreme efficiency are clear strengths that could enable large-scale deployment where existing methods are prohibitive.
major comments (3)
- [§3] §3 (Head Random Probing): The claim that all 21 attack types produce a unified, detectable head-level bias (abnormal concentration on the target class for random non-trigger probes) is load-bearing for the general applicability, yet the manuscript provides no per-attack analysis or mechanistic explanation of why attacks primarily modifying earlier layers must induce this specific head statistic; without it, the 99.03% aggregate TPR may not generalize beyond the benchmark.
- [§4.2] §4.2 (Evaluation): The reported TPR/FPR figures are aggregates only; absent a breakdown table by attack type or architecture showing uniform separation, it remains possible that a subset of the 21 attacks evades the concentration signal, undermining the cross-attack robustness asserted in the abstract.
- [§3.2] §3.2 (Probe Generation and Threshold): The concentration metric and decision threshold are presented as fixed, but no sensitivity study to probe distribution parameters or threshold choice is reported; this leaves open whether the separation is an intrinsic property or partly an artifact of benchmark-specific tuning.
minor comments (2)
- [Figure 3] Figure 3 (response distribution plots): axis labels and legend entries for the clean vs. backdoored histograms could be enlarged for readability.
- The manuscript cites prior detectors but could add a short related-work paragraph explicitly contrasting HTell with other recent data-free or head-only methods.
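The breakdown requested in the major comments can be sketched as a per-group tally of detector decisions. The function name `rates_by_group`, the record layout, and the group labels are hypothetical, and the records below are made-up numbers chosen only to show how an aggregate rate can hide a weak group. They are not results from the paper.

```python
from collections import defaultdict

def rates_by_group(records):
    """Compute TPR and FPR per group.

    records: iterable of (group, is_backdoored, flagged) tuples.
    A rate is None when the group has no models of the relevant kind.
    """
    tallies = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for group, is_backdoored, flagged in records:
        if is_backdoored:
            key = "tp" if flagged else "fn"
        else:
            key = "fp" if flagged else "tn"
        tallies[group][key] += 1

    report = {}
    for group, t in tallies.items():
        positives = t["tp"] + t["fn"]
        negatives = t["fp"] + t["tn"]
        report[group] = {
            "tpr": t["tp"] / positives if positives else None,
            "fpr": t["fp"] / negatives if negatives else None,
        }
    return report

# Synthetic records (illustrative only): two hypothetical attack groups
# plus a clean group.
records = (
    [("attack_a", True, True)] * 9 + [("attack_a", True, False)] * 1
    + [("attack_b", True, True)] * 1 + [("attack_b", True, False)] * 1
    + [("clean", False, True)] * 1 + [("clean", False, False)] * 3
)
for group, rates in rates_by_group(records).items():
    print(group, rates)
```

In this synthetic example the pooled TPR is 10 of 12, while one group sits at 0.5. This is the kind of non-uniformity an aggregate figure cannot rule out.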
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for highlighting the strengths of our large-scale benchmark and the practical efficiency of HTell. We address each major comment below, proposing targeted revisions to improve clarity and robustness where appropriate.
- Referee: [§3] §3 (Head Random Probing): The claim that all 21 attack types produce a unified, detectable head-level bias (abnormal concentration on the target class for random non-trigger probes) is load-bearing for the general applicability, yet the manuscript provides no per-attack analysis or mechanistic explanation of why attacks primarily modifying earlier layers must induce this specific head statistic; without it, the 99.03% aggregate TPR may not generalize beyond the benchmark.
  Authors: We agree that a per-attack breakdown would strengthen the presentation of cross-attack robustness. The manuscript's core contribution is the empirical demonstration that backdoored models exhibit this head-level concentration bias across the 21 evaluated attack types, supported by the aggregate results on over 6,000 models. A comprehensive mechanistic account of how every attack variant (including those primarily affecting earlier layers) propagates to produce this specific head statistic lies beyond the empirical scope of the current work. In the revision we will add a table reporting TPR/FPR per attack type to confirm consistency of the signal.
  Revision: partial
- Referee: [§4.2] §4.2 (Evaluation): The reported TPR/FPR figures are aggregates only; absent a breakdown table by attack type or architecture showing uniform separation, it remains possible that a subset of the 21 attacks evades the concentration signal, undermining the cross-attack robustness asserted in the abstract.
  Authors: We accept this observation. While the aggregate metrics reflect strong overall performance, disaggregated results will better address potential concerns about non-uniform behavior. We will include a breakdown table by attack type and architecture in the revised manuscript.
  Revision: yes
- Referee: [§3.2] §3.2 (Probe Generation and Threshold): The concentration metric and decision threshold are presented as fixed, but no sensitivity study to probe distribution parameters or threshold choice is reported; this leaves open whether the separation is an intrinsic property or partly an artifact of benchmark-specific tuning.
  Authors: The concentration statistic is computed directly from class-wise response distributions under architecture-aware random probes, and the threshold is calibrated on clean-model statistics to control FPR. We will add a sensitivity study in the revision examining variations in probe distribution parameters and threshold values to demonstrate that the separation is robust rather than benchmark-specific.
  Revision: yes
Left unresolved by the rebuttal: a mechanistic explanation of why attacks that primarily modify earlier layers reliably induce the specific head-level response concentration on the target class.
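The authors' response on §3.2 says the threshold is calibrated on clean-model statistics to control FPR. One plausible way to do that, sketched below, is to take a high quantile of concentration scores measured on known-clean models. The quantile rule, the names `calibrate_threshold` and `target_fpr`, and the synthetic scores are assumptions for illustration. The page does not state the paper's actual calibration procedure.

```python
import numpy as np

def calibrate_threshold(clean_scores, target_fpr=0.02):
    """Pick a threshold so that roughly target_fpr of clean scores exceed it."""
    scores = np.asarray(clean_scores, dtype=float)
    return float(np.quantile(scores, 1.0 - target_fpr))

def is_flagged(score, threshold):
    """Flag a model when its concentration score is above the threshold."""
    return score > threshold

# Synthetic clean-model scores (illustrative only, not data from the paper).
rng = np.random.default_rng(0)
clean_scores = rng.uniform(0.10, 0.30, size=500)

threshold = calibrate_threshold(clean_scores, target_fpr=0.02)
empirical_fpr = float(np.mean(clean_scores > threshold))
print("threshold:", threshold, "empirical FPR on calibration set:", empirical_fpr)

# A minimal sensitivity sweep over the target FPR, in the spirit of the
# study the authors promise: watch how the threshold moves.
for fpr in (0.01, 0.02, 0.05):
    print(fpr, calibrate_threshold(clean_scores, target_fpr=fpr))
```

A threshold fitted this way is only as representative as the clean models used to fit it, which is why the referee's request for a sensitivity study across probe parameters and threshold values matters.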
Circularity Check
No significant circularity; detection rests on observable empirical property.
The paper's core claim is that backdoored models exhibit abnormal response concentration on the target class when random latent probes are fed to the prediction head. This property is presented as a unified manifestation observed across attacks, not derived by fitting parameters to the target detection result or by self-referential definition. HTell simply measures class-wise statistics on architecture-aware random probes without data, gradients, or optimization. The large-scale evaluation (6000+ backdoored models, 700+ clean models across 21 attacks and 14 architectures) serves as independent validation rather than a closed loop. No self-citation chains, uniqueness theorems, or ansatz smuggling appear in the derivation; the method is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: backdoored models exhibit abnormal response concentration on the target class under random latent probes.