Lightweight and Fast Backdoor Model Detection

Chunwei Tian; Daoqiang Zhang; Jiajia Liu; Jing Fang; Qi Zhu; Xuewen Zhang; Yinbo Yu

arxiv: 2605.18907 · v1 · pith:RACEEXFFnew · submitted 2026-05-17 · 💻 cs.CR · cs.AI

Yinbo Yu, Jing Fang, Xuewen Zhang, Chunwei Tian, Qi Zhu, Daoqiang Zhang, Jiajia Liu

Pith reviewed 2026-05-20 12:40 UTC · model grok-4.3

keywords backdoor detection · neural network security · parameter anomaly · trojan clue · model inspection · deep learning defense · static analysis

The pith

DFBScanner detects backdoors by scoring anomalous parameter updates in the final classification layer.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that backdoor attacks leave a consistent signature: anomalous updates to the parameters of a neural network's final classification layer. Instead of hunting for particular triggers or relying on clean reference data, DFBScanner builds several statistical indicators of these anomalies and merges them into a single Trojan clue score. Maximum anomaly scoring on this clue then flags backdoored models. A reader would care because the reported detection time averages 1 ms per model, with a 97.17% true-positive rate and a 0.95% false-positive rate on a benchmark of more than 5,000 models covering 12 architectures and 20 trigger types.

Core claim

Backdoor-induced feature perturbations produce distinctive and anomalous parameter updates in the final classification layer. By constructing multiple anomaly indicators from these parameters and combining them strategically into a Trojan clue, DFBScanner detects backdoors through maximum anomaly scoring. This yields an attack-agnostic detector that works without clean samples or trigger knowledge.

What carries the argument

The Trojan clue, a composite of multiple anomaly indicators computed on the final-layer parameters, which produces a maximum anomaly score used for backdoor classification.
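A minimal sketch of how such a score could be computed, under stated assumptions. The abstract does not define the indicators, so the three used here (per-class mean weight, per-class maximum weight, and per-class bias), the median/MAD normalization, and the names `robust_z` and `trojan_clue_score` are all hypothetical stand-ins. The sketch only illustrates the shape of the idea: score each class's final-layer parameters against the other classes and keep the maximum.

```python
from statistics import mean, median


def robust_z(values):
    """Robust z-scores: distance from the median in units of scaled MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    scale = 1.4826 * mad or 1e-12  # fall back to a tiny scale when MAD is 0
    return [abs(v - med) / scale for v in values]


def trojan_clue_score(weights, biases):
    """weights: one list of floats per class (rows of the final layer).
    biases: one float per class.
    Returns (maximum anomaly score, index of the most anomalous class)."""
    # Hypothetical indicators; the paper's actual indicators are not given here.
    indicators = [
        [mean(row) for row in weights],  # mean weight per class
        [max(row) for row in weights],   # largest weight per class
        list(biases),                    # bias per class
    ]
    z = [robust_z(ind) for ind in indicators]
    # Equal-weight combination of the normalized indicators (an assumption).
    per_class = [mean(col) for col in zip(*z)]
    top = max(range(len(per_class)), key=per_class.__getitem__)
    return per_class[top], top


# Toy final layer with 5 classes and 4 input features (illustrative values).
weights = [
    [0.10, -0.20, 0.05, 0.00],
    [0.12, -0.18, 0.02, 0.01],
    [0.90, 0.80, 0.70, 0.95],   # inflated row, mimicking a poisoned class
    [0.08, -0.22, 0.06, -0.01],
    [0.11, -0.19, 0.04, 0.02],
]
biases = [0.01, 0.00, -0.30, 0.02, 0.01]

score, suspect = trojan_clue_score(weights, biases)
print(f"max anomaly score {score:.1f} at class {suspect}")
```

On this toy layer the third class (index 2) stands out on all three indicators, so it carries the maximum score. A real detector would still need a decision threshold, which this sketch leaves out.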

If this is right

Detection time drops to one millisecond per model, enabling real-time scanning of large model repositories.
The same indicators work across twenty trigger types, three injection methods, and both all-to-one and all-to-all attack strategies.
No clean reference samples or prior trigger knowledge are required for operation.
Performance holds over twelve network architectures and four datasets in a benchmark of more than five thousand models.
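The first consequence is simple arithmetic, sketched below. The 1 ms figure and the roughly 5,000-model scale come from the paper; the one-minute-per-model comparison detector is a hypothetical baseline, chosen only because the abstract says existing detectors take minutes or hours.

```python
def scan_time_seconds(num_models, per_model_ms):
    """Total time to scan a repository sequentially, in seconds."""
    return num_models * per_model_ms / 1000.0


repo_size = 5_000                               # about the size of the paper's benchmark
fast = scan_time_seconds(repo_size, 1)          # 1 ms per model, as reported
slow = scan_time_seconds(repo_size, 60_000)     # hypothetical 1-minute-per-model detector

print(f"{fast:.0f} s versus {slow / 3600:.1f} h")
```

At these assumed rates the same repository takes seconds rather than days, which is the gap that makes scanning at upload time plausible.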

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Model marketplaces could run this check automatically on every uploaded checkpoint before distribution.
If final-layer anomalies prove reliable, similar lightweight inspections might extend to detecting other forms of model tampering such as weight poisoning or adversarial fine-tuning.
The observation suggests that many attacks converge on the same last-layer vulnerability, which could guide future defense design toward protecting or monitoring that layer specifically.

Load-bearing premise

Backdoor attacks of different types and injection methods all produce parameter anomalies in the final layer that remain separable from the parameter distributions of clean models.

What would settle it

A backdoor attack that succeeds while leaving the final-layer parameters statistically indistinguishable from those of clean models, or a set of clean models that consistently receive high anomaly scores under the proposed indicators.
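One way to make "statistically indistinguishable" concrete is the area under the ROC curve (AUROC) of the anomaly scores: 1.0 means every backdoored model outscores every clean one, while 0.5 means the scores carry no signal. This is an editorial suggestion, not a test the paper describes, and the score lists below are made-up illustrative values.

```python
def auroc(clean_scores, backdoor_scores):
    """Probability that a random backdoored model scores higher than a random
    clean model, counting ties as one half (the Mann-Whitney U statistic)."""
    wins = 0.0
    for b in backdoor_scores:
        for c in clean_scores:
            if b > c:
                wins += 1.0
            elif b == c:
                wins += 0.5
    return wins / (len(backdoor_scores) * len(clean_scores))


clean = [1.0, 1.2, 0.9, 1.5]          # hypothetical clean-model scores
separable = [4.0, 3.2, 5.1]           # hypothetical attack the detector catches
evasive = [1.1, 0.95, 1.3]            # hypothetical attack that hides in the clean range

print(auroc(clean, separable), auroc(clean, evasive))
```

An attack family whose scores sit near 0.5 on this measure would be the counterexample described above. So would a clean population whose scores overlap heavily with those of backdoored models.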

Figures

Figures reproduced from arXiv: 2605.18907 by Chunwei Tian, Daoqiang Zhang, Jiajia Liu, Jing Fang, Qi Zhu, Xuewen Zhang, Yinbo Yu.

**Figure 1.** t-SNE visualization of backdoor (in red dots) and clean (in blue dots) latent features and violin plot of final-layer weights of different classes (including the poison class and other clean classes) under different attacks. The violin plot demonstrates the probability density of the weight distribution through kernel density estimation. All models are trained on CIFAR-10, and the poison label is 4. R18=Re…

**Figure 2.** Bias value of the clean and backdoor models' final layer.

**Figure 3.** Different trigger patches.

Body text from the paper captured with this caption: "… and Wanet. In total, we construct 382 clean models and 4,320 all-to-one backdoor models as the full benchmark. Besides, we further train 120 all-to-all backdoor models on CIFAR10 and GTSRB using 6 architectures and 10 attacks. For each attack, we follow a loop permutation to generate target and source class pairs, i.e., each class k ∈ K is a poison target class with a source clas…"

**Figure 4.** Backdoor detection accuracy of parameter anomaly indicators using maximum anomaly indices.

**Figure 5.** Backdoor detection F1-score curve with different numbers…

**Figure 6.** Cosine similarity of the anomaly score between benign and backdoor models using all indicators and selected indicators. CNN6-…

Abstract

Deep neural networks (DNNs), despite their remarkable performance, are highly vulnerable to backdoor attacks. Existing defenses mainly rely on activation anomaly analysis or trigger reverse engineering and often require clean samples or prior knowledge of trigger patterns, resulting in limited efficacy, practicability, and generalizability. More critically, while advanced attacks can implement backdoor implantation in milliseconds, current detection approaches typically demand minutes or even hours. To this end, we propose DFBScanner, a lightweight static parameter inspection framework for fast backdoor scanning. DFBScanner leverages our key observation that backdoor-induced feature perturbations can lead to distinctive and anomalous parameter updates in the final classification layer. Hence, we shift our detection focus from recognizing diverse and attack-specific trigger patterns targeted by prior work, to identifying the unified backdoor manifestation within the final layer, thereby enabling efficient and attack-agnostic detection. Specifically, by constructing and strategically combining multiple anomaly indicators of the final-layer parameters into a Trojan clue, DFBScanner detects backdoors through maximum anomaly scoring. DFBScanner is evaluated on a large-scale backdoor benchmark, including over 5,000 backdoor models trained on 4 datasets, 12 network architectures, 20 types of backdoor triggers, 2 attack strategies (all-to-one and all-to-all), and 3 backdoor injection methods (data poisoning, training pipeline manipulation, and bit-flips). Numerical results show that DFBScanner achieves a 97.17% true-positive rate, 0.95% false-positive rate, and an average detection time of only 1 ms per model, significantly outperforming prior methods.
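For readers checking the headline numbers, the two rates are defined as follows: the true-positive rate is the share of backdoored models that get flagged, and the false-positive rate is the share of clean models that get flagged. The sketch below uses toy labels and flags, not data from the paper, and the function name `detection_rates` is illustrative.

```python
def detection_rates(labels, flags):
    """labels: True if a model is truly backdoored.
    flags: True if the detector flagged that model.
    Returns (true-positive rate, false-positive rate)."""
    pairs = list(zip(labels, flags))
    tp = sum(1 for y, f in pairs if y and f)
    fn = sum(1 for y, f in pairs if y and not f)
    fp = sum(1 for y, f in pairs if not y and f)
    tn = sum(1 for y, f in pairs if not y and not f)
    return tp / (tp + fn), fp / (fp + tn)


# Toy example: 4 backdoored models followed by 4 clean models.
labels = [True, True, True, True, False, False, False, False]
flags = [True, True, True, False, False, False, True, False]

tpr, fpr = detection_rates(labels, flags)
print(f"TPR={tpr:.2%}  FPR={fpr:.2%}")
```

Both rates depend on where the decision threshold sits, so a reported pair such as 97.17% and 0.95% describes one operating point rather than the whole detector.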

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes DFBScanner, a lightweight static parameter inspection framework for detecting backdoored DNN models. It is based on the observation that backdoor-induced feature perturbations lead to distinctive anomalous parameter updates in the final classification layer. The method constructs and combines multiple anomaly indicators into a 'Trojan clue' and detects backdoors via maximum anomaly scoring. Evaluation is performed on a large-scale benchmark of over 5,000 models across 4 datasets, 12 network architectures, 20 trigger types, 2 attack strategies, and 3 injection methods (data poisoning, training pipeline manipulation, and bit-flips), reporting 97.17% true-positive rate, 0.95% false-positive rate, and 1 ms average detection time per model.

Significance. If the central observation and detection performance hold under scrutiny, this would represent a meaningful contribution to backdoor defense by enabling fast, attack-agnostic, and sample-free detection that scales to large model repositories, addressing the speed and generality limitations of activation-analysis or trigger-reversal approaches.

major comments (3)

[§3.1] The manuscript refers to 'constructing and strategically combining multiple anomaly indicators' of the final-layer parameters but provides no explicit mathematical definitions, normalization steps, or formulas for these indicators (e.g., no equations for statistical measures or weight perturbation quantification). This detail is load-bearing for reproducing the claimed 0.95% FPR and verifying that the indicators do not overlap with clean-model variation across the 12 architectures.
[§4.2, Table 3] While the evaluation includes bit-flip attacks and reports high TPR, there is no per-injection-method breakdown or ablation showing that anomalous updates are concentrated in the final classification layer rather than distributed or sparse across earlier layers. Bit-flip attacks can target arbitrary weights, so the 'unified backdoor manifestation within the final layer' claim requires explicit evidence that the indicators capture these cases without significant clean-model overlap.
[§3.3] The threshold selection and weighting for the maximum anomaly scoring are not described (e.g., whether thresholds are fixed, cross-validated, or architecture-specific). Without this, it is difficult to assess the robustness of the reported metrics on the diverse benchmark of 5,000+ models.

minor comments (2)

[§3.2] The notation for the 'Trojan clue' combination step could be formalized with a short equation or pseudocode for clarity.
Figure 2 (or equivalent) showing example parameter distributions would benefit from explicit comparison between clean and backdoored models for each injection method.
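One possible formalization of the combination step, offered as a hypothetical sketch rather than the authors' notation. Let the final layer have K classes, let f_{m,k} denote the value of indicator m for class k (m = 1, …, M), let z_{m,k} be its normalized form, c_k the per-class Trojan clue, and S the model-level score. The median/MAD normalization and the equal-weight average are assumptions.

```latex
z_{m,k} = \frac{\lvert f_{m,k} - \operatorname{median}_{j} f_{m,j} \rvert}
               {1.4826 \cdot \operatorname{MAD}_{j} f_{m,j}},
\qquad
c_k = \frac{1}{M} \sum_{m=1}^{M} z_{m,k},
\qquad
S = \max_{1 \le k \le K} c_k .
```

Under this reading a model would be flagged when S exceeds a threshold τ, whose selection is the subject of the §3.3 comment.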

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. These have highlighted important areas where additional clarity and evidence will strengthen the manuscript. We address each major comment below and will revise the paper accordingly.

Referee: [§3.1] The manuscript refers to 'constructing and strategically combining multiple anomaly indicators' of the final-layer parameters but provides no explicit mathematical definitions, normalization steps, or formulas for these indicators (e.g., no equations for statistical measures or weight perturbation quantification). This detail is load-bearing for reproducing the claimed 0.95% FPR and verifying that the indicators do not overlap with clean-model variation across the 12 architectures.

Authors: We agree that explicit mathematical definitions are required for reproducibility. In the revised manuscript we will expand §3.1 with precise formulas for each anomaly indicator, including the statistical measures (mean, standard deviation, and deviation from clean-model baselines), the normalization procedure applied to each, and the exact combination rule used to form the Trojan clue score. These additions will allow direct verification that the indicators exhibit limited overlap with clean-model parameter variation across the 12 architectures. revision: yes
Referee: [§4.2, Table 3] While the evaluation includes bit-flip attacks and reports high TPR, there is no per-injection-method breakdown or ablation showing that anomalous updates are concentrated in the final classification layer rather than distributed or sparse across earlier layers. Bit-flip attacks can target arbitrary weights, so the 'unified backdoor manifestation within the final layer' claim requires explicit evidence that the indicators capture these cases without significant clean-model overlap.

Authors: We acknowledge that a per-injection-method breakdown and layer-specific ablation would provide stronger support for the final-layer claim, especially for bit-flip attacks. In the revision we will add a new table in §4.2 reporting TPR/FPR separately for data poisoning, training-pipeline manipulation, and bit-flips. We will also include an ablation that quantifies the concentration of anomalous updates in the final layer versus earlier layers for the bit-flip subset, together with direct comparison against clean-model score distributions to confirm limited overlap. revision: yes
Referee: [§3.3] The threshold selection and weighting for the maximum anomaly scoring are not described (e.g., whether thresholds are fixed, cross-validated, or architecture-specific). Without this, it is difficult to assess the robustness of the reported metrics on the diverse benchmark of 5,000+ models.

Authors: We thank the referee for noting this omission. The thresholds are fixed values derived from the 99th-percentile anomaly scores of clean models and are applied uniformly across all architectures; the indicators receive equal weight in the maximum anomaly score. In the revised §3.3 we will explicitly state this selection procedure, report the exact percentile used, and add a short sensitivity analysis demonstrating that the reported metrics remain stable across the 5,000-model benchmark. revision: yes
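The thresholding rule described in this simulated response can be sketched as follows. The nearest-rank percentile method and the function names are assumptions, the clean-model scores are synthetic, and the paper itself does not confirm this procedure.

```python
import math


def percentile_threshold(clean_scores, pct=99.0):
    """Nearest-rank percentile of the clean-model anomaly scores."""
    ordered = sorted(clean_scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def is_backdoored(score, threshold):
    """Flag a model whose maximum anomaly score exceeds the fixed threshold."""
    return score > threshold


# Synthetic calibration set: 200 clean-model scores from 0.01 to 2.00.
clean_scores = [i / 100 for i in range(1, 201)]
threshold = percentile_threshold(clean_scores)

false_alarms = sum(is_backdoored(s, threshold) for s in clean_scores)
print(threshold, false_alarms)
```

By construction, a 99th-percentile threshold flags about 1% of the calibration clean models (2 of 200 here). Note also that deriving the threshold this way requires a pool of clean models for calibration, even though no clean input samples are needed at scan time.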

Circularity Check

0 steps flagged

No significant circularity; detection method is empirically validated on independent benchmark

The paper defines DFBScanner by constructing anomaly indicators on final-layer parameters and combining them into a Trojan clue for maximum anomaly scoring. This is presented as a shift to unified backdoor manifestation, then evaluated directly on a large-scale benchmark of over 5,000 models across multiple datasets, architectures, triggers, and injection methods. No equations, self-citations, or steps are shown that reduce the anomaly score or detection output to a fitted parameter or input defined by the method itself. The performance metrics (TPR, FPR, speed) are reported as empirical results rather than tautological consequences of the construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework rests primarily on the domain assumption that backdoors produce detectable final-layer anomalies and on any scoring thresholds used to combine indicators.

free parameters (1)

anomaly scoring thresholds or weights
Parameters likely needed to combine multiple indicators into the Trojan clue and set the maximum anomaly decision boundary, though not detailed in the abstract.

axioms (1)

domain assumption: Backdoor-induced feature perturbations lead to distinctive and anomalous parameter updates in the final classification layer.
This is the central observation stated in the abstract upon which the entire detection approach is built.

Reference graph

Works this paper leans on

66 extracted references · 66 canonical work pages · 4 internal anchors

[1]

Badmerging: Backdoor attacks against model merging,

J. Zhang, J. Chi, Z. Li, K. Cai, Y . Zhang, and Y . Tian, “Badmerging: Backdoor attacks against model merging,” inACM CCS, 2024, pp. 4450– 4464

work page 2024
[2]

BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain

T. Gu, B. Dolan-Gavitt, and S. Garg, “Badnets: Identifying vulnera- bilities in the machine learning model supply chain,”arXiv preprint arXiv:1708.06733, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[3]

Spectral signatures in backdoor attacks,

B. Tran, J. Li, and A. Madry, “Spectral signatures in backdoor attacks,” NeurIPS, vol. 31, 2018

work page 2018
[4]

Detecting backdoor attacks on deep neural networks by activation clustering,

B. Chen, W. Carvalho, N. Baracaldo, H. Ludwig, B. Edwards, T. Lee, I. Molloy, and B. Srivastava, “Detecting backdoor attacks on deep neural networks by activation clustering,” inAAAI Workshop, 2019

work page 2019
[5]

Poison as a cure: Detecting & neutralizing variable-sized backdoor attacks in deep neural networks,

A. Chan and Y .-S. Ong, “Poison as a cure: Detecting & neutralizing variable-sized backdoor attacks in deep neural networks,”arXiv preprint arXiv:1911.08040, 2019

work page arXiv 1911
[6]

Detecting ai trojans using meta neural analysis,

X. Xu, Q. Wang, H. Li, N. Borisov, C. A. Gunter, and B. Li, “Detecting ai trojans using meta neural analysis,” inIEEE S&P, 2021, pp. 103–120

work page 2021
[7]

Deepinspect: A black-box trojan detection and mitigation framework for deep neural networks

H. Chen, C. Fu, J. Zhao, and F. Koushanfar, “Deepinspect: A black-box trojan detection and mitigation framework for deep neural networks.” in IJCAI, vol. 2, no. 5, 2019, p. 8

work page 2019
[8]

Neural cleanse: Identifying and mitigating backdoor attacks in neural networks,

B. Wang, Y . Yao, S. Shan, H. Li, B. Viswanath, H. Zheng, and B. Y . Zhao, “Neural cleanse: Identifying and mitigating backdoor attacks in neural networks,” inIEEE S&P, 2019, pp. 707–723

work page 2019
[9]

Defending neural backdoors via generative distribution modeling,

X. Qiao, Y . Yang, and H. Li, “Defending neural backdoors via generative distribution modeling,” inNeurIPS, vol. 32, 2019

work page 2019
[10]

Rethinking the reverse- engineering of trojan triggers,

Z. Wang, K. Mei, H. Ding, J. Zhai, and S. Ma, “Rethinking the reverse- engineering of trojan triggers,” vol. 35, pp. 9738–9753, 2022

work page 2022
[11]

Need for speed: Taming backdoor attacks with speed and precision,

Z. Ma, Y . Yang, Y . Liu, T. Yang, X. Liu, T. Li, and Z. Qin, “Need for speed: Taming backdoor attacks with speed and precision,” inIEEE S&P, 2024, pp. 1217–1235

work page 2024
[12]

Practical detection of trojan neural networks: Data-limited and data- free cases,

R. Wang, G. Zhang, S. Liu, P.-Y . Chen, J. Xiong, and M. Wang, “Practical detection of trojan neural networks: Data-limited and data- free cases,” inECCV, 2020, pp. 222–238

work page 2020
[13]

Freeea- gle: Detecting complex neural trojans in data-free cases,

C. Fu, X. Zhang, S. Ji, T. Wang, P. Lin, Y . Feng, and J. Yin, “Freeea- gle: Detecting complex neural trojans in data-free cases,” inUSENIX Security, 2023, pp. 6399–6416

work page 2023
[14]

Data-free backdoor model inspection: Masking and reverse engineering loops for feature counting,

Q. Zhou, W. Luo, Z. Ye, and Y . Tang, “Data-free backdoor model inspection: Masking and reverse engineering loops for feature counting,” inIJCNN. IEEE, 2024, pp. 1–9

work page 2024
[15]

Mm-bd: Post-training detection of backdoor attacks with arbitrary backdoor pattern types using a maximum margin statistic,

H. Wang, Z. Xiang, D. J. Miller, and G. Kesidis, “Mm-bd: Post-training detection of backdoor attacks with arbitrary backdoor pattern types using a maximum margin statistic,” inIEEE S&P, 2024, pp. 1994–2012. 12

work page 2024
[16]

Barbie: Robust backdoor detection based on latent separability,

H. Zhang, Y . Bai, Y . Chen, Z. Ma, and W. Xu, “Barbie: Robust backdoor detection based on latent separability,” inNDSS, 2025

work page 2025
[17]

Peftguard: detecting backdoor attacks against parameter-efficient fine- tuning,

Z. Sun, T. Cong, Y . Liu, C. Lin, X. He, R. Chen, X. Han, and X. Huang, “Peftguard: detecting backdoor attacks against parameter-efficient fine- tuning,” inIEEE S&P, 2025, pp. 1713–1731

work page 2025
[18]

Data free backdoor attacks,

B. Cao, J. Jia, C. Hu, W. Guo, Z. Xiang, J. Chen, B. Li, and D. Song, “Data free backdoor attacks,”NeurIPS, vol. 37, pp. 23 881–23 911, 2024

work page 2024
[19]

Tbt: Targeted neural network attack with bit trojan,

A. S. Rakin, Z. He, and D. Fan, “Tbt: Targeted neural network attack with bit trojan,” inCPVR, 2020, pp. 13 198–13 207

work page 2020
[20]

Live trojan attacks on deep neural networks,

R. Costales, C. Mao, R. Norwitz, B. Kim, and J. Yang, “Live trojan attacks on deep neural networks,” inCVPR, 2020, pp. 796–797

work page 2020
[21]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gellyet al., “An image is worth 16x16 words: Transformers for image recognition at scale,”arXiv preprint arXiv:2010.11929, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2010
[22]

Blind backdoors in deep learning models,

E. Bagdasaryan and V . Shmatikov, “Blind backdoors in deep learning models,” inUSENIX Security, 2021, pp. 1505–1521

work page 2021
[23]

Hardly perceptible trojan attack against neural networks with bit flips,

J. Bai, K. Gao, D. Gong, S.-T. Xia, Z. Li, and W. Liu, “Hardly perceptible trojan attack against neural networks with bit flips,” in ECCV. Springer, 2022, pp. 104–121

work page 2022
[24]

Trojaning attack on neural networks,

Y . Liu, S. Ma, Y . Aafer, W.-C. Lee, J. Zhai, W. Wang, and X. Zhang, “Trojaning attack on neural networks,” inNDSS, 2018

work page 2018
[25]

Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning

X. Chen, C. Liu, B. Li, K. Lu, and D. Song, “Targeted backdoor attacks on deep learning systems using data poisoning,”arXiv preprint arXiv:1712.05526, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[26]

Lira: Learnable, imperceptible and robust backdoor attacks,

K. Doan, Y . Lao, W. Zhao, and P. Li, “Lira: Learnable, imperceptible and robust backdoor attacks,” inICCV, 2021, pp. 11 966–11 976

work page 2021
[27]

Invisible backdoor attack with sample-specific triggers,

Y . Li, Y . Li, B. Wu, L. Li, R. He, and S. Lyu, “Invisible backdoor attack with sample-specific triggers,” inICCV, 2021, pp. 16 463–16 472

work page 2021
[28]

Rethinking the backdoor attacks’ triggers: A frequency perspective,

Y . Zeng, W. Park, Z. M. Mao, and R. Jia, “Rethinking the backdoor attacks’ triggers: A frequency perspective,” inICCV, 2021, pp. 16 473– 16 481

work page 2021
[29]

Revisiting the assumption of latent separability for backdoor defenses,

X. Qi, T. Xie, Y . Li, S. Mahloujifar, and P. Mittal, “Revisiting the assumption of latent separability for backdoor defenses,” inICLR, 2023

work page 2023
[30]

Distribution preserving backdoor attack in self-supervised learning,

G. Tao, Z. Wang, S. Feng, G. Shen, S. Ma, and X. Zhang, “Distribution preserving backdoor attack in self-supervised learning,” inIEEE S&P, 2024, pp. 2029–2047

work page 2024
[31]

Wanet-imperceptible warping-based backdoor attack,

T. A. Nguyen and A. T. Tran, “Wanet-imperceptible warping-based backdoor attack,” inICLR, 2020

work page 2020
[32]

Practical attacks on deep neural networks by memory trojaning,

X. Hu, Y . Zhao, L. Deng, L. Liang, P. Zuo, J. Ye, Y . Lin, and Y . Xie, “Practical attacks on deep neural networks by memory trojaning,”IEEE TCAD, vol. 40, no. 6, pp. 1230–1243, 2020

work page 2020
[33]

Bit-flip attack: Crushing neural network with progressive bit search,

A. S. Rakin, Z. He, and D. Fan, “Bit-flip attack: Crushing neural network with progressive bit search,” inICCV, 2019, pp. 1211–1220

work page 2019
[34]

Input-aware dynamic backdoor attack,

T. A. Nguyen and A. Tran, “Input-aware dynamic backdoor attack,” NeurIPS, vol. 33, pp. 3454–3464, 2020

work page 2020
[35]

Bppattack: Stealthy and efficient trojan attacks against deep neural networks via image quantization and con- trastive adversarial learning,

Z. Wang, J. Zhai, and S. Ma, “Bppattack: Stealthy and efficient trojan attacks against deep neural networks via image quantization and con- trastive adversarial learning,” inCVPR, 2022, pp. 15 074–15 084

work page 2022
[36]

Poison ink: Robust and invisible backdoor attack,

J. Zhang, C. Dongdong, Q. Huang, J. Liao, W. Zhang, H. Feng, G. Hua, and N. Yu, “Poison ink: Robust and invisible backdoor attack,”IEEE TIP, vol. 31, pp. 5691–5705, 2022

work page 2022
[37]

Backdoor defense via decoupling the training process,

K. Huang, Y . Li, B. Wu, Z. Qin, and K. Ren, “Backdoor defense via decoupling the training process,”arXiv preprint arXiv:2202.03423, 2022

work page arXiv 2022
[38]

Strip: A defence against trojan attacks on deep neural networks,

Y . Gao, C. Xu, D. Wang, S. Chen, D. C. Ranasinghe, and S. Nepal, “Strip: A defence against trojan attacks on deep neural networks,” in ACSAC, 2019, pp. 113–125

work page 2019
[39]

Februus: Input purification defense against trojan attacks on deep neural network systems,

B. G. Doan, E. Abbasnejad, and D. C. Ranasinghe, “Februus: Input purification defense against trojan attacks on deep neural network systems,” inACSAC, 2020, pp. 897–912

work page 2020
[40]

Scale-up: An efficient black-box input-level backdoor detection via analyzing scaled prediction consistency,

J. Guo, Y . Li, X. Chen, H. Guo, L. Sun, and C. Liu, “Scale-up: An efficient black-box input-level backdoor detection via analyzing scaled prediction consistency,” inICLR, 2023

work page 2023
[41]

Detection of backdoors in trained classifiers without access to the training set,

Z. Xiang, D. J. Miller, and G. Kesidis, “Detection of backdoors in trained classifiers without access to the training set,”IEEE TNNLS, vol. 33, no. 3, pp. 1177–1191, 2020

work page 2020
[42]

Abs: Scanning neural networks for back-doors by artificial brain stimulation,

Y . Liu, W.-C. Lee, G. Tao, S. Ma, Y . Aafer, and X. Zhang, “Abs: Scanning neural networks for back-doors by artificial brain stimulation,” inACM CCS, 2019, pp. 1265–1282

work page 2019
[43]

Demon in the variant: Statistical analysis of dnns for robust backdoor contamination detection,

D. Tang, X. Wang, H. Tang, and K. Zhang, “Demon in the variant: Statistical analysis of dnns for robust backdoor contamination detection,” inUSENIX Security, 2021, pp. 1541–1558

work page 2021
[44]

Randomized channel shuffling: minimal-overhead backdoor attack detection without clean datasets,

R. Cai, Z. Zhang, T. Chen, X. Chen, and Z. Wang, “Randomized channel shuffling: minimal-overhead backdoor attack detection without clean datasets,” inNeurIPS, 2022, pp. 33 876–33 889

work page 2022
[45]

Universal litmus patterns: Revealing backdoor attacks in cnns,

S. Kolouri, A. Saha, H. Pirsiavash, and H. Hoffmann, “Universal litmus patterns: Revealing backdoor attacks in cnns,” inCVPR, 2020, pp. 301– 310

work page 2020
[46]

Trojan signatures in dnn weights,

G. Fields, M. Samragh, M. Javaheripi, F. Koushanfar, and T. Javidi, “Trojan signatures in dnn weights,” inICCV, 2021, pp. 12–20

work page 2021
[47]

Deephammer: Depleting the intelli- gence of deep neural networks through targeted chain of bit flips,

F. Yao, A. S. Rakin, and D. Fan, “Deephammer: Depleting the intelli- gence of deep neural networks through targeted chain of bit flips,” in USENIX Security, 2020, pp. 1463–1480

work page 2020
[48]

Hugging face – the ai community building the future,

H. Face, “Hugging face – the ai community building the future,” https: //huggingface.co

work page
[49]

Proflip: Targeted trojan attack with progressive bit flips,

H. Chen, C. Fu, J. Zhao, and F. Koushanfar, “Proflip: Targeted trojan attack with progressive bit flips,” inICCV, 2021, pp. 7718–7727

work page 2021
[50]

Contrastive neuron pruning for backdoor defense,

Y . Feng, B. Ma, D. Liu, Y . Zhang, W. Cai, and Y . Xia, “Contrastive neuron pruning for backdoor defense,”IEEE TIP, vol. 34, pp. 1234– 1245, 2025

work page 2025
[51]

Evidential deep learning to quantify classification uncertainty,

M. Sensoy, L. Kaplan, and M. Kandemir, “Evidential deep learning to quantify classification uncertainty,”NeurIPS, vol. 31, 2018

work page 2018
[52]

Gradient-based learning applied to document recognition,

Y . LeCun, L. Bottou, Y . Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,”Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998

work page 1998
[53]

Learning multiple layers of features from tiny images,

A. Krizhevsky, G. Hintonet al., “Learning multiple layers of features from tiny images,” 2009

work page 2009
[54]

Detection of traffic signs in real-world images: The german traffic sign detection benchmark,

S. Houben, J. Stallkamp, J. Salmen, M. Schlipsing, and C. Igel, “Detection of traffic signs in real-world images: The german traffic sign detection benchmark,” inIJCNN. IEEE, 2013, pp. 1–8

work page 2013
[55]

Tiny imagenet visual recognition challenge,

Y . Le and X. Yang, “Tiny imagenet visual recognition challenge,”CS 231N, vol. 7, no. 7, p. 3, 2015

work page 2015
[56]

Imagenet: A large-scale hierarchical image database,

J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “Imagenet: A large-scale hierarchical image database,” inCVPR, 2009, pp. 248–255

work page 2009
[57]

Deep residual learning for image recognition,

K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” inCVPR, 2016, pp. 770–778

work page 2016
[58]

Inception-v4, inception-resnet and the impact of residual connections on learning,

C. Szegedy, S. Ioffe, V . Vanhoucke, and A. Alemi, “Inception-v4, inception-resnet and the impact of residual connections on learning,” inAAAI, vol. 31, no. 1, 2017

work page 2017
[59]

Very Deep Convolutional Networks for Large-Scale Image Recognition

K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,”arXiv preprint arXiv:1409.1556, 2014

work page internal anchor Pith review Pith/arXiv arXiv 2014
[60]

Efficientnet: Rethinking model scaling for convo- lutional neural networks,

M. Tan and Q. Le, “Efficientnet: Rethinking model scaling for convo- lutional neural networks,” inICML. PMLR, 2019, pp. 6105–6114

work page 2019
[61]

Searching for mobilenetv3,

A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y . Zhu, R. Pang, V . Vasudevanet al., “Searching for mobilenetv3,” in ICCV, 2019, pp. 1314–1324

work page 2019
[62]

Backdoorbench: A comprehensive benchmark of backdoor learning,

B. Wu, H. Chen, M. Zhang, Z. Zhu, S. Wei, D. Yuan, and C. Shen, “Backdoorbench: A comprehensive benchmark of backdoor learning,” NeurIPS, vol. 35, pp. 10 546–10 559, 2022

work page 2022
[63]

Pyod: A python toolbox for scalable outlier detection,

Y . Zhao, Z. Nasrullah, and Z. Li, “Pyod: A python toolbox for scalable outlier detection,”Journal of machine learning research, vol. 20, no. 96, pp. 1–7, 2019

work page 2019
[64]

Can neural nets learn the same model twice? investigating reproducibility and double descent from the decision boundary perspective,

G. Somepalli, L. Fowl, A. Bansal, P. Yeh-Chiang, Y . Dar, R. Baraniuk, M. Goldblum, and T. Goldstein, “Can neural nets learn the same model twice? investigating reproducibility and double descent from the decision boundary perspective,” inCVPR, 2022, pp. 13 699–13 708

work page 2022
[65]

Exploring the orthogonality and linearity of backdoor attacks,

K. Zhang, S. Cheng, G. Shen, G. Tao, S. An, A. Makur, S. Ma, and X. Zhang, “Exploring the orthogonality and linearity of backdoor attacks,” inIEEE S&P, 2024, pp. 2105–2123

work page 2024
[66]

Clean & compact: Efficient data-free backdoor defense with model compactness,

H. Phan, J. Xiao, Y . Sui, T. Zhang, Z. Tang, C. Shi, Y . Wang, Y . Chen, and B. Yuan, “Clean & compact: Efficient data-free backdoor defense with model compactness,” inECCV, 2024

work page 2024
