arXiv preprint arXiv:1908.01763 (2019) 2
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4
citing papers explorer
- Undetectable Backdoors in Model Parameters: Hiding Sparse Secrets in High Dimensions
Sparse Backdoor plants a provably undetectable backdoor in neural network weights via structured sparse perturbations and isotropic Gaussian dithering, with detection hardness reduced to Sparse PCA.
- Detecting Trojaned DNNs via Spectral Regression Analysis
MIST detects Trojaned DNN updates by measuring spectral deviations in pre-activation representations against a benign fine-tuning reference, achieving high accuracy across datasets and attacks after a single update.
- Identifying Backdoored Graphs in Graph Neural Network Training: An Explanation-Based Approach with Novel Metrics
An explanation-based detector using seven novel metrics derived from GNN explanations identifies backdoored graphs with high performance on benchmark datasets against multiple attack models.
- DeTrigger: A Gradient-Centric Approach to Backdoor Attack Mitigation in Federated Learning
DeTrigger detects and mitigates backdoor attacks in federated learning via gradient analysis and temperature scaling, claiming up to 251x faster detection and 98.9% attack reduction on four datasets with minimal accuracy loss.