Backdoors can be realized as statistically natural latent directions in modern neural networks, achieving high attack success with negligible clean accuracy loss and resisting existing defenses.
Injecting undetectable backdoors in obfuscated neural networks and language models.Advances in Neural Information Processing Systems, 37:21537–21571, 2024
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.CR 2years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
Sparse Backdoor plants a provably undetectable backdoor in neural network weights via structured sparse perturbations and isotropic Gaussian dithering, with detection hardness reduced to Sparse PCA.
citing papers explorer
-
Backdoor Channels Hidden in Latent Space: Cryptographic Undetectability in Modern Neural Networks
Backdoors can be realized as statistically natural latent directions in modern neural networks, achieving high attack success with negligible clean accuracy loss and resisting existing defenses.
-
Undetectable Backdoors in Model Parameters: Hiding Sparse Secrets in High Dimensions
Sparse Backdoor plants a provably undetectable backdoor in neural network weights via structured sparse perturbations and isotropic Gaussian dithering, with detection hardness reduced to Sparse PCA.