Clustering DINO representations via K-means and LDA filters poisoned speech samples, reducing attack success rate from 99.75% to 0.25% at 10% poisoning level.
Clustering Unsupervised Representations as Defense against Poisoning Attacks on Speech Commands Classification System
1 Pith paper cite this work. Polarity classification is still indexing.
abstract
Poisoning attacks entail attackers intentionally tampering with training data. In this paper, we consider a dirty-label poisoning attack scenario on a speech commands classification system. The threat model assumes that certain utterances from one of the classes (source class) are poisoned by superimposing a trigger on it, and its label is changed to another class selected by the attacker (target class). We propose a filtering defense against such an attack. First, we use DIstillation with NO labels (DINO) to learn unsupervised representations for all the training examples. Next, we use K-means and LDA to cluster these representations. Finally, we keep the utterances with the most repeated label in their cluster for training and discard the rest. For a 10% poisoned source class, we demonstrate a drop in attack success rate from 99.75% to 0.25%. We test our defense against a variety of threat models, including different target and source classes, as well as trigger variations.
fields
cs.SD 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Clustering Unsupervised Representations as Defense against Poisoning Attacks on Speech Commands Classification System
Clustering DINO representations via K-means and LDA filters poisoned speech samples, reducing attack success rate from 99.75% to 0.25% at 10% poisoning level.