2nd VoicePrivacy Challenge Evaluation Plan
4 Pith papers cite this work.
fields
eess.AS 4
citing papers explorer
- Evaluating voice anonymisation using similarity rank disclosure
  SRD provides a threshold-independent, representation-level privacy assessment for voice anonymization that reveals system weaknesses not detected by equal error rate evaluation.
- VoxATtack: A Multimodal Attack on Voice Anonymization Systems
  A dual-branch multimodal model combining ECAPA-TDNN on anonymized audio and BERT on transcripts outperforms prior attackers on five of seven VPAC benchmarks and reaches SOTA with augmentation.
- Perceptual implications of automatic anonymization in pathological speech
  Listeners detect automatic anonymization in pathological speech at 91-93% accuracy with a 30-point perceived quality drop, yet clinical severity ratings stay nearly unchanged for dysarthria, dysglossia, and dysphonia.
- Anonymization, Not Elimination: Utility-Preserved Speech Anonymization
  A two-stage framework replaces personally identifiable information via generative editing and anonymizes voices with a flow-matching model to achieve stronger privacy than VoicePrivacy baselines while keeping utility high for retrained ASR, TTS, and SER models.