DNSMOS P.835: A non-intrusive perceptual objective speech quality metric to evaluate noise suppressors
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: eess.AS (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Unmixing the Crowd: Learning Mixture-to-Set Speaker Embeddings for Enrollment-Free Target Speech Extraction
A neural model predicts a set of speaker embeddings from noisy mixtures to enable enrollment-free target speech extraction, outperforming baselines on LibriMix and generalizing to real recordings.