Multitalker speech separation with utterance-level permutation invariant training of deep recurrent neural networks
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: eess.AS (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Asymmetric Encoder-Decoder Based on Time-Frequency Correlation for Speech Separation
SR-CorrNet introduces an asymmetric TF-domain architecture with a separation-reconstruction strategy and correlation-to-filter estimation that yields consistent gains on WSJ0-Mix, WHAMR!, and LibriCSS under anechoic, noisy-reverberant, and real-recorded conditions.
- Discriminative-Generative Target Speaker Extraction with Decoder-Only Language Models
A hybrid two-stage framework pairs a discriminative front-end for interference suppression with a generative decoder-only LM back-end to improve perceptual quality and speaker consistency in target speaker extraction and speech enhancement.