Voicefixer: Toward general speech restoration with neural vocoder. arXiv:2109.13731
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.SD (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
- Rethinking Training Targets, Architectures and Data Quality for Universal Speech Enhancement
  Replacing early-reflected speech with time-shifted anechoic clean speech as the training target, combined with a two-stage distortion-perception framework, yields state-of-the-art universal speech enhancement.
- SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering
  SonicMaster is a text-conditioned flow-matching generative model for unified music restoration and mastering, trained on a dataset of simulated degradations across equalization, dynamics, reverb, amplitude, and stereo.