MeanFlow applied in latent space enables true one-step Token2Wav generation with up to 17x RTF improvement and negligible quality loss versus multi-step baselines.
MSR-Codec: A low-bitrate multi-stream residual codec for high-fidelity speech generation with information disentanglement
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
SDP-Codec decouples speaker attributes from content and prosody via pitch injection in a single-stage pipeline, delivering competitive reconstruction, strong zero-shot voice conversion, and the lowest speaker-probing accuracy at comparable bitrates.
Citing papers explorer
- SDP-Codec: A Speaker-Decoupled Speech Codec with Pitch Injection for Low-Bitrate Coding and Zero-Shot Voice Conversion