SignRecGAN trains on separate sign and speech datasets via adversarial and reconstruction objectives to inject sign-derived prosody into TTS output using the S2PFormer model.
In: Interspeech 2022
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.SD 3years
2026 3verdicts
UNVERDICTED 3representative citing papers
CleanCodec reframes audio tokenization as a selective information bottleneck to encode only perceptually important features at 12.5 tokens per second, outperforming prior codecs in efficiency, speaker similarity, and intelligibility.
EntangleCodec unifies semantic and acoustic audio tokenization via caption alignment and flow-matching decoding, reporting competitive reconstruction, +7.4% gains on MMAR understanding, and 0.6B-parameter ALMs surpassing 13B-parameter continuous baselines.
citing papers explorer
-
Sign-to-Speech Prosody Transfer via Sign Reconstruction-based GAN
SignRecGAN trains on separate sign and speech datasets via adversarial and reconstruction objectives to inject sign-derived prosody into TTS output using the S2PFormer model.
-
CleanCodec: Efficient and Robust Speech Tokenization via Perceptually Guided Encoding
CleanCodec reframes audio tokenization as a selective information bottleneck to encode only perceptually important features at 12.5 tokens per second, outperforming prior codecs in efficiency, speaker similarity, and intelligibility.
-
EntangleCodec: A Unified Discrete Audio Tokenizer via Semantic-Acoustic Entanglement
EntangleCodec unifies semantic and acoustic audio tokenization via caption alignment and flow-matching decoding, reporting competitive reconstruction, +7.4% gains on MMAR understanding, and 0.6B-parameter ALMs surpassing 13B-parameter continuous baselines.