Montreal Forced Aligner: Trainable Text-Speech Alignment Using Kaldi
4 Pith papers cite this work.
Citing papers
- Sign-to-Speech Prosody Transfer via Sign Reconstruction-based GAN
  SignRecGAN trains on separate sign and speech datasets via adversarial and reconstruction objectives to inject sign-derived prosody into TTS output using the S2PFormer model.
- WorldSpeech: A Multilingual Speech Corpus from Around the World
  WorldSpeech supplies 65k hours of multilingual aligned speech data across 76 languages and delivers 63.5% average relative WER reduction after fine-tuning ASR models on 11 typologically diverse languages.
- Balalaika: Data-Centric, Prosody-Aware Annotation Pipeline for Russian Speech
  Balalaika is a data-centric annotation pipeline for Russian speech that combines semantic VAD, ASR ensembling, and prosody enrichment to build a 5.1k-hour corpus showing gains in denoising and TTS.
- Phonological Subspace Collapse Is Aetiology-Specific and Cross-Lingually Stable: Evidence from 3,374 Speakers
  Phonological subspace collapse in SSL speech representations produces aetiology-specific degradation profiles that remain stable in shape across languages and model architectures.