RIVET enforces an idempotency objective during training of voice attribute editing models to improve robustness to noisy labels, outperforming standard training on controlled noise and the GLOBE dataset.
GLOBE: A high-quality English corpus with global accents for zero-shot speaker adaptive text-to-speech
2 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
SPARCLE builds speaker-aware grapheme representations by contrastively aligning characters with Wav2Vec2 acoustic embeddings conditioned on speaker identity, replacing G2P for TTS and halving WER in low-resource cases.
citing papers explorer
- RIVET: Robust Idempotent Voice Attribute Editing