Sarashina2.2-TTS achieves SOTA kanji reading accuracy via data scaling and Joyo-kanji-targeted synthesis, introduces the Joyo Kanji Yomi Benchmark and Kana-CER metric, and shows stable cross-lingual performance.
Prosodic features control by sym- bols as input of sequence-to-sequence acoustic modeling for neural tts
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.SD 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Sarashina2.2-TTS: Tackling Kanji Polyphony in Japanese Speech Generation via Data Scaling and Targeted Data Synthesis
Sarashina2.2-TTS achieves SOTA kanji reading accuracy via data scaling and Joyo-kanji-targeted synthesis, introduces the Joyo Kanji Yomi Benchmark and Kana-CER metric, and shows stable cross-lingual performance.