Feature reinforcement with word embedding and parsing information in neural TTS

Frank K. Soong; Haohan Guo; Huaiping Ming; Lei He

arxiv: 1901.00707 · v2 · pith:ZGKSFHPKnew · submitted 2019-01-03 · cs.SD · cs.CL · eess.AS

Huaiping Ming, Lei He, Haohan Guo, Frank K. Soong

classification cs.SD · cs.CL · eess.AS

keywords feature · information · method · neural · word · embedding · improves · input

In this paper, we propose a feature reinforcement method under the sequence-to-sequence neural text-to-speech (TTS) synthesis framework. The proposed method uses a multiple-input encoder to take three levels of text information as input features for the neural TTS system: the phoneme sequence, pre-trained word embeddings, and the grammatical structure of sentences produced by a parser. The added word-level and sentence-level information can be viewed as a feature-based pre-training strategy, which enhances the model's generalization ability. In our experiments on out-of-domain text, the proposed method not only improves system robustness significantly but also raises the quality of the synthesized speech to near that of recordings.
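The abstract does not say how the three feature levels are aligned or merged inside the multiple-input encoder. As a rough illustration only, the sketch below assumes the simplest possible scheme: word-level embeddings and per-word parse features are repeated once per phoneme of their word, then concatenated with the phoneme vectors frame by frame. The function names, the toy vectors, and the concatenation strategy are all hypothetical and are not taken from the paper, which may well use separate encoders per input.

```python
from typing import List, Sequence

Vector = List[float]


def expand_to_phonemes(word_features: Sequence[Vector],
                       phonemes_per_word: Sequence[int]) -> List[Vector]:
    """Repeat each word-level vector once per phoneme in that word."""
    if len(word_features) != len(phonemes_per_word):
        raise ValueError("one feature vector is required per word")
    expanded: List[Vector] = []
    for vec, count in zip(word_features, phonemes_per_word):
        expanded.extend([list(vec) for _ in range(count)])
    return expanded


def build_encoder_input(phoneme_vecs: Sequence[Vector],
                        word_embeddings: Sequence[Vector],
                        parse_features: Sequence[Vector],
                        phonemes_per_word: Sequence[int]) -> List[Vector]:
    """Concatenate the three feature levels at phoneme resolution.

    Assumption: plain concatenation; the paper's encoder may differ.
    """
    words_up = expand_to_phonemes(word_embeddings, phonemes_per_word)
    parse_up = expand_to_phonemes(parse_features, phonemes_per_word)
    if not (len(phoneme_vecs) == len(words_up) == len(parse_up)):
        raise ValueError("phoneme count does not match phonemes_per_word")
    return [list(p) + w + s
            for p, w, s in zip(phoneme_vecs, words_up, parse_up)]


# Toy data: two words with 2 and 3 phonemes respectively.
phoneme_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
word_embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # stand-in for pre-trained embeddings
parse_features = [[1.0], [0.0]]                       # stand-in for parser-derived tags
phonemes_per_word = [2, 3]

encoder_input = build_encoder_input(
    phoneme_vecs, word_embeddings, parse_features, phonemes_per_word
)
print(len(encoder_input), len(encoder_input[0]))  # 5 6
```

Each of the five phoneme frames ends up with six dimensions (2 phoneme + 3 word + 1 parse), so a downstream sequence-to-sequence encoder would see word and sentence context at every phoneme position.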
