Emotional End-to-End Neural Speech Synthesizer

Azam Rabiee; Soo-Young Lee; Younggun Lee

arxiv: 1711.05447 · v2 · pith:FSRB5VDPnew · submitted 2017-11-15 · 💻 cs.SD · cs.CL · eess.AS

Younggun Lee, Azam Rabiee, Soo-Young Lee

classification 💻 cs.SD cs.CL eess.AS

keywords neural speech emotional end-to-end model problem synthesizer tacotron

In this paper, we introduce an emotional speech synthesizer based on Tacotron, a recent end-to-end neural model. Despite its benefits, we found that the original Tacotron suffers from the exposure bias problem and irregular attention alignment. We address these problems by utilizing the context vector and residual connections in the recurrent neural networks (RNNs). Our experiments showed that the model could be trained successfully and could generate speech for given emotion labels.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training
eess.AS 2019-06 unverdicted novelty 5.0

GST-Tacotron with cross-entropy loss on style tokens outperforms standard Tacotron for emotional speech synthesis with only 5% emotion-labeled data and approaches full-label performance.

Intelligent Agents with Emotional Intelligence: Current Trends, Challenges, and Future Prospects
cs.HC 2025-10 unverdicted novelty 2.0

A holistic survey of affective computing for intelligent agents covering emotion understanding via multimodal data, affective cognition, emotional expression synthesis, key challenges, and future directions emphasizin...