DiDiSpeech: A Large Scale Mandarin Speech Corpus

Cheng Gong; Cheng Wen; Dongwei Jiang; Kun Han; Ne Luo; Ruixiong Zhang; Shuaijiang Zhao; Tingwei Guo; Wei Zou; Wubo Li

arxiv: 2010.09275 · v4 · pith:5SE5Q7W5new · submitted 2020-10-19 · 📡 eess.AS

Tingwei Guo, Cheng Wen, Dongwei Jiang, Ne Luo, Ruixiong Zhang, Shuaijiang Zhao, Wubo Li, Cheng Gong, Wei Zou, Kun Han, Xiangang Li

keywords: speech, corpus, data, didispeech, mandarin, research, tasks, academic

This paper introduces a new open-source Mandarin speech corpus called DiDiSpeech. It consists of about 800 hours of speech data at a 48 kHz sampling rate from 6000 speakers, together with the corresponding texts. All speech data in the corpus was recorded in a quiet environment and is suitable for various speech processing tasks, such as voice conversion, multi-speaker text-to-speech, and automatic speech recognition. We conduct experiments on multiple speech tasks and evaluate the performance, showing that the corpus is promising for both academic research and practical applications. The corpus is available at https://outreach.didichuxing.com/research/opendata/.
