arxiv: 1609.08441 · v2 · pith:LY57ADMWnew · submitted 2016-09-27 · 💻 cs.LG · cs.AI · cs.CL · cs.SD

Weakly Supervised PLDA Training

classification 💻 cs.LG · cs.AI · cs.CL · cs.SD
keywords training · plda · weak · approach · data · cheap · different · human-labelled

PLDA is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification. However, PLDA training requires a large amount of labelled development data, which is expensive to obtain in most cases. We present a cheap PLDA training approach, which assumes that speakers in the same session can be easily separated and that speakers in different sessions are simply different. This results in "weak labels", which are cheap but not fully accurate, and hence in a weak PLDA training. Our experimental results on real-life, large-scale telephony customer service archives demonstrate that the weak training can offer good performance when human-labelled data are limited. More interestingly, the weak training can be employed as a discriminative adaptation approach, which is more efficient than the prevailing unsupervised method when human-labelled data are insufficient.
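The weak-labelling assumption in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `assign_weak_labels`, the session identifiers, and the within-session tags such as "agent" and "customer" are hypothetical, and the sketch assumes that some within-session separation (for example by channel) is already available. Each distinct (session, within-session speaker) pair receives its own speaker label, so a person who appears in two sessions is deliberately treated as two different speakers.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def assign_weak_labels(
    utterances: Iterable[Tuple[Hashable, Hashable]],
) -> List[int]:
    """Assign one weak speaker label per (session, within-session speaker) pair.

    `utterances` yields (session_id, local_speaker) tuples. Both fields are
    hypothetical inputs: the abstract only states that speakers within a
    session can be easily separated, not how.
    """
    label_of: Dict[Tuple[Hashable, Hashable], int] = {}
    labels: List[int] = []
    for session_id, local_speaker in utterances:
        key = (session_id, local_speaker)
        # A new (session, speaker) pair always opens a new weak speaker class,
        # even if the same person already appeared in another session.
        if key not in label_of:
            label_of[key] = len(label_of)
        labels.append(label_of[key])
    return labels


# Illustrative data only: two sessions, two separated speakers in each.
example = [
    ("s1", "agent"),
    ("s1", "customer"),
    ("s1", "agent"),
    ("s2", "agent"),
    ("s2", "customer"),
]
weak = assign_weak_labels(example)
```

Here the "agent" in `s1` and the "agent" in `s2` may well be the same person, yet they receive different labels. That is the sense in which the labels are cheap but not fully accurate; the resulting label vector would then stand in for human labels when training PLDA on the corresponding i-vectors.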
