A Deep Neural Network for Short-Segment Speaker Recognition

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it
abstract

Today's interactive devices, such as smartphone assistants and smart speakers, often deal with short-duration speech segments. As a result, speaker recognition systems integrated into such devices are much better served by models that can perform recognition on short-duration utterances. In this paper, a new deep neural network, UtterIdNet, capable of performing speaker recognition on short speech segments is proposed. The proposed model uses a novel architecture that suits short-segment speaker recognition by efficiently making greater use of the information in short speech segments. UtterIdNet has been trained and tested on the VoxCeleb datasets, the latest benchmarks in speaker recognition. Evaluations across segment durations show consistent and stable performance for short segments, with significant improvement over previous models for segments of 2 seconds, 1 second, and especially sub-second durations (250 ms and 500 ms).

fields

eess.AS 1

years

2019 1

verdicts

UNVERDICTED 1

representative citing papers

  • A Deep Neural Network for Short-Segment Speaker Recognition eess.AS · 2019-07-22 · unverdicted · none · ref 2 · internal anchor

    UtterIdNet is a DNN that delivers consistent speaker recognition on VoxCeleb for segments down to 250 ms, with reported gains over prior models especially at sub-second lengths.