arxiv: 1911.01255 · v1 · pith:U7RBMFVBnew · submitted 2019-11-04 · 📡 eess.AS · cs.SD

pyannote.audio: neural building blocks for speaker diarization

classification: eess.AS, cs.SD
keywords: speaker, audio, detection, diarization, pyannote, blocks, building, neural

We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pre-trained models covering a wide range of domains for voice activity detection, speaker change detection, overlapped speech detection, and speaker embedding, reaching state-of-the-art performance for most of them.
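To make the idea of a detection building block more concrete, here is a minimal sketch of one step that such a pipeline might include: turning per-frame speech scores (as a voice activity detector could produce) into speech segments by thresholding. This is not pyannote.audio's API; the function name `binarize`, the `frame_duration` and `threshold` parameters, and the example scores are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def binarize(
    scores: List[float],
    frame_duration: float,
    threshold: float = 0.5,
) -> List[Tuple[float, float]]:
    """Turn per-frame speech scores into (start, end) segments in seconds.

    Hypothetical helper, not part of pyannote.audio. A frame counts as
    speech when its score is greater than or equal to `threshold`.
    """
    segments: List[Tuple[float, float]] = []
    start = None  # start time of the currently open speech segment, if any
    for i, score in enumerate(scores):
        active = score >= threshold
        if active and start is None:
            # Speech begins at this frame.
            start = i * frame_duration
        elif not active and start is not None:
            # Speech ended just before this frame.
            segments.append((start, i * frame_duration))
            start = None
    if start is not None:
        # Close a segment that runs until the end of the scores.
        segments.append((start, len(scores) * frame_duration))
    return segments


# Made-up scores for six frames of 0.5 s each.
scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.6]
print(binarize(scores, frame_duration=0.5))  # [(0.5, 1.5), (2.0, 3.0)]
```

The same thresholding idea could in principle apply to other frame-level detection outputs named in the abstract, such as overlapped speech detection, though the toolkit's actual post-processing may differ.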
