pyannote.audio: neural building blocks for speaker diarization

Diego Fustes; Gregory Gelly; Hadrien Titeux; Hervé Bredin; Juan Manuel Coria; Marie-Philippe Gill; Marvin Lavechin; Pavel Korshunov; Ruiqing Yin; Wassim Bouaziz

arXiv: 1911.01255 · v1 · pith:U7RBMFVBnew · submitted 2019-11-04 · eess.AS · cs.SD

keywords: speaker, audio, detection, diarization, pyannote, blocks, building, neural

We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pre-trained models covering a wide range of domains for voice activity detection, speaker change detection, overlapped speech detection, and speaker embedding, reaching state-of-the-art performance for most of them.
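
To make the idea of combining building blocks into a pipeline more concrete, here is a minimal toy sketch in plain Python. It is not pyannote.audio's API: the function names (`speech_segments`, `split_at_changes`, `toy_pipeline`), the frame-index representation, and the 0.5 threshold are all hypothetical, chosen only for illustration. The per-frame speech scores and the change points are assumed to come from a voice activity detection block and a speaker change detection block, respectively.

```python
from typing import List, Tuple

# Hypothetical representation: a segment is [start, end) in frame indices.
Segment = Tuple[int, int]


def speech_segments(scores: List[float], threshold: float = 0.5) -> List[Segment]:
    """Turn per-frame speech scores into contiguous speech segments."""
    segments: List[Segment] = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a speech region begins
        elif score < threshold and start is not None:
            segments.append((start, i))  # the speech region ends
            start = None
    if start is not None:
        segments.append((start, len(scores)))  # speech runs to the last frame
    return segments


def split_at_changes(segments: List[Segment], change_points: List[int]) -> List[Segment]:
    """Cut each speech segment at the change points strictly inside it."""
    turns: List[Segment] = []
    for start, end in segments:
        cuts = sorted(c for c in change_points if start < c < end)
        bounds = [start, *cuts, end]
        turns.extend(zip(bounds[:-1], bounds[1:]))
    return turns


def toy_pipeline(speech_scores: List[float], change_points: List[int],
                 threshold: float = 0.5) -> List[Segment]:
    """Chain the two toy blocks: speech detection, then splitting at changes."""
    return split_at_changes(speech_segments(speech_scores, threshold), change_points)


# Illustrative input only: made-up scores and one made-up change point.
scores = [0.1, 0.9, 0.8, 0.7, 0.2, 0.6, 0.9, 0.1]
turns = toy_pipeline(scores, change_points=[3])
print(turns)
```

In a full diarization pipeline, the resulting speech turns would then be grouped by speaker, for instance by clustering speaker embeddings; that step is left out of this sketch.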
