Fine-tuned Whisper and PyAnnote models reach WER 0.2441 and DER 0.2392 on Bangla long-form speech, showing gains over pretrained versions.
Bredin, ``pyannote.audio 2.1 speaker diarization pipeline: principle, benchmark, and recipe,'' in Proc
Cited by 1 Pith paper (field: cs.SD; year: 2026):
Bangla-WhisperDiar: Fine-Tuning Whisper and PyAnnote for Bangla Long-Form Speech Recognition and Speaker Diarization