CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings

2020 · arXiv 2004.09249

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

ZipVoice-Dialog: Non-Autoregressive Spoken Dialogue Generation with Flow Matching

eess.AS · 2025-07-12 · conditional · novelty 6.0

ZipVoice-Dialog is a flow-matching non-autoregressive model for zero-shot spoken dialogue generation that uses curriculum learning and speaker-turn embeddings, paired with a new 6.8k-hour OpenDialog dataset, and reports better speed and quality than autoregressive baselines.

SoulX-Transcriber: A Robust End-to-End Framework for Multi-Speaker Speech Transcription

eess.AS · 2026-06-01 · unverdicted · novelty 4.0

SoulX-Transcriber is a unified LLM framework for end-to-end multi-speaker transcription using two-stage training (speaker-aware pre-training then supervised fine-tuning) that reports strong results on AliMeeting, AISHELL-4, and AMI.

citing papers explorer


ZipVoice-Dialog: Non-Autoregressive Spoken Dialogue Generation with Flow Matching

eess.AS · 2025-07-12 · conditional · none · ref 44
