arxiv: 2101.01902 · v3 · pith:LVQNCKTMnew · submitted 2021-01-06 · 💻 cs.SD · cs.LG · eess.AS

Interspeech 2021 Deep Noise Suppression Challenge

classification 💻 cs.SD · cs.LG · eess.AS
keywords challenge · noise · band · interspeech · quality · scenarios · speech · suppression

The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH and ICASSP 2020. We open-sourced training and test datasets for the wideband scenario. We also open-sourced a subjective evaluation framework based on ITU-T standard P.808, which was also used to evaluate participants of the challenge. Many researchers from academia and industry made significant contributions to push the field forward, yet even the best noise suppressor was far from achieving superior speech quality in challenging scenarios. In this version of the challenge organized at INTERSPEECH 2021, we are expanding both our training and test datasets to accommodate full band scenarios. The two tracks in this challenge will focus on real-time denoising for (i) wideband and (ii) full band scenarios. We are also making available a reliable non-intrusive objective speech quality metric called DNSMOS for the participants to use during their development phase.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. DASH: Dual-View Self-Distillation with Multi-Layer Hidden Representations for Robust Speech Recognition

    eess.AS 2026-06 unverdicted novelty 4.0

    DASH applies dual-view self-distillation on multi-layer representations and prototype distributions to boost ASR noise robustness while keeping clean accuracy.

  2. Multimodal Large Language Model-Enabled Video Translation: A Role-Oriented Survey

    cs.CV 2026-04 unverdicted novelty 4.0

    The paper offers the first focused review of MLLM-based video translation organized by a three-role taxonomy of Semantic Reasoner, Expressive Performer, and Visual Synthesizer, plus open challenges.