Speech Quality Embeddings for Improved Detection and Classification of Degradations in Speech Signals

2026 · eess.AS · arXiv 2605.21332

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Automatic subjective speech quality assessment (SSQA) traditionally estimates speech quality on an utterance or system level. While this resolution was adequate for older transmission or synthesis systems that produced speech signals of mediocre quality, modern systems generate high-quality speech with degradations that may occur only locally. With suitable model architectures and regularization losses, SSQA models trained with utterance-level targets can also yield useful local predictions of speech quality. In this work, we extend such models to produce frame-level embeddings that cluster by degradation type. Specifically, we employ a partial mix-up strategy on a parallel corpus of clean and degraded utterances and apply a contrastive loss to distinguish between degradation types. Through experiments on both in- and out-of-domain data, we demonstrate that our approach improves degradation detection and enables the identification of degradation types by analyzing embedding clusters.
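
The abstract does not spell out how the partial mix-up is carried out, so the following is only a minimal sketch under stated assumptions: a contiguous span of whole frames from the degraded utterance is spliced into its time-aligned clean counterpart, and each frame receives a label naming its source. The function name `partial_mixup`, the parameters `frame_len` and `degradation_id`, and the hard splice (rather than a weighted blend) are all hypothetical choices for illustration, not details taken from the paper.

```python
import random


def partial_mixup(clean, degraded, frame_len, rng, degradation_id=1):
    """Splice one random contiguous span of `degraded` into `clean`.

    Assumption (not from the paper): the mix is a hard splice of whole frames.
    Returns the mixed signal and one label per frame:
    0 for frames kept from `clean`, `degradation_id` for frames from `degraded`.
    """
    if len(clean) != len(degraded):
        raise ValueError("parallel utterances must have equal length")
    n_frames = len(clean) // frame_len
    if n_frames < 1:
        raise ValueError("utterance is shorter than one frame")

    # Choose the span length (in frames) and then a start frame that fits.
    span = rng.randint(1, n_frames)
    start = rng.randint(0, n_frames - span)
    lo, hi = start * frame_len, (start + span) * frame_len

    # Samples outside [lo, hi) stay clean, including any trailing partial frame.
    mixed = list(clean[:lo]) + list(degraded[lo:hi]) + list(clean[hi:])
    labels = [
        degradation_id if start <= f < start + span else 0
        for f in range(n_frames)
    ]
    return mixed, labels


# Toy usage: zeros stand in for clean samples, ones for degraded samples.
rng = random.Random(0)
mixed, labels = partial_mixup([0.0] * 18, [1.0] * 18, frame_len=4, rng=rng)
```

Frame labels of this kind are what a contrastive loss could use to pull together embeddings of frames sharing a degradation type and push apart the rest; the loss itself is not sketched here because the abstract gives no details of its form.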

representative citing papers

Speech Quality Embeddings for Improved Detection and Classification of Degradations in Speech Signals

eess.AS · 2026-05-20 · unverdicted · novelty 6.0

Partial mix-up on clean-degraded speech pairs plus contrastive loss produces frame-level embeddings that cluster by degradation type and improve detection and classification on in- and out-of-domain data.

citing papers explorer

Showing 1 of 1 citing paper.

Speech Quality Embeddings for Improved Detection and Classification of Degradations in Speech Signals

eess.AS · 2026-05-20 · unverdicted · none · ref 1 · internal anchor

Partial mix-up on clean-degraded speech pairs plus contrastive loss produces frame-level embeddings that cluster by degradation type and improve detection and classification on in- and out-of-domain data.
