The paper delivers a unified framework for fairness in speech technologies by formalizing seven definitions, organizing research into three paradigms, diagnosing pipeline-specific biases, and mapping mitigations to those sources.
Quantifying bias in automatic speech recognition
7 Pith papers cite this work.
citation roles: background (2)
representative citing papers
ASR bias causes users from underrepresented dialects to internalize failures as personal inadequacy and perform extensive emotional and linguistic labor, revealing harms missed by accuracy-only evaluations.
Speech translation (ST) models override masculine internal language model (ILM) biases with acoustic input, using first-person pronouns to link terms to the speaker and accessing gender cues across the full frequency spectrum rather than pitch alone.
The authors perform the first systematic bias evaluation in speech continuation tasks across three models, revealing gender interactions in text metrics and stronger reversion to modal phonation for female prompts.
Evaluation of WhisperIPA and ZIPA reveals persistent performance gaps across languages, accents, gender, ethnicity, and age even after allowing for similar phoneme substitutions.
Random phoneme substitutions recover most ASR gains from synthetic accented speech, with targeted edits and ground-truth prosody providing only marginal additional benefits.
Omnimodal models show less demographic bias in image and video tasks than in audio tasks, where biases are substantial and performance is lower.
citing papers explorer
- "This Wasn't Made for Me": Recentering User Experience and Emotional Impact in the Evaluation of ASR Bias
  ASR bias causes users from underrepresented dialects to internalize failures as personal inadequacy and perform extensive emotional and linguistic labor, revealing harms missed by accuracy-only evaluations.
- Evaluating Bias in Phoneme-Based Automatic Speech Recognition Systems: An Analysis of IPA Transcription Models
  Evaluation of WhisperIPA and ZIPA reveals persistent performance gaps across languages, accents, gender, ethnicity, and age even after allowing for similar phoneme substitutions.