arxiv: 2510.08047 · v2 · submitted 2025-10-09 · 📡 eess.AS · cs.CL

Recognition: unknown

Pseudo2Real: Task Arithmetic for Pseudo-Label Correction in Automatic Speech Recognition

keywords: correction, accents, biases, data, domain, model, pseudo-label, pseudo-labeled

Robust ASR under domain shift is crucial because real-world systems encounter unseen accents and domains with limited labeled data. Although pseudo-labeling offers a practical workaround, it often introduces systematic, accent-specific errors that filtering fails to fix. We ask: How can we correct these recurring biases without target ground truth? We propose a simple parameter-space correction: in a source domain containing both real and pseudo-labeled data, two ASR models are fine-tuned from the same initialization, one on ground-truth labels and the other on pseudo-labels, and their weight difference forms a correction vector that captures pseudo-label biases. When applied to a pseudo-labeled target model, this vector enhances recognition, achieving up to a 35% relative Word Error Rate (WER) reduction on AfriSpeech-200 across ten African accents with the Whisper tiny model.
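
The parameter-space correction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: weights are represented as plain dictionaries of float lists rather than real Whisper checkpoints, the function names `correction_vector` and `apply_correction` are hypothetical, and two details are assumptions not stated in the abstract: the sign convention (real-label weights minus pseudo-label weights, added to the target model) and the optional `scale` factor.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def correction_vector(real_ft: Weights, pseudo_ft: Weights) -> Weights:
    """Weight difference between two source-domain models that were
    fine-tuned from the same initialization: one on ground-truth labels,
    one on pseudo-labels. Assumed sign convention: real minus pseudo."""
    return {
        name: [r - p for r, p in zip(real_ft[name], pseudo_ft[name])]
        for name in real_ft
    }


def apply_correction(target_pseudo: Weights, delta: Weights,
                     scale: float = 1.0) -> Weights:
    """Add the (optionally scaled) correction vector to a target-domain
    model that was fine-tuned on pseudo-labels only.
    `scale` is a hypothetical knob, not taken from the abstract."""
    return {
        name: [t + scale * d for t, d in zip(target_pseudo[name], delta[name])]
        for name in target_pseudo
    }


# Toy numbers chosen only to make the arithmetic easy to follow.
source_real = {"w": [1.0, 2.0]}
source_pseudo = {"w": [0.5, 2.5]}
target_pseudo = {"w": [3.0, 4.0]}

delta = correction_vector(source_real, source_pseudo)
corrected = apply_correction(target_pseudo, delta)
half_corrected = apply_correction(target_pseudo, delta, scale=0.5)

print(delta)           # {'w': [0.5, -0.5]}
print(corrected)       # {'w': [3.5, 3.5]}
print(half_corrected)  # {'w': [3.25, 3.75]}
```

With real checkpoints the same element-wise arithmetic would be applied to each tensor in the models' parameter dictionaries, which requires all three models to share one architecture and one set of parameter names.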

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Toward Fair Speech Technologies: A Comprehensive Survey of Bias and Fairness in Speech AI

    eess.AS 2026-05 accept novelty 7.0

    The paper delivers a unified framework for fairness in speech technologies by formalizing seven definitions, organizing research into three paradigms, diagnosing pipeline-specific biases, and mapping mitigations to th...