Robust Accent Identification via Voice Conversion and Non-Timbral Embeddings

2026 · eess.SP · arXiv 2604.25332

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Automatic accent identification (AID) remains a challenging task due to the complex variability of accents, the entanglement of accent cues with speaker traits, and the scarcity of reliable accent-labelled data. To address these challenges, we propose a speaker augmentation strategy using voice conversion (VC), in which we generate additional training data by converting original training utterances into different speaker voices while preserving accentual cues. For this purpose, we select two recent VC systems and evaluate their capability to preserve accent. We also explore the use of non-timbral embeddings in AID, for their ability to convey accent information among other non-timbral cues. The effectiveness of both methods is demonstrated on the GenAID benchmark, achieving a new state-of-the-art F1-score of 0.66, compared to the previous score of 0.55. Beyond AID, we show that non-timbral embeddings enable accent-controlled Text-to-Speech, producing high-fidelity speech with accurate accent transfer.
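
The speaker augmentation idea described in the abstract can be pictured with a minimal sketch. Everything named here is an assumption for illustration: the `Utterance` record, the `augment_with_vc` helper, and the `convert` callable (a stand-in for whichever VC system is used) are hypothetical and are not taken from the paper. The sketch only shows the bookkeeping: each converted copy gets a new speaker identity while keeping the original accent label.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Utterance:
    """Hypothetical training record: raw samples plus speaker and accent labels."""
    audio: Tuple[float, ...]
    speaker: str
    accent: str


def augment_with_vc(
    utterances: Sequence[Utterance],
    target_speakers: Sequence[str],
    convert: Callable[[Tuple[float, ...], str], Tuple[float, ...]],
    copies_per_utterance: int = 1,
    seed: int = 0,
) -> List[Utterance]:
    """Return the original utterances followed by voice-converted copies.

    `convert(audio, target_speaker)` is a placeholder for a real VC system;
    it is assumed (not guaranteed) to preserve accentual cues.
    """
    rng = random.Random(seed)
    augmented = list(utterances)  # keep the original data untouched
    for utt in utterances:
        # Never "convert" an utterance to its own speaker.
        candidates = [s for s in target_speakers if s != utt.speaker]
        n_copies = min(copies_per_utterance, len(candidates))
        for target in rng.sample(candidates, n_copies):
            augmented.append(
                Utterance(
                    audio=convert(utt.audio, target),
                    speaker=target,      # speaker identity changes
                    accent=utt.accent,   # accent label is carried over
                )
            )
    return augmented
```

Whether the carried-over accent label remains valid depends entirely on how well the chosen VC system preserves accent, which is the property the abstract says the authors evaluate.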

representative citing papers

Robust Accent Identification via Voice Conversion and Non-Timbral Embeddings

eess.SP · 2026-04-28 · unverdicted · novelty 6.0

Voice conversion data augmentation and non-timbral embeddings raise automatic accent identification to a new state-of-the-art F1 of 0.66 on GenAID while supporting accent-controlled TTS.

citing papers explorer

Showing 1 of 1 citing paper.

Robust Accent Identification via Voice Conversion and Non-Timbral Embeddings eess.SP · 2026-04-28 · unverdicted · none · ref 2 · internal anchor
