
arxiv: 2606.03504 · v1 · pith:O6SPJ5VZnew · submitted 2026-06-02 · 💻 cs.CL · cs.AI

BaltiVoice: A Speech Corpus and Fine-tuned Whisper ASR System for the Balti Language

classification 💻 cs.CL cs.AI
keywords: corpus, balti, available, baltivoice, fine-tuned, language, publicly, utterances

We present BaltiVoice, a 16.8-hour read-speech corpus for Balti (ISO 639-3: bft), a Tibetic language spoken in Gilgit-Baltistan, Pakistan, with no prior publicly available ASR resources. The corpus contains 10,060 validated utterances in native Nastaliq script, derived from Mozilla Common Voice recordings. We fine-tune OpenAI Whisper-small on this corpus and report a Word Error Rate (WER) of 30.07% on a held-out validation set of 538 utterances, down from a measured zero-shot baseline of 182.18% for Whisper-small on Balti. The dataset, fine-tuned model, and a live transcription demo are publicly available on HuggingFace.
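A zero-shot WER above 100% can look surprising, so a small sketch of the metric may help. WER is conventionally the number of word-level substitutions, deletions, and insertions divided by the number of reference words, so a hypothesis that is much longer than the reference (many insertions) can push it past 100%. The function below is a minimal illustration under that standard definition; the name `word_error_rate` is hypothetical, whitespace tokenization is an assumption, and the abstract does not state which tooling or text normalization the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are split on whitespace, with no further normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real Balti text.
exact = word_error_rate("a b c", "a b c")         # identical transcripts
overlong = word_error_rate("a b", "x y z w")      # 2 substitutions + 2 insertions
print(f"{exact:.2%}  {overlong:.2%}")
```

In the second call the hypothesis contains four wrong words against a two-word reference, so the error count (4) exceeds the reference length (2) and the ratio is 2.0, i.e. 200%. This is the same mechanism by which a zero-shot model that emits long, unrelated output can score well above 100%.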
