SamaVaani: Auditing and Debiasing Multilingual Clinical ASR for Indian Languages
Pith reviewed 2026-06-26 04:45 UTC · model grok-4.3
The pith
SamaVaani is a unified debiasing technique that simultaneously improves ASR performance and fairness across demographic groups in clinical interviews.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
After auditing eight models (IndicWhisper, WhisperLargeV3, Sarvam, GoogleS2T, Gemma3n, OmniLingual, Vaani, and Gemini) on real psychiatric interview data, the authors fine-tune Gemma3n and OmniLingual and introduce SamaVaani, a unified debiasing technique that improves both ASR performance and fairness across demographic groups at the same time.
What carries the argument
SamaVaani, the unified debiasing technique implemented through fairness-aware fine-tuning of ASR models.
If this is right
- Clinical ASR systems can be made more accurate for Kannada, Hindi, and Indian English by the same fine-tuning step that improves fairness.
- Gaps tied to speaker role and gender can be narrowed without sacrificing overall word error rate performance.
- Open-source models can be adapted to reduce the fairness shortfalls observed in both open and commercial systems.
- Equitable deployment of ASR for documenting psychiatric encounters becomes feasible across the three languages.
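The gaps these points refer to can be made concrete with a small calculation. The following is a minimal sketch, not the paper's evaluation code: it assumes corpus-level word error rate (total word edits divided by total reference words) computed per demographic group, and takes the gap as the difference between the worst and best group. The field names (`ref`, `hyp`, `gender`) and the toy transcripts are hypothetical.

```python
from collections import defaultdict

def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def group_wer(utterances, key):
    """Corpus-level WER per group: total edits / total reference words."""
    edits = defaultdict(int)
    words = defaultdict(int)
    for u in utterances:
        ref = u["ref"].split()
        hyp = u["hyp"].split()
        edits[u[key]] += word_edit_distance(ref, hyp)
        words[u[key]] += len(ref)
    return {g: edits[g] / words[g] for g in words if words[g] > 0}

def wer_gap(per_group):
    """Difference between the worst and best group WER."""
    return max(per_group.values()) - min(per_group.values())

# Hypothetical toy data, for illustration only.
utterances = [
    {"ref": "i feel tired today", "hyp": "i feel tired today", "gender": "female"},
    {"ref": "how long has this lasted", "hyp": "how long this lasted", "gender": "male"},
]
per_gender = group_wer(utterances, "gender")
gender_gap = wer_gap(per_gender)
```

The same function can be called with a different key (for example a speaker-role field) to obtain the role gap. Whether the paper uses this max-minus-min definition or another fairness metric is not stated in the abstract.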
Where Pith is reading between the lines
- The same audit-plus-fine-tuning pattern could be applied to other medical speech tasks or additional Indian languages.
- Reduced demographic gaps might increase clinician and patient willingness to rely on ASR-generated notes.
- The technique might be tested for side effects on transcription of rare medical terms or code-switching speech.
Load-bearing premise
The performance gaps found in the audit are caused mainly by model-intrinsic biases that fairness-aware fine-tuning can correct, rather than by recording conditions, dataset artifacts, or unmeasured linguistic factors.
What would settle it
Evaluating SamaVaani-fine-tuned models on a held-out psychiatric interview test set and finding neither an overall accuracy gain nor a reduction in the gender or role performance gaps would falsify the central claim.
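The falsification condition can be written as an explicit decision rule. This is a sketch under assumptions: the metric names (`overall_wer`, `gender_gap`, `role_gap`) and the tolerance parameter are hypothetical, and all numbers are presumed to come from the same held-out test set for the baseline and the fine-tuned model.

```python
def central_claim_falsified(base, tuned, tol=0.0):
    """Return True when fine-tuning shows neither an accuracy gain
    nor a reduction in either demographic gap.

    base, tuned: dicts with keys 'overall_wer', 'gender_gap', 'role_gap',
    where lower values are better. tol is the smallest improvement
    that counts as real (an assumption; 0.0 means any decrease counts).
    """
    no_accuracy_gain = tuned["overall_wer"] >= base["overall_wer"] - tol
    no_gap_reduction = all(
        tuned[k] >= base[k] - tol for k in ("gender_gap", "role_gap")
    )
    return no_accuracy_gain and no_gap_reduction

# Hypothetical numbers, for illustration only.
baseline = {"overall_wer": 0.30, "gender_gap": 0.08, "role_gap": 0.05}
unchanged = dict(baseline)
better_wer = {"overall_wer": 0.25, "gender_gap": 0.08, "role_gap": 0.05}
```

In practice the tolerance would be set from the uncertainty of the estimates rather than fixed at zero, so that noise is not mistaken for an improvement.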
Original abstract
Automatic Speech Recognition (ASR) is increasingly used to document clinical encounters, yet its reliability in multilingual and demographically diverse Indian healthcare contexts remains largely unknown. In this study, we first conduct a systematic audit of ASR performance on real-world psychiatric interview data spanning Kannada, Hindi, and Indian English, comparing eight state-of-the-art models: IndicWhisper, WhisperLargeV3, Sarvam, GoogleS2T, Gemma3n, OmniLingual, Vaani, and Gemini. Our results reveal substantial variability across models and languages, with some systems performing competitively in Indian English but failing in regional speech. We further fine-tune two of the best-performing open-source models, i.e., Gemma3n and OmniLingual, using various methods. With this, we uncover systematic performance gaps tied to speaker role and gender, raising concerns about equitable deployment in clinical settings, which are further mitigated by fairness-aware fine-tuning. To this end, we propose SamaVaani, a unified debiasing technique that simultaneously improves ASR performance and fairness across demographic groups.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper audits eight ASR models (including IndicWhisper, WhisperLargeV3, Sarvam, GoogleS2T, Gemma3n, OmniLingual, Vaani, and Gemini) on real-world psychiatric interview data in Kannada, Hindi, and Indian English, identifies substantial performance variability and systematic gaps tied to speaker role and gender, and proposes SamaVaani as a unified fairness-aware fine-tuning technique applied to Gemma3n and OmniLingual that simultaneously boosts overall ASR performance and reduces demographic disparities.
Significance. If the empirical results hold under proper controls, the work would be significant for clinical ASR deployment in multilingual Indian healthcare, where equitable performance across languages and speaker demographics is critical for reliable documentation of psychiatric encounters.
major comments (2)
- [Methods (fine-tuning and evaluation procedure)] The experimental design for both the audit and the SamaVaani fine-tuning uses the same real-world psychiatric interview corpus without explicit held-out sets or controls differing in recording equipment, clinical sub-domain, or unmeasured linguistic factors. This directly undermines the central claim that performance gaps are primarily model-intrinsic biases amenable to the proposed debiasing, rather than dataset or recording artifacts.
- [Results and Experiments] No quantitative results, error bars, dataset sizes, fine-tuning hyperparameters, or validation procedures (e.g., train/test splits, statistical significance tests) are provided, making it impossible to assess whether the reported improvements in performance and fairness are supported by evidence or reproducible.
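One way to address the held-out-set concern raised above is a speaker-disjoint split stratified by demographic attributes, so that no speaker appears in both training and test data and each stratum is represented in the test set. The sketch below is an illustration under assumptions, not the authors' procedure: the speaker table layout, the `test_fraction` default, and the choice of strata (language, gender, role) are all hypothetical.

```python
import random
from collections import defaultdict

def speaker_disjoint_split(speakers, test_fraction=0.2, seed=0):
    """Split speaker IDs into train and test sets, stratified by attributes.

    speakers: dict mapping speaker_id -> (language, gender, role).
    Each stratum with more than one speaker contributes at least one
    test speaker; single-speaker strata stay in the training set.
    """
    strata = defaultdict(list)
    for spk, attrs in sorted(speakers.items()):
        strata[attrs].append(spk)

    rng = random.Random(seed)
    train, test = set(), set()
    for attrs in sorted(strata):
        ids = strata[attrs]
        rng.shuffle(ids)
        n_test = max(1, round(len(ids) * test_fraction)) if len(ids) > 1 else 0
        test.update(ids[:n_test])
        train.update(ids[n_test:])
    return train, test

# Hypothetical speaker table, for illustration only.
speakers = {f"s{i}": ("Hindi", "female", "patient") for i in range(5)}
speakers.update({f"s{i}": ("Kannada", "male", "clinician") for i in range(5, 10)})
train_ids, test_ids = speaker_disjoint_split(speakers)
```

Splitting by speaker rather than by utterance matters here because utterances from one interview share voice, microphone, and room, and leaking them across the split would inflate test scores.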
minor comments (1)
- The abstract names Gemma3n and OmniLingual as the two models selected for fine-tuning but does not state why these two were chosen from the eight audited; clarify the selection criteria.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which identify key areas where the manuscript requires greater rigor. We respond to each major comment below and commit to revisions that will strengthen the experimental description and evidence presentation.
- Referee: [Methods (fine-tuning and evaluation procedure)] The experimental design for both the audit and the SamaVaani fine-tuning uses the same real-world psychiatric interview corpus without explicit held-out sets or controls differing in recording equipment, clinical sub-domain, or unmeasured linguistic factors. This directly undermines the central claim that performance gaps are primarily model-intrinsic biases amenable to the proposed debiasing, rather than dataset or recording artifacts.
  Authors: We agree that the current manuscript lacks explicit detail on held-out sets and controls for recording equipment or other factors, which weakens the ability to isolate model-intrinsic effects. In revision we will expand the Methods section to specify the train/validation/test splits (including ratios and any demographic stratification), describe controls or post-hoc analyses for recording and sub-domain factors where possible, and discuss limitations regarding unmeasured linguistic variables. These changes will clarify the scope of claims about SamaVaani while acknowledging potential dataset influences.
  Revision: yes
- Referee: [Results and Experiments] No quantitative results, error bars, dataset sizes, fine-tuning hyperparameters, or validation procedures (e.g., train/test splits, statistical significance tests) are provided, making it impossible to assess whether the reported improvements in performance and fairness are supported by evidence or reproducible.
  Authors: We acknowledge that these quantitative elements were omitted from the submitted version. The revised manuscript will add a dedicated Results section containing dataset sizes (audio hours and utterance counts per language and demographic), all WER and fairness metrics with error bars or confidence intervals, fine-tuning hyperparameters, explicit train/test split details, and statistical significance tests (e.g., paired tests with p-values). This will enable assessment of reproducibility and the strength of the reported gains.
  Revision: yes
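The confidence intervals and paired tests promised in this response could take several forms. One common option for WER comparisons is a paired bootstrap over utterances, sketched below. This is an assumed technique chosen for illustration, not something the rebuttal specifies; the inputs (per-utterance reference word counts and edit counts for the baseline and fine-tuned models) and all numbers are hypothetical.

```python
import random

def paired_bootstrap_wer_delta(ref_words, base_edits, tuned_edits,
                               n_resamples=2000, seed=0):
    """95% percentile interval for (baseline WER - tuned WER).

    The same resampled utterance indices are used for both systems,
    which is what makes the comparison paired. Assumes every resample
    contains at least one reference word.
    """
    n = len(ref_words)
    rng = random.Random(seed)
    deltas = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        total = sum(ref_words[i] for i in idx)
        base = sum(base_edits[i] for i in idx) / total
        tuned = sum(tuned_edits[i] for i in idx) / total
        deltas.append(base - tuned)
    deltas.sort()
    lower = deltas[int(0.025 * n_resamples)]
    upper = deltas[int(0.975 * n_resamples) - 1]
    return lower, upper

# Hypothetical per-utterance counts, for illustration only.
ref_words = [10] * 20
base_edits = [3] * 20
tuned_edits = [2] * 20
ci_low, ci_high = paired_bootstrap_wer_delta(ref_words, base_edits, tuned_edits)
```

An interval that excludes zero suggests the WER difference is not a resampling artifact. For clinical interviews, resampling whole speakers or sessions instead of single utterances would likely be more appropriate, since utterances from the same speaker are not independent.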
Circularity Check
No circularity: empirical audit and fine-tuning study
The paper reports an empirical audit of eight ASR models on a real-world psychiatric interview corpus in Kannada/Hindi/Indian English, followed by fine-tuning of two models with fairness-aware methods and evaluation of SamaVaani. No equations, derivations, or parameter-fitting steps are described that reduce any claim to its own inputs by construction. No self-citation chains or uniqueness theorems are invoked as load-bearing premises. The central results (performance gaps and mitigation) are presented as direct experimental measurements on the study data, rendering the work self-contained against external benchmarks.