Pith Number

pith:3OXKAQ6V

pith:2026:3OXKAQ6VYA4BGR6U64NAECXBJT
not attested · not anchored · not stored · refs resolved

Vividh-ASR: A Complexity-Tiered Benchmark and Optimization Dynamics for Robust Indic Speech Recognition

Aditya Srinivas Menon, Arghya Bhattacharya, Kavya Manohar, Kumarmanas Nethil, Kush Juvekar

Reverse multi-stage fine-tuning lets a 244M Whisper model match or exceed 769M counterparts on a tiered Indic speech benchmark.

arxiv:2605.13087 v1 · 2026-05-13 · cs.CL · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{3OXKAQ6VYA4BGR6U64NAECXBJT}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Reverse multi-stage fine-tuning (R-MFT) is a training recipe that enables a parameter-efficient 244M Whisper model to match or exceed conventionally fine-tuned 769M counterparts.

C2 weakest assumption

That the four complexity tiers in Vividh-ASR sufficiently represent the distribution of real-world usage for Indic ASR and that the observed gains from early large updates and hard-to-easy ordering will hold for other languages, models, and deployment conditions.

C3 one-line summary

Vividh-ASR benchmark and reverse multi-stage fine-tuning enable smaller Whisper models to match larger ones on complex Indic speech by concentrating adaptation in the decoder.

References

29 extracted · 29 resolved · 1 Pith anchor

[1] However, zero-shot word error rates (WER) for many Indic languages often exceed 100%
[2] Vividh-ASR: A Complexity-Tiered Benchmark and Optimization Dynamics for Robust Indic Speech Recognition 2026 · arXiv:2605.13087
[3] The Vividh-ASR Benchmark. Vividh-ASR is a diagnostic benchmark organized by acoustic and prosodic complexity rather than by domain. It targets Hindi and Malayalam, representing the Indo-Aryan and Dra-
[4] However, when adapting to low-resource languages with complex phonotactics, the model. Table 1: Data distribution in hours
[5] Learning Rate Effect. Figure 2 shows training loss for the Malayalam Whisper-medium model (representative; Hindi and Whisper-small exhibit identical trends)

Formal links

2 machine-checked theorem links

Cited by

1 paper in Pith

Receipt and verification
First computed: 2026-05-18T03:08:58.522237Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

dbaea043d5c0381347d4f71a020ae14cf72656b722cb1666668063eadfa84f06

Aliases

arxiv: 2605.13087 · arxiv_version: 2605.13087v1 · doi: 10.48550/arxiv.2605.13087 · pith_short_12: 3OXKAQ6VYA4B · pith_short_16: 3OXKAQ6VYA4BGR6U · pith_short_8: 3OXKAQ6V
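
The short aliases above read as prefixes of the full Pith Number, and the number itself appears to match the leading base32 characters of the canonical hash. This relationship is inferred from the values on this page, not from a published specification, so the sketch below is an observation rather than a definition. The function name `pith_number_from_hash` and the 26-character length are assumptions introduced for illustration.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "dbaea043d5c0381347d4f71a020ae14cf72656b722cb1666668063eadfa84f06"
PITH_NUMBER = "3OXKAQ6VYA4BGR6U64NAECXBJT"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed derivation: RFC 4648 base32 of the digest bytes, truncated.

    This rule is inferred from the values on the page, not documented here.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = pith_number_from_hash(CANONICAL_HASH)

# The pith_short_N aliases are taken here to be the first N characters.
aliases = {n: PITH_NUMBER[:n] for n in (8, 12, 16)}

print("derived matches Pith Number:", derived == PITH_NUMBER)
for n, alias in aliases.items():
    print(f"pith_short_{n}: {alias}")
```

With the values shown on this page, the derived string agrees with the Pith Number and the three prefixes agree with the listed aliases. Other records may not follow the same rule, so a mirror should rely on the published schema rather than this inference.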
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/3OXKAQ6VYA4BGR6U64NAECXBJT \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: dbaea043d5c0381347d4f71a020ae14cf72656b722cb1666668063eadfa84f06
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "42d96ca1991276f52c29b0feee8ceb1d563f5b451840ccbff07e65fa8b32a2e0",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by-sa/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-13T06:55:55Z",
    "title_canon_sha256": "4409553652e432a0288fbbc3685db26c0fe22ef9e9445256fb4cce3bad0e9a8e"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13087",
    "kind": "arxiv",
    "version": 1
  }
}
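
The shell one-liner above can also be run offline against the record printed on this page. The following is a minimal sketch that uses the same serialization settings as that one-liner (sorted keys, compact separators, no ASCII escaping, UTF-8). The helper name `canonical_sha256` is hypothetical, and the script reports whether the digest matches the published hash rather than assuming that it does.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """SHA-256 over compact, key-sorted UTF-8 JSON, mirroring the one-liner."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "42d96ca1991276f52c29b0feee8ceb1d563f5b451840ccbff07e65fa8b32a2e0",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by-sa/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-13T06:55:55Z",
        "title_canon_sha256": "4409553652e432a0288fbbc3685db26c0fe22ef9e9445256fb4cce3bad0e9a8e",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13087", "kind": "arxiv", "version": 1},
}

PUBLISHED_HASH = "dbaea043d5c0381347d4f71a020ae14cf72656b722cb1666668063eadfa84f06"

digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == PUBLISHED_HASH)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields appear in the source JSON; any change to a value, including whitespace inside a string, changes the digest.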