Contextual Earnings-22: A Speech Recognition Benchmark with Custom Vocabulary in the Wild

2026 · cs.CL · arXiv 2604.07354

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

The accuracy frontier of speech-to-text systems has plateaued on academic benchmarks. In contrast, industrial benchmarks and adoption in high-stakes domains suggest otherwise. We hypothesize that the primary difference between the two is contextual conditioning: academic benchmarks are dominated by frequently encountered general vocabulary that is relatively easy to recognize compared with rare and context-defined custom vocabulary that has disproportionate impact on the usability of speech transcripts. Despite progress on contextual speech-to-text, there is no standardized benchmark. We introduce Contextual Earnings-22, an open dataset built upon Earnings-22, with realistic custom vocabulary contexts to foster research and reveal latent progress. We set six strong baselines for two dominant approaches: keyword prompting and keyword boosting. Experiments show both reach comparable and significantly improved accuracy when scaled from proof-of-concept to large-scale systems.

representative citing papers

Contextual Earnings-22: A Speech Recognition Benchmark with Custom Vocabulary in the Wild

cs.CL · 2026-03-28 · unverdicted · novelty 7.0

Contextual Earnings-22 is a new benchmark dataset for speech recognition with custom vocabulary. Its baselines show that keyword prompting and keyword boosting reach comparable and significantly improved accuracy when scaled from proof-of-concept to large-scale systems.

