Small language models are the future of domain-specific NLP

Zhi Zhou et al. · 2023 · arXiv 2305.04787

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

DharmaOCR: Specialized Small Language Models for Structured OCR that outperform Open-Source and Commercial Baselines

cs.CV · 2026-04-15 · unverdicted · novelty 7.0

DharmaOCR models reach 0.925 and 0.911 extraction scores with 0.40% and 0.20% degeneration rates on a new benchmark covering printed, handwritten, and legal documents, outperforming open-source and commercial baselines via SFT plus DPO.

citing papers explorer

Showing 1 of 1 citing paper.

DharmaOCR: Specialized Small Language Models for Structured OCR that outperform Open-Source and Commercial Baselines

cs.CV · 2026-04-15 · unverdicted · none · ref 10
