PureDocBench shows document parsing is far from solved, with top models at ~74/100, small specialists competing with large VLMs, and ranking reversals under real degradation.
Multimodal ocr: Parse anything from documents
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 4roles
baseline 1polarities
baseline 1representative citing papers
ChartArena is a new benchmark dataset and evaluation protocol for chart parsing by MLLMs that covers numeric and diagrammatic charts in multiple languages and real-world visual conditions.
GlotOCR Bench shows that OCR models perform well on fewer than 10 scripts and fail to generalize beyond about 30, with results tracking pretraining coverage and models hallucinating from known scripts on unfamiliar ones.
MPDocBench-Parse provides 433 annotated multi-page documents and an evaluation protocol covering text/table/formula extraction, merging, figure extraction, reading order, and heading hierarchy for realistic document parsing.
citing papers explorer
No citing papers match the current filters.