MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing, 2025

Junbo Niu, Zheng Liu, Zhuangcheng Gu, et al. · 2025

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Towards Real-World Document Parsing via Realistic Scene Synthesis and Document-Aware Training

cs.CV · 2026-03-25 · unverdicted · novelty 6.0

A realistic scene synthesis strategy and document-aware training recipe enable a 1B-parameter MLLM to achieve superior accuracy and robustness in end-to-end parsing of real-world captured documents.

citing papers explorer

Showing 1 of 1 citing paper.

Towards Real-World Document Parsing via Realistic Scene Synthesis and Document-Aware Training

cs.CV · 2026-03-25 · unverdicted · none · ref 32

A realistic scene synthesis strategy and document-aware training recipe enable a 1B-parameter MLLM to achieve superior accuracy and robustness in end-to-end parsing of real-world captured documents.
