pith. machine review for the scientific record.

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

Document parsing (DP) transforms unstructured or semi-structured documents into structured, machine-readable representations, enabling downstream applications such as knowledge base construction and retrieval-augmented generation (RAG). This survey provides a comprehensive and timely review of document parsing research. We propose a systematic taxonomy that organizes existing approaches into modular pipeline-based systems and unified models driven by Vision-Language Models (VLMs). We provide a detailed review of key components in pipeline systems, including layout analysis and the recognition of heterogeneous content such as text, tables, mathematical expressions, and visual elements, and then systematically track the evolution of specialized VLMs for document parsing. Additionally, we summarize widely adopted evaluation metrics and high-quality benchmarks that establish current standards for parsing quality. Finally, we discuss key open challenges, including robustness to complex layouts, reliability of VLM-based parsing, and inference efficiency, and outline directions for building more accurate and scalable document intelligence systems.

Citing papers by year: 2026 (3)

Citing papers by verdict: UNVERDICTED (3)
