PaperFit uses rendered page images in a closed loop to diagnose and repair typesetting defects in LaTeX documents, outperforming baselines on a new benchmark of 200 papers.
SimpleDoc: Multi-Modal Document Understanding with Dual-Cue Page Retrieval and Iterative Refinement
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
A systematic survey of Multimodal RAG for document understanding proposing a taxonomy based on domain, retrieval modality, and granularity while reviewing graph structures, agentic frameworks, datasets, benchmarks, applications, and open challenges.
citing papers explorer
-
PaperFit: Vision-in-the-Loop Typesetting Optimization for Scientific Documents
PaperFit uses rendered page images in a closed loop to diagnose and repair typesetting defects in LaTeX documents, outperforming baselines on a new benchmark of 200 papers.
-
Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding
A systematic survey of Multimodal RAG for document understanding proposing a taxonomy based on domain, retrieval modality, and granularity while reviewing graph structures, agentic frameworks, datasets, benchmarks, applications, and open challenges.