LFRAG advances multimodal RAG to block-level retrieval with layout segmentation and cross-attention fusion, reporting SOTA retrieval, 7.20% higher answer accuracy, and 73.07% lower token consumption on the new LFDocQA benchmark.
Retrieval-augmented generation for knowledge-intensive nlp tasks.Advances in neural information processing systems, 33:9459–9474
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.IR 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
LFRAG: Layout-oriented Fine-grained Retrieval-Augmented Generation on Multimodal Document Understanding
LFRAG advances multimodal RAG to block-level retrieval with layout segmentation and cross-attention fusion, reporting SOTA retrieval, 7.20% higher answer accuracy, and 73.07% lower token consumption on the new LFDocQA benchmark.