EviProp: Seeded Relevance Diffusion on Chunk-Page Graphs for Long Multimodal Document Retrieval

Botian Shi; Fuke Shen; Guohang Yan; Hongwei Zhang; Pinlong Cai; Ruicheng Zhu; Tongquan Wei; Xiaoman Wang; Yue Zhang; Zehui Ling

arxiv: 2606.08979 · v1 · pith:DC2VUGFUnew · submitted 2026-06-08 · 💻 cs.IR





keywords: retrieval, eviprop, document, pages, relevance, visual, chunk-page, diffusion


Abstract

Retrieving evidence pages from visually rich long documents is a key challenge in document question answering. Existing page-level visual retrievers operate under an independent matching paradigm: each page is scored in isolation based on query-page similarity. This paradigm can under-rank evidence pages whose signals are localized in fine-grained chunks or depend on document-internal associations. We propose EviProp, a retrieval method that recovers such pages via seeded relevance diffusion. EviProp models each document as a multimodal Chunk-Page graph with hierarchical, sequential, and similarity links. Given a query, it combines dense visual page priors with sparse chunk seeds, then runs Personalized PageRank to diffuse relevance over the graph. Experiments on MMLongBench-Doc and LongDocURL show consistent gains in evidence-page retrieval over independent visual retrieval and text-visual fusion baselines. Downstream QA results further show that improved retrieval translates into better answer accuracy, with negligible online retrieval overhead. Our code is released at https://github.com/Flyecnu/EviProp.
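To make the diffusion step concrete, here is a minimal sketch of seeded Personalized PageRank on a toy Chunk-Page graph. It is not taken from the released code: the function names (`build_adjacency`, `personalized_pagerank`), the mixing weight `lam`, the restart probability `alpha`, the unit edge weights, and all scores are assumptions made for illustration only. The abstract does not specify how the page priors and chunk seeds are combined, so the simple convex mix below is a stand-in.

```python
import numpy as np

def build_adjacency(page_of_chunk, n_pages, sim_edges=()):
    """Symmetric adjacency for a toy Chunk-Page graph.

    Nodes 0..n_pages-1 are pages; chunk c is node n_pages + c.
    Edge weights are illustrative assumptions, not values from the paper.
    """
    n = n_pages + len(page_of_chunk)
    A = np.zeros((n, n))
    # Sequential links between consecutive pages.
    for p in range(n_pages - 1):
        A[p, p + 1] = A[p + 1, p] = 1.0
    # Hierarchical links between each chunk and the page that contains it.
    for c, p in enumerate(page_of_chunk):
        A[n_pages + c, p] = A[p, n_pages + c] = 1.0
    # Optional similarity links, given as (node_i, node_j, weight).
    for i, j, w in sim_edges:
        A[i, j] = A[j, i] = w
    return A

def personalized_pagerank(A, seed, alpha=0.15, tol=1e-10, max_iter=1000):
    """Power iteration for r = alpha * seed + (1 - alpha) * P @ r."""
    seed = seed / seed.sum()
    col_sums = A.sum(axis=0)
    safe = np.where(col_sums > 0, col_sums, 1.0)
    # Column-stochastic transitions; nodes with no edges restart at the seed.
    P = np.where(col_sums > 0, A / safe, seed[:, None])
    r = seed.copy()
    for _ in range(max_iter):
        r_next = alpha * seed + (1 - alpha) * (P @ r)
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

# Toy document: 4 pages, 5 chunks (chunks 2 and 3 both sit on page 2).
n_pages = 4
page_of_chunk = [0, 1, 2, 2, 3]
A = build_adjacency(page_of_chunk, n_pages)

# Hypothetical query scores: a dense prior over pages, a sparse seed on chunks.
page_prior = np.array([0.5, 0.2, 0.1, 0.2])
chunk_seed = np.array([0.0, 0.0, 1.0, 0.0, 0.0])

lam = 0.5  # assumed mixing weight between page priors and chunk seeds
seed = np.concatenate([lam * page_prior / page_prior.sum(),
                       (1 - lam) * chunk_seed / chunk_seed.sum()])

scores = personalized_pagerank(A, seed)
page_scores = scores[:n_pages]
ranking = np.argsort(-page_scores)
print("page ranking after diffusion:", ranking.tolist())
```

In this toy setup the page-level prior alone ranks page 0 first, while the seeded chunk on page 2 pushes relevance onto its parent page during diffusion, which is the kind of recovery the abstract describes. A real system would use sparse matrices and scores from actual retrievers rather than the hand-set values here.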


