1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.IR (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
M3DocDep: Multi-modal, Multi-page, Multi-document Dependency Chunking with Large Vision-Language Models
M3DocDep is an LVLM pipeline that extracts multimodal block embeddings, scores parent-child edges with a biaffine head, decodes a valid dependency tree via MST, and produces section-path-annotated chunks, yielding reported gains of 28-39% on STEDS and 1-15% on nDCG/ANLS.
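The scoring and path-annotation steps in that summary can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the two-dimensional toy embeddings, and the weight matrix `W` are all hypothetical, and the MST decoder is replaced by a simpler stand-in that assumes every parent precedes its child in reading order. Under that assumption, picking the best-scoring earlier block per child cannot create a cycle, so the result is always a valid tree.

```python
from typing import List

ROOT = -1  # index of a virtual document root (an assumption of this sketch)


def biaffine_score(child: List[float], parent: List[float],
                   W: List[List[float]]) -> float:
    """Score a candidate parent -> child edge as child^T W parent."""
    return sum(child[i] * W[i][j] * parent[j]
               for i in range(len(child))
               for j in range(len(parent)))


def decode_parents(emb: List[List[float]], root: List[float],
                   W: List[List[float]]) -> List[int]:
    """Simplified stand-in for MST decoding: each block takes its
    best-scoring parent among the virtual root and earlier blocks."""
    parents = []
    for i, child in enumerate(emb):
        candidates = [(ROOT, root)] + [(j, emb[j]) for j in range(i)]
        best = max(candidates, key=lambda c: biaffine_score(child, c[1], W))
        parents.append(best[0])
    return parents


def section_paths(titles: List[str], parents: List[int]) -> List[str]:
    """Join the titles of each block's ancestors, outermost first."""
    paths = []
    for i in range(len(titles)):
        chain = []
        node = parents[i]
        while node != ROOT:
            chain.append(titles[node])
            node = parents[node]
        paths.append(" > ".join(reversed(chain)))
    return paths


# Toy features: [heading-ness, body-ness]; values are made up.
titles = ["Intro", "text a", "Methods", "text b"]
emb = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 1.0]]
root = [0.0, 0.0]
# W[i][j] weights child feature i against parent feature j:
# body blocks are drawn to headings, every other pairing is penalized.
W = [[-1.0, -1.0],
     [2.0, -1.0]]

parents = decode_parents(emb, root, W)
paths = section_paths(titles, parents)
```

In this toy case the two heading blocks attach to the virtual root and each body block attaches to the heading before it, so a chunk built from "text b" would carry the section path "Methods". A full decoder would drop the reading-order restriction and run a maximum spanning arborescence algorithm over all edge scores.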