A text-supervised global layout embedding augments local patch representations in late-interaction VDR, yielding +2.4 nDCG@5 and +2.3 MAP@5 gains over ColPali/ColQwen baselines on ViDoRe-v2.
Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
LARGER boosts file localization accuracy for repository-level coding agents by integrating lexically anchored graph expansion directly into standard search loops, yielding gains of up to 13.9 points on LocBench.
Valley3 is an omni MLLM for e-commerce that uses a four-stage pre-training pipeline plus post-training for controllable reasoning and agentic search, outperforming baselines on e-commerce benchmarks while staying competitive on general ones.
citing papers explorer
-
Beyond Bag-of-Patches: Learning Global Layout via Textual Supervision for Late-Interaction Visual Document Retrieval
A text-supervised global layout embedding augments local patch representations in late-interaction VDR, yielding +2.4 nDCG@5 and +2.3 MAP@5 gains over ColPali/ColQwen baselines on ViDoRe-v2.
-
LARGER: Lexically Anchored Repository Graph Exploration and Retrieval
LARGER boosts file localization accuracy for repository-level coding agents by integrating lexically anchored graph expansion directly into standard search loops, yielding gains of up to 13.9 points on LocBench.
-
Valley3: Scaling Omni Foundation Models for E-commerce
Valley3 is an omni MLLM for e-commerce that uses a four-stage pre-training pipeline plus post-training for controllable reasoning and agentic search, outperforming baselines on e-commerce benchmarks while staying competitive on general ones.