arXiv preprint arXiv:2410.08182 · 2024

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

unclear 1

representative citing papers

SVFSearch: A Multimodal Knowledge-Intensive Benchmark for Short-Video Frame Search in the Gaming Vertical Domain

cs.AI · 2026-05-18 · unverdicted · novelty 7.0 · 2 refs

SVFSearch is the first open benchmark for short-video frame search in the Chinese gaming domain, providing a frozen retrieval environment and showing performance gaps of 13-29 points between direct QA models, practical agents, and oracle knowledge.

MemLens: Benchmarking Multimodal Long-Term Memory in Large Vision-Language Models

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

The MemLens benchmark shows that long-context LVLMs lose accuracy as context length grows, while memory agents lose visual fidelity. Multi-session reasoning stays below 30% for most systems, and neither approach solves the task alone.

Utility-Oriented Visual Evidence Selection for Multimodal Retrieval-Augmented Generation

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

Evidence utility is defined as information gain on the model's output distribution. Ranking by gain on a latent helpfulness variable is shown to be equivalent to answer-space utility under mild assumptions, which enables a training-free surrogate framework that outperforms baselines.

VisRet: Visualization Improves Knowledge-Intensive Text-to-Image Retrieval

cs.CV · 2025-05-26 · conditional · novelty 6.0

VisRet improves text-to-image retrieval by generating images from text queries and then retrieving within the image modality, reporting average nDCG@30 gains of 0.125 with CLIP and 0.121 with E5-V across four benchmarks.

CoGR-MoE: Concept-Guided Expert Routing with Consistent Selection and Flexible Reasoning for Visual Question Answering

cs.CV · 2026-04-18 · unverdicted · novelty 5.0

CoGR-MoE improves VQA by using concept-guided expert routing with option feature reweighting and contrastive learning to achieve consistent yet flexible reasoning across answer options.

R3G: A Reasoning-Retrieval-Reranking Framework for Vision-Centric Answer Generation

cs.CV · 2026-01-25 · unreviewed · ref 18
