MRAG-Bench: Vision-centric evaluation for retrieval-augmented multimodal models · 2024 · arXiv:2410.08182

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

unclear 1

representative citing papers

SVFSearch: A Multimodal Knowledge-Intensive Benchmark for Short-Video Frame Search in the Gaming Vertical Domain

cs.AI · 2026-05-18 · unverdicted · novelty 7.0 · 2 refs

SVFSearch is the first open benchmark for short-video frame search in the Chinese gaming domain, providing a frozen retrieval environment and showing performance gaps of 13-29 points between direct QA models, practical agents, and oracle knowledge.

Utility-Oriented Visual Evidence Selection for Multimodal Retrieval-Augmented Generation

cs.CL · 2026-05-13 · unverdicted · novelty 7.0

Evidence utility is defined as information gain on the model's output distribution, with ranking by gain on a latent helpfulness variable shown equivalent to answer-space utility under mild assumptions, enabling a training-free surrogate framework that outperforms baselines.

R3G: A Reasoning–Retrieval–Reranking Framework for Vision-Centric Answer Generation

cs.CV · 2026-01-25 · unverdicted · novelty 6.0

R3G improves vision-centric visual question answering by generating reasoning plans to guide two-stage image retrieval and reranking, achieving state-of-the-art results on MRAG-Bench across six MLLM backbones.

VisRet: Visualization Improves Knowledge-Intensive Text-to-Image Retrieval

cs.CV · 2025-05-26 · conditional · novelty 6.0

VisRet improves text-to-image retrieval by generating images from text queries and then retrieving within the image modality, reporting average nDCG@30 gains of 0.125 with CLIP and 0.121 with E5-V across four benchmarks.

CoGR-MoE: Concept-Guided Expert Routing with Consistent Selection and Flexible Reasoning for Visual Question Answering

cs.CV · 2026-04-18 · unverdicted · novelty 5.0

CoGR-MoE improves VQA by using concept-guided expert routing with option feature reweighting and contrastive learning to achieve consistent yet flexible reasoning across answer options.

citing papers explorer

Showing 5 of 5 citing papers.

SVFSearch: A Multimodal Knowledge-Intensive Benchmark for Short-Video Frame Search in the Gaming Vertical Domain cs.AI · 2026-05-18 · unverdicted · none · ref 53 · 2 links
Utility-Oriented Visual Evidence Selection for Multimodal Retrieval-Augmented Generation cs.CL · 2026-05-13 · unverdicted · none · ref 11
R3G: A Reasoning–Retrieval–Reranking Framework for Vision-Centric Answer Generation cs.CV · 2026-01-25 · unverdicted · none · ref 18
VisRet: Visualization Improves Knowledge-Intensive Text-to-Image Retrieval cs.CV · 2025-05-26 · conditional · none · ref 4
CoGR-MoE: Concept-Guided Expert Routing with Consistent Selection and Flexible Reasoning for Visual Question Answering cs.CV · 2026-04-18 · unverdicted · none · ref 11
