pith. machine review for the scientific record. sign in

arxiv: 2505.22095 · v2 · submitted 2025-05-28 · 💻 cs.CL

Recognition: unknown

Mixture-of-Retrieval Experts for Reasoning-Guided Multimodal Knowledge Exploitation

Authors on Pith no claims yet
classification 💻 cs.CL
keywords expertsknowledgeretrievalmllmsmultimodalcapabilitydiversedynamically
0
0 comments X
read the original abstract

Multimodal Retrieval-Augmented Generation (MRAG) has shown promise in mitigating hallucinations in Multimodal Large Language Models (MLLMs) by incorporating external knowledge. However, existing methods typically adhere to rigid retrieval paradigms by mimicking fixed retrieval trajectories and thus fail to fully exploit the knowledge of different retrieval experts through dynamic interaction based on the model's knowledge needs or evolving reasoning states. To overcome this limitation, we introduce Mixture-of-Retrieval Experts (MoRE), a novel framework that enables MLLMs to collaboratively interact with diverse retrieval experts for more effective knowledge exploitation. Specifically, MoRE learns to dynamically determine which expert to engage with, conditioned on the evolving reasoning state. To effectively train this capability, we propose Stepwise Group Relative Policy Optimization (Step-GRPO), which goes beyond sparse outcome-based supervision by encouraging MLLMs to interact with multiple retrieval experts and synthesize fine-grained rewards, thereby teaching the MLLM to fully coordinate all experts when answering a given query. Experimental results on diverse open-domain QA benchmarks demonstrate the effectiveness of MoRE, achieving average performance gains of over 7% compared to competitive baselines. Notably, MoRE exhibits strong adaptability by dynamically coordinating heterogeneous experts to precisely locate relevant information, validating its capability for robust, reasoning-driven expert collaboration. All codes and data are released on https://github.com/OpenBMB/MoRE.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Self-Induced Outcome Potential: Turn-Level Credit Assignment for Agents without Verifiers

    cs.LG 2026-05 unverdicted novelty 7.0

    SIOP enables turn-level credit assignment in LLM agents via semantic clustering of final answers as latent outcomes, improving performance on reasoning benchmarks without verifiers.

  2. VISOR: Agentic Visual Retrieval-Augmented Generation via Iterative Search and Over-horizon Reasoning

    cs.CV 2026-04 unverdicted novelty 7.0

    VISOR is a unified agentic VRAG framework with Evidence Space structuring, visual action evaluation/correction, and dynamic sliding-window trajectories trained via GRPO-based RL that achieves SOTA performance on long-...

  3. Learning Agent Routing From Early Experience

    cs.CL 2026-05 unverdicted novelty 6.0

    BoundaryRouter routes queries to LLM or agent using early experience memory from a seed set, cutting inference time 60.6% versus always using agents and raising performance 28.6% versus always using direct LLM inference.