Reward-RAG: Enhancing RAG with Reward Driven Supervision
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2025 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
RioRAG uses nugget-centric verification with cross-source checks to create dense verifiable rewards for RL-based optimization of long-form RAG, yielding higher factual recall and faithfulness on LongFact and RAGChecker.
citing papers explorer
- Supervising the search process produces reliable and generalizable information-seeking agents
Process supervision via RAG-Gym produces more reliable and generalizable search agents, with gains driven by higher-quality queries on out-of-domain multi-hop tasks.
- Reinforced Informativeness Optimization for Long-Form Retrieval-Augmented Generation
RioRAG uses nugget-centric verification with cross-source checks to create dense verifiable rewards for RL-based optimization of long-form RAG, yielding higher factual recall and faithfulness on LongFact and RAGChecker.