hub Canonical reference

Corrective Retrieval Augmented Generation

Shi-Qi Yan, Jia-Chen Gu, Yun Zhu, Zhen-Hua Ling · 2024 · cs.CL · arXiv 2401.15884

Canonical reference. 86% of citing Pith papers cite this work as background.

44 Pith papers citing it

Background 86% of classified citations

open full Pith review browse 44 citing papers arXiv PDF

abstract

Large language models (LLMs) inevitably exhibit hallucinations since the accuracy of generated texts cannot be secured solely by the parametric knowledge they encapsulate. Although retrieval-augmented generation (RAG) is a practicable complement to LLMs, it relies heavily on the relevance of retrieved documents, raising concerns about how the model behaves if retrieval goes wrong. To this end, we propose the Corrective Retrieval Augmented Generation (CRAG) to improve the robustness of generation. Specifically, a lightweight retrieval evaluator is designed to assess the overall quality of retrieved documents for a query, returning a confidence degree based on which different knowledge retrieval actions can be triggered. Since retrieval from static and limited corpora can only return sub-optimal documents, large-scale web searches are utilized as an extension for augmenting the retrieval results. Besides, a decompose-then-recompose algorithm is designed for retrieved documents to selectively focus on key information and filter out irrelevant information in them. CRAG is plug-and-play and can be seamlessly coupled with various RAG-based approaches. Experiments on four datasets covering short- and long-form generation tasks show that CRAG can significantly improve the performance of RAG-based approaches.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 6 method 1

citation-polarity summary

background 6 use method 1

representative citing papers

X-SYNTH: Beyond Retrieval -- Enterprise Context Synthesis from Observed Digital Human Attention

cs.AI · 2026-05-15 · unverdicted · novelty 7.0 · 2 refs

X-SYNTH synthesizes enterprise context from digital human attention using Digital Twin Signatures and seven attention filters, raising true lead rate from 9.5% to 61.9% while cutting false lead rate to 18.8%.

EvolveMem:Self-Evolving Memory Architecture via AutoResearch for LLM Agents

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

EvolveMem enables autonomous self-evolution of LLM memory retrieval configurations via LLM diagnosis and safeguards, delivering 25.7% gains over strong baselines on LoCoMo and 18.9% on MemBench with positive cross-benchmark transfer.

Route Before Retrieve: Activating Latent Routing Abilities of LLMs for RAG vs. Long-Context Selection

cs.CL · 2026-05-11 · unverdicted · novelty 7.0 · 2 refs

Pre-Route elicits LLMs' latent routing skills via structured prompts on metadata to proactively choose RAG or long-context, outperforming reactive baselines on cost-effectiveness.

The Context Gathering Decision Process: A POMDP Framework for Agentic Search

cs.AI · 2026-05-07 · accept · novelty 7.0

Framing LLM agent loops as a Context Gathering Decision Process POMDP yields a predicate-based belief state that boosts multi-hop reasoning up to 11.4% and an exhaustion gate that cuts token use up to 39% with no performance loss.

SCOUT: Active Information Foraging for Long-Text Understanding with Decoupled Epistemic States

cs.CL · 2026-05-06 · unverdicted · novelty 7.0

SCOUT achieves state-of-the-art long-text understanding with up to 8x lower token use by actively foraging for sparse query-relevant information and updating a compact provenance-grounded epistemic state.

MemFlow: Intent-Driven Memory Orchestration for Small Language Model Agents

cs.MA · 2026-05-05 · unverdicted · novelty 7.0

MemFlow routes queries by intent to tiered memory operations, nearly doubling accuracy of a 1.7B SLM on long-horizon benchmarks compared to full-context baselines.

AdaGATE: Adaptive Gap-Aware Token-Efficient Evidence Assembly for Multi-Hop Retrieval-Augmented Generation

cs.CL · 2026-05-04 · unverdicted · novelty 7.0

AdaGATE improves evidence F1 scores on HotpotQA for multi-hop RAG under clean, redundant, and noisy conditions by framing selection as gap-aware token-constrained repair, outperforming baselines while using 2.6x fewer tokens.

HaS: Accelerating RAG through Homology-Aware Speculative Retrieval

cs.IR · 2026-04-22 · unverdicted · novelty 7.0

HaS accelerates RAG retrieval via homology-aware speculative retrieval and homologous query re-identification validation, cutting latency 24-37% with 1-2% accuracy drop on tested datasets.

ArbGraph: Conflict-Aware Evidence Arbitration for Reliable Long-Form Retrieval-Augmented Generation

cs.CL · 2026-04-20 · unverdicted · novelty 7.0

ArbGraph resolves conflicts in RAG evidence by constructing a conflict-aware graph of atomic claims and applying intensity-driven iterative arbitration to suppress unreliable claims prior to generation.

IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning

cs.AI · 2026-04-16 · unverdicted · novelty 7.0

IG-Search computes step-level information gain rewards from policy probabilities to improve credit assignment in RL training for search-augmented QA, yielding 1.6-point gains over trajectory-level baselines on multi-hop tasks.

Credo: Declarative Control of LLM Pipelines via Beliefs and Policies

cs.AI · 2026-04-15 · unverdicted · novelty 7.0

Credo proposes representing LLM agent state as beliefs and regulating pipeline behavior with declarative policies stored in a database for adaptive, auditable control.

REGREACT: Self-Correcting Multi-Agent Pipelines for Structured Regulatory Information Extraction

cs.MA · 2026-04-13 · unverdicted · novelty 7.0

RegReAct deploys self-correcting multi-agent pipelines across seven stages to extract hierarchical compliance criteria from regulatory texts, outperforming single-pass GPT-4o on EU Taxonomy documents.

Retrieval Augmented Conversational Recommendation with Reinforcement Learning

cs.IR · 2026-04-06 · unverdicted · novelty 7.0

RAR retrieves candidate items from a 300k-movie corpus then uses LLM generation with RL feedback to produce context-aware recommendations that outperform baselines on benchmarks.

BELIEF: Structured Evidence Modeling and Uncertainty-Aware Fusion for Biomedical Question Answering

cs.CL · 2026-05-17 · unverdicted · novelty 6.0

BELIEF improves closed-set biomedical QA by converting documents to structured evidence objects and fusing D-S symbolic belief estimation with LLM inference through reliability-aware arbitration.

PiCA: Pivot-Based Credit Assignment for Search Agentic Reinforcement Learning

cs.AI · 2026-05-10 · unverdicted · novelty 6.0 · 2 refs

PiCA uses pivot-based potential rewards derived from historical sub-queries to supply trajectory-aware step guidance in agentic RL, delivering 15% gains on QA benchmarks for 3B/7B models.

Agentic Retrieval-Augmented Generation for Financial Document Question Answering

cs.AI · 2026-05-06 · unverdicted · novelty 6.0

FinAgent-RAG achieves 76.81-78.46% execution accuracy on financial QA benchmarks by combining contrastive retrieval, program-of-thought code generation, and adaptive strategy routing, outperforming baselines by 5.62-9.32 points.

CAR: Query-Guided Confidence-Aware Reranking for Retrieval-Augmented Generation

cs.CL · 2026-05-06 · unverdicted · novelty 6.0

CAR reranks documents in RAG by promoting those that increase generator confidence (via answer consistency sampling) and demoting those that decrease it, yielding NDCG@5 gains on BEIR datasets that correlate with F1 improvements.

EviMem: Evidence-Gap-Driven Iterative Retrieval for Long-Term Conversational Memory

cs.CV · 2026-04-30 · unverdicted · novelty 6.0

EviMem improves accuracy on temporal and multi-hop questions in long-term conversational memory by iteratively diagnosing and filling evidence gaps, achieving 81.6% and 85.2% judge accuracy on LoCoMo at 4.5x lower latency than MIRIX.

Faithfulness-QA: A Counterfactual Entity Substitution Dataset for Training Context-Faithful RAG Models

cs.CL · 2026-04-28 · accept · novelty 6.0

Faithfulness-QA is a 99k-sample dataset created via counterfactual entity substitution on existing QA benchmarks to train and evaluate context-faithful RAG models.

Agentic Multi-Source Grounding for Enhanced Query Intent Understanding: A DoorDash Case Study

cs.AI · 2026-03-02 · unverdicted · novelty 6.0

An agentic multi-source grounding system for marketplace query intent achieves 90.7% accuracy on long-tail queries at DoorDash by combining catalog grounding, web search, and deterministic disambiguation, outperforming baselines by up to 13pp.

Towards Effective Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval

cs.CV · 2025-12-09 · unverdicted · novelty 6.0

OneClip-RAG enables MLLMs to handle long videos via one-shot clip retrieval and unified chunking-retrieval, delivering performance gains like matching GPT-5 level on MLVU with high efficiency on standard GPUs.

ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning

cs.AI · 2025-03-25 · unverdicted · novelty 6.0

ReSearch trains LLMs via RL to integrate search operations into reasoning steps, achieving strong generalization across benchmarks and eliciting reflection and self-correction without supervised reasoning data.

Retrieval-Augmented Generation for Natural Language Processing: A Survey

cs.CL · 2024-07-18 · accept · novelty 6.0

The survey organizes RAG methods via a taxonomy of query-based, logits-based, latent, and parametric fusion with comparisons on accessibility, efficiency, applications, and challenges.

CRAFT: Critic-Refined Adaptive Key-Frame Targeting for Multimodal Video Question Answering

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

CRAFT introduces a query-conditioned pipeline with dynamic keyframe selection, ASR, and a hybrid critic loop that achieves top scores on MAGMaR 2026 for grounded multi-video question answering.

citing papers explorer

Showing 6 of 6 citing papers after filters.

MemFlow: Intent-Driven Memory Orchestration for Small Language Model Agents cs.MA · 2026-05-05 · unverdicted · none · ref 45 · internal anchor
MemFlow routes queries by intent to tiered memory operations, nearly doubling accuracy of a 1.7B SLM on long-horizon benchmarks compared to full-context baselines.
PiCA: Pivot-Based Credit Assignment for Search Agentic Reinforcement Learning cs.AI · 2026-05-10 · unverdicted · none · ref 45 · 2 links · internal anchor
PiCA uses pivot-based potential rewards derived from historical sub-queries to supply trajectory-aware step guidance in agentic RL, delivering 15% gains on QA benchmarks for 3B/7B models.
Retrieval-Augmented Generation for Natural Language Processing: A Survey cs.CL · 2024-07-18 · accept · none · ref 189 · internal anchor
The survey organizes RAG methods via a taxonomy of query-based, logits-based, latent, and parametric fusion with comparisons on accessibility, efficiency, applications, and challenges.
Retrieval-Augmented Generation for AI-Generated Content: A Survey cs.CV · 2024-02-29 · accept · none · ref 146 · internal anchor
A survey classifying RAG foundations for AIGC, summarizing enhancements, cross-modal applications, benchmarks, limitations, and future directions.
Token Economics for LLM Agents: A Dual-View Study from Computing and Economics cs.AI · 2026-05-09 · unverdicted · none · ref 77 · internal anchor
The paper delivers a unified survey of token economics for LLM agents, conceptualizing tokens as production factors, exchange mediums, and units of account across micro, meso, macro, and security dimensions using established economic theories.
Retrieval-Augmented Generation for Large Language Models: A Survey cs.CL · 2023-12-18 · unverdicted · none · ref 67 · internal anchor
A survey of RAG paradigms, components, benchmarks, and challenges for improving LLMs on knowledge-intensive tasks.

Corrective Retrieval Augmented Generation

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer