Does RAG Know When Retrieval Is Wrong? Diagnosing Context Compliance under Knowledge Conflict
Pith reviewed 2026-05-19 16:33 UTC · model grok-4.3
The pith
Context-Driven Decomposition diagnoses when RAG follows conflicting retrieved context instead of its own knowledge and intervenes to improve robustness.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that the Context-Compliance Regime occurs when retrieved context dominates the final answer even under direct conflict with parametric knowledge, and shows that Context-Driven Decomposition serves as an inference-time belief-decomposition probe and intervention mechanism. On TruthfulQA misconception injection, standard RAG reaches only 15.0 percent accuracy; CDD lifts results on temporal shifts to 71.3 percent and on distractor evidence to 69.9 percent. Accuracy gains transfer to Gemini-2.5-Flash and Claude variants, but explicit rationale-answer causal sensitivity transfers only partially, indicating that context compliance forms a measurable axis distinct from single-
What carries the argument
Context-Driven Decomposition (CDD), an inference-time belief-decomposition probe that isolates causal influence of context versus parametric knowledge and enables controlled intervention on retrieval conflicts.
If this is right
- Standard RAG reaches only 15.0 percent accuracy when context injects misconceptions on TruthfulQA.
- CDD accuracy gains transfer across Gemini and Claude model families, though causal sensitivity of rationale to answer does not.
- Explicit conflict decomposition raises robustness to 71.3 percent under temporal drift and 69.9 percent with noisy distractors.
- Context compliance operates as a structural axis separate from retrieval quality or single-method robustness.
Where Pith is reading between the lines
- CDD-style checks could be inserted into existing RAG pipelines as a lightweight conflict detector before final generation.
- Model-family differences in causal coupling suggest that internal architecture affects how conflicts are resolved beyond surface accuracy.
- The Epi-Scale benchmark release could encourage systematic comparison of conflict-handling methods across retrieval pipelines.
- Similar decomposition approaches might extend to other generation tasks that combine external documents with pretrained knowledge.
Load-bearing premise
The CDD decomposition performed at inference time accurately isolates the causal contribution of context versus parametric knowledge without introducing its own artifacts or requiring model-specific tuning.
What would settle it
Measure whether turning CDD on or off on a fixed set of conflicting question-context pairs produces answer changes that match the decomposed context and parametric scores in a held-out test set.
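One way to make this check concrete is sketched below. It is an illustration only, not the paper's protocol: the `Trial` record, the score fields, and the rule "a change is predicted when the parametric score exceeds the context score" are all assumptions introduced here. The rule also assumes that, with CDD off, the model follows the retrieved context.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One conflicting question-context pair (hypothetical record layout)."""
    answer_off: str          # answer with CDD disabled
    answer_on: str           # answer with CDD enabled
    context_score: float     # assumed decomposed weight on retrieved context
    parametric_score: float  # assumed decomposed weight on parametric knowledge


def normalize(answer: str) -> str:
    """Crude answer normalization; a real evaluation would need something stricter."""
    return answer.strip().lower()


def match_rate(trials: list[Trial]) -> float:
    """Fraction of trials where the observed answer change agrees with the
    change predicted from the decomposed scores."""
    if not trials:
        raise ValueError("need at least one trial")
    hits = 0
    for t in trials:
        observed_change = normalize(t.answer_on) != normalize(t.answer_off)
        # Assumption: CDD should override the context only when the
        # parametric score is the larger of the two.
        predicted_change = t.parametric_score > t.context_score
        hits += observed_change == predicted_change
    return hits / len(trials)
```

A match rate near chance on a held-out set would suggest the decomposed scores do not track what actually drives the answer.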
Original abstract
The Context-Compliance Regime in Retrieval-Augmented Generation (RAG) occurs when retrieved context dominates the final answer even when it conflicts with the model's parametric knowledge. Accuracy alone does not reveal how retrieved context causally shapes answers under such conflict. We introduce Context-Driven Decomposition (CDD), a belief-decomposition probe that operates at inference time and serves as an intervention mechanism for controlled retrieval conflict. Across Epi-Scale stress tests, TruthfulQA misconception injection, and cross-model reruns, CDD exposes three patterns. P1: context compliance is measurable in an upper-bound adversarial setting, where Standard RAG reaches 15.0% accuracy on TruthfulQA misconception injection (N=500). P2: adversarial accuracy gains transfer across model families -- CDD improves accuracy on Gemini-2.5-Flash and on Claude Haiku/Sonnet/Opus -- but rationale-answer causal coupling does not transfer. CDD reaches 64.1% mistake-injection causal sensitivity on Gemini-2.5-Flash, while sensitivities for all three Claude variants fall in the [-3%, +7%] range, suggesting that the Claude-side accuracy gains operate through a mechanism distinct from the explicit conflict-resolution trace. P3: explicit conflict decomposition improves robustness under temporal drift and noisy distractors, with CDD reaching 71.3% on temporal shifts and 69.9% on distractor evidence on the full Epi-Scale adversarial benchmark. These three patterns identify context-compliance as a structural axis along which standard RAG can be probed and intervened on, distinct from retrieval-quality or single-method robustness questions, and motivate releasing Epi-Scale for systematic study across model families and retrieval pipelines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Context-Driven Decomposition (CDD), an inference-time belief-decomposition probe and intervention for RAG systems. It diagnoses context compliance when retrieved context conflicts with parametric knowledge, using experiments on TruthfulQA misconception injection (N=500) and the Epi-Scale adversarial benchmark across model families. The central claims are three patterns: standard RAG shows low adversarial accuracy (15.0%); CDD yields transferable accuracy gains but non-transferable rationale-answer causal coupling (64.1% sensitivity on Gemini-2.5-Flash vs. [-3%, +7%] on Claude variants); and CDD improves robustness to temporal drift (71.3%) and noisy distractors (69.9%).
Significance. If the CDD probe validly isolates causal context influence without introducing model-specific artifacts, the work identifies context compliance as a distinct structural axis in RAG separate from retrieval quality or single-method robustness. The cross-family transfer analysis and release of Epi-Scale for systematic study would be useful contributions to diagnosing and mitigating knowledge conflicts in retrieval-augmented systems.
major comments (2)
- [§3 (CDD method and inference-time decomposition)] The claim that CDD accurately isolates the causal influence of context versus parametric knowledge (central to P2) is load-bearing but under-supported. The large divergence in causal sensitivity (64.1% on Gemini-2.5-Flash vs. near zero for all Claude variants), alongside accuracy gains that do transfer, is consistent with the probe interacting differently with each model's instruction-following or rationale behavior. No ablations of the decomposition step itself, fixed prompt templates, or identical intervention formats are described to rule out artifacts.
- [§5 (Epi-Scale stress tests and results)] The 71.3% on temporal shifts and 69.9% on distractor evidence are presented as robustness gains, but the results section does not describe how temporal drift is constructed, report the statistical significance of improvements over baselines, or control for post-hoc benchmark choices. It is therefore unclear whether these numbers establish the claimed improvement under drift and noise.
minor comments (2)
- [Abstract] The abstract and §4 could separate the accuracy metric from the causal sensitivity metric more explicitly when describing P2, so that readers do not conflate transferable performance with transferable causal structure.
- [§3] The manuscript would benefit from a clearer operational definition or pseudocode for the CDD intervention format to allow replication across model families.
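To illustrate why the two metrics in the first minor comment can diverge, the sketch below computes them separately. The definitions are assumptions made for illustration, not the paper's: accuracy is exact-match rate against gold answers, and causal sensitivity is taken as the answer flip rate after a mistake is injected into the rationale minus the flip rate under a control rerun. A difference of rates is one reading that can produce negative values such as those reported for the Claude variants, but the paper's exact formula is not given in this page.

```python
def accuracy(predictions: list[str], gold: list[str]) -> float:
    """Exact-match accuracy (assumed metric definition)."""
    if not gold or len(predictions) != len(gold):
        raise ValueError("predictions and gold must be non-empty and aligned")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def flip_rate(original: list[str], perturbed: list[str]) -> float:
    """Fraction of items whose final answer changed after a perturbation."""
    if not original or len(original) != len(perturbed):
        raise ValueError("answer lists must be non-empty and aligned")
    return sum(o != p for o, p in zip(original, perturbed)) / len(original)


def causal_sensitivity(
    original: list[str],
    after_mistake: list[str],
    after_control: list[str],
) -> float:
    """Assumed definition: flips caused by an injected rationale mistake,
    net of flips that occur anyway under a control rerun."""
    return flip_rate(original, after_mistake) - flip_rate(original, after_control)
```

Under these definitions a model can score high on `accuracy` while `causal_sensitivity` stays near zero, which is the pattern the report asks the authors to keep distinct.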
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback. The comments identify key areas where additional evidence and clarification can strengthen the claims about the CDD probe's ability to isolate causal context influence and the robustness results on Epi-Scale. We address each major comment below, providing our strongest honest defense while noting planned revisions.
Point-by-point responses
- Referee: [§3 (CDD method and inference-time decomposition)] The claim that CDD accurately isolates the causal influence of context versus parametric knowledge (central to P2) is load-bearing but under-supported. The large divergence in causal sensitivity (64.1% on Gemini-2.5-Flash vs. near zero for all Claude variants), alongside accuracy gains that do transfer, is consistent with the probe interacting differently with each model's instruction-following or rationale behavior. No ablations of the decomposition step itself, fixed prompt templates, or identical intervention formats are described to rule out artifacts.
Authors: The observed divergence in causal sensitivity is not presented as a flaw but as a central finding supporting P2: accuracy gains transfer across families while rationale-answer causal coupling does not, indicating that context compliance can be resolved through distinct mechanisms depending on model architecture. This pattern is consistent with the paper's argument that context compliance is a structural axis separate from general instruction-following. That said, we acknowledge the concern that unablated prompt or intervention choices could introduce model-specific artifacts. In the revision we will add explicit ablations of the decomposition step, including alternative prompt templates and matched intervention formats across models, to more directly test whether the sensitivity differences persist independently of these factors. revision: yes
- Referee: [§5 (Epi-Scale stress tests and results)] The 71.3% on temporal shifts and 69.9% on distractor evidence are presented as robustness gains, but the results section does not describe how temporal drift is constructed, report the statistical significance of improvements over baselines, or control for post-hoc benchmark choices. It is therefore unclear whether these numbers establish the claimed improvement under drift and noise.
Authors: We agree that the current presentation leaves the construction details and statistical grounding implicit. The reported figures reflect CDD's performance on the full Epi-Scale adversarial benchmark under the described stress conditions, but additional documentation is required to allow readers to evaluate the improvements. In the revised manuscript we will expand the Epi-Scale section with a precise description of temporal-drift construction, report statistical significance of gains relative to the standard-RAG baseline, and document the controls applied to benchmark selection and evaluation order. revision: yes
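As a small illustration of the statistical grounding under discussion, the sketch below puts a Wilson score interval around a reported accuracy. It uses the one figure on this page that comes with a sample size, the 15.0% Standard RAG accuracy on TruthfulQA misconception injection (N=500), and assumes that rate corresponds to exactly 75 correct answers. The function name and the choice of the Wilson interval are illustrative assumptions, not the authors' planned analysis.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Assumption: 15.0% of N=500 corresponds to 75 correct answers.
low, high = wilson_interval(75, 500)
print(f"95% Wilson interval: [{low:.3f}, {high:.3f}]")
```

The interval here spans roughly 0.12 to 0.18. The same calculation cannot be repeated for the 71.3% and 69.9% Epi-Scale figures from this page alone, because their sample sizes and baseline values are not stated.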
Circularity Check
No significant circularity detected
Full rationale
The paper introduces Context-Driven Decomposition (CDD) as a new inference-time belief-decomposition probe and reports three empirical patterns (P1–P3) from direct measurements on external benchmarks including TruthfulQA misconception injection (15.0% accuracy) and Epi-Scale (71.3% on temporal shifts). These outcomes are obtained via accuracy, sensitivity, and robustness metrics across model families without any equations, fitted parameters, or self-citations that reduce the central claims to quantities defined by construction within the paper. The derivation chain is therefore self-contained and relies on observable experimental results rather than internal redefinitions or imported ansatzes.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Accuracy and causal sensitivity metrics on injected misconceptions accurately reflect context compliance behavior.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
We introduce Context-Driven Decomposition (CDD), a belief-decomposition probe that operates at inference time... five-step reasoning trace (Step 1: Contextual Extraction... Step 5: Resolution)
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
Mistake-injection causal sensitivity... 64.1% on Gemini-2.5-Flash... [-3%, +7%] on Claude variants
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.