RADAR: Defending RAG Dynamically against Retrieval Corruption

Caifeng Shan; Jing Dong; Tieniu Tan; Weixiang Han; Yi Liu; Yueming Lyu; Ziyuan Chen

arxiv: 2605.22041 · v1 · pith:4I4UGPDYnew · submitted 2026-05-21 · 💻 cs.CR · cs.LG


Pith reviewed 2026-05-22 05:53 UTC · model grok-4.3


keywords RAG defense · retrieval corruption · dynamic defense · Max-Flow Min-Cut · Bayesian updates · adversarial attacks · context selection · energy minimization


The pith

RADAR defends RAG systems in dynamic web environments by modeling context selection as graph-based energy minimization solved via Max-Flow Min-Cut and using recursive Bayesian belief updates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes RADAR to address the vulnerability of RAG systems to adversarial attacks that worsen with temporal volatility in web search. Existing static defenses either fail to adapt to evolving threats or require prohibitive storage for historical data. RADAR formulates reliable context selection as a graph-based energy minimization problem solved exactly with the Max-Flow Min-Cut algorithm. A Bayesian memory node enables recursive updates to a belief state rather than archiving raw documents, keeping storage low while balancing attack resistance with adaptability to genuine knowledge shifts. Experiments on a novel dynamic dataset show superior robustness and response quality compared to baselines.

Core claim

RADAR models reliable context selection as a graph-based energy minimization problem, solved exactly via Max-Flow Min-Cut. By incorporating a Bayesian memory node, RADAR recursively updates a belief state instead of archiving raw historical documents, effectively balancing stability against attacks with adaptability to genuine knowledge shifts.

What carries the argument

Graph-based energy minimization solved exactly via Max-Flow Min-Cut, combined with a Bayesian memory node for recursive belief updates.
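
As a rough illustration of how a binary energy of this kind can be minimized with an s-t cut, here is a minimal sketch. It assumes the energy has the form E(y) = Σ_i [y_i·F_i + (1−y_i)·S_i] + Σ_{i<j} M_ij·|y_i − y_j| with non-negative weights. The function names `min_cut_labels` and `energy`, the convention that y_i = 1 means "source side", and all numeric values are hypothetical. A plain Edmonds-Karp solver is used for brevity; it stands in for whatever max-flow solver the authors actually use.

```python
from collections import deque
from itertools import product


def energy(y, F, S, M):
    """E(y) = sum_i [y_i*F_i + (1-y_i)*S_i] + sum_{i<j} M[i][j]*|y_i - y_j|."""
    n = len(y)
    unary = sum(F[i] if y[i] else S[i] for i in range(n))
    pair = sum(M[i][j] * abs(y[i] - y[j])
               for i in range(n) for j in range(i + 1, n))
    return unary + pair


def min_cut_labels(F, S, M):
    """Minimise the energy over y in {0,1}^n via max-flow (Edmonds-Karp).

    Assumed convention: nodes left on the source side get y_i = 1.
    Returns (labels, min_energy).
    """
    n = len(F)
    s, t = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[s][i] = S[i]  # paid when node i ends on the sink side (y_i = 0)
        cap[i][t] = F[i]  # paid when node i ends on the source side (y_i = 1)
        for j in range(i + 1, n):
            cap[i][j] = cap[j][i] = M[i][j]  # paid when labels differ
    flow = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break  # no augmenting path left: the flow is maximal
        # Find the bottleneck capacity along the path, then push it.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck
    # After the final failed search, `parent` marks nodes reachable from s.
    labels = [1 if parent[i] != -1 else 0 for i in range(n)]
    return labels, flow


# Hypothetical toy instance with three documents.
F = [0.2, 0.3, 0.9]
S = [0.8, 0.7, 0.1]
M = [[0.0, 0.5, 0.1],
     [0.5, 0.0, 0.1],
     [0.1, 0.1, 0.0]]
labels, value = min_cut_labels(F, S, M)
brute_force = min(energy(y, F, S, M) for y in product([0, 1], repeat=3))
print(labels, value, brute_force)
```

The brute-force minimum is included only as a sanity check on small instances; it enumerates all 2^n labelings and is not part of the technique.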

If this is right

RAG systems can resist evolving attacks in changing environments without high storage costs by avoiding raw document archives.
Context selection can be optimized exactly using Max-Flow Min-Cut on a graph model of reliability.
Bayesian recursive updates allow adaptation to real knowledge changes while maintaining stability.
Response quality improves when attack resistance is paired with low-overhead belief tracking.
Dynamic datasets can expose limitations in static defenses that this approach addresses.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The combination of graph optimization and probabilistic memory could extend to other retrieval-based generative systems facing real-time data volatility.
This balance between security and adaptability might apply to long-running AI agents that must handle both threats and fact updates.
Approximations to the exact Max-Flow Min-Cut solver could be tested for scaling to very large context graphs.
Layering this method with existing security techniques might create more comprehensive protections for deployed RAG applications.

Load-bearing premise

That modeling reliable context selection as a graph-based energy minimization problem solved exactly via Max-Flow Min-Cut, combined with recursive Bayesian belief updates, will correctly balance attack resistance against genuine knowledge shifts in dynamic web environments.

What would settle it

An experiment on the dynamic dataset where RADAR fails to show better robustness or response quality than baselines or incurs higher storage overhead than claimed.

Figures

Figures reproduced from arXiv: 2605.22041 by Caifeng Shan, Jing Dong, Tieniu Tan, Weixiang Han, Yi Liu, Yueming Lyu, Ziyuan Chen.

**Figure 1.** Overview of RADAR. It generates an atomic answer for each retrieved document, scores entailment and contradiction using an NLI model, and applies an s-t Min-Cut to select a consistent, reliable subset for final answer generation. The dynamic variant augments the graph with a memory node to balance stability and plasticity across time steps.

**Figure 2.** Sensitivity of the post-processing threshold λ.

Original abstract

While RAG systems are increasingly deployed in dynamic web search, temporal volatility amplifies their vulnerability to adversarial attacks. Existing static-oriented defenses struggle to handle evolving threats and incur prohibitive storage costs in dynamic settings. We propose RADAR, a framework that models reliable context selection as a graph-based energy minimization problem, solved exactly via Max-Flow Min-Cut. By incorporating a Bayesian memory node, RADAR recursively updates a belief state instead of archiving raw historical documents, effectively balancing stability against attacks with adaptability to genuine knowledge shifts. Experiments on a novel dynamic dataset show that RADAR achieves superior robustness and response quality with minimal storage overhead compared to the baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

RADAR frames RAG defense as exact max-flow min-cut on a graph plus recursive Bayesian belief updates to cut storage while handling dynamic attacks, but the abstract supplies no numbers or model specifics to check if it works.


RADAR's main contribution is framing dynamic RAG defense as a graph energy minimization problem solved exactly via max-flow min-cut, combined with a Bayesian memory node that updates beliefs recursively instead of archiving raw documents. This targets the vulnerability in deployed RAG systems where temporal changes open up attack surfaces without ballooning storage needs. The paper does well in identifying why static defenses fall short in evolving settings and proposes a modeling choice that could be more efficient. Building a new dynamic dataset for testing is a solid step, and the focus on balancing attack resistance with adaptability to genuine shifts is a practical concern. The approach earns credit for avoiding high storage by using belief states rather than full history. If the likelihood model in the Bayesian updates works as intended, it might offer a clean separation in the min-cut selection.

However, without metrics, baseline specifics, or implementation details in the provided abstract, it's hard to verify the experimental superiority. The central claim rests on the energy function and recursive updates correctly handling noisy observations, but the abstract doesn't show how the model encodes the difference between adversarial corruption and legitimate web evolution. That distinction is key, and the stress-test concern about belief drift or over-stabilization is reasonable. The paper needs to demonstrate that its formulation avoids these pitfalls in practice.

This paper is for researchers and engineers working on secure and reliable RAG pipelines in dynamic environments like web search. Someone looking for graph-based or Bayesian techniques in AI security could find the framework worth examining or extending. I think it merits peer review to allow full scrutiny of the math, the dataset, and the results.

Referee Report

2 major / 1 minor

Summary. The paper proposes RADAR, a framework for defending RAG systems against retrieval corruption in dynamic web environments. It models reliable context selection as a graph-based energy minimization problem solved exactly via Max-Flow Min-Cut, incorporates a Bayesian memory node for recursive belief-state updates instead of archiving raw documents, and claims this balances attack resistance with adaptability to genuine knowledge shifts. Experiments on a novel dynamic dataset reportedly demonstrate superior robustness, response quality, and minimal storage overhead relative to baselines.

Significance. If the core modeling and experimental claims hold, the work could meaningfully advance defenses for deployed RAG systems facing temporal volatility and evolving threats, offering an exact optimization approach with low storage costs that static defenses lack. The use of max-flow min-cut for exact solvability and the memory-efficient Bayesian node are notable technical strengths.

major comments (2)

[§3] §3 (Energy Function and Min-Cut Formulation): The energy function and likelihood model treat retrievals as noisy observations of underlying truth but contain no explicit term or mechanism to distinguish adversarial corruptions from legitimate knowledge shifts. This is load-bearing for the central claim of correctly trading off attack resistance against adaptability in dynamic settings; without it, the min-cut may accept corrupted contexts when beliefs drift or reject valid updates when beliefs are over-stabilized.
[§4] §4 (Bayesian Memory Node and Recursive Updates): The recursive Bayesian update rules assume a likelihood that can separate change sources, yet the construction provides no change-source modeling. This directly affects whether the claimed balance is achieved, as noted in the abstract's description of handling temporal volatility.

minor comments (1)

[Abstract and §5] Abstract and §5 (Experiments): While the abstract summarizes superiority on a novel dynamic dataset, the experimental section should explicitly report all metrics, baseline implementations, dataset construction details (including how attacks and genuine shifts are simulated), and statistical significance to support reproducibility and verification of the robustness claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which identify an important modeling consideration for our central claims. We respond to each major comment below and indicate where revisions will be made to strengthen the presentation.


Referee: [§3] §3 (Energy Function and Min-Cut Formulation): The energy function and likelihood model treat retrievals as noisy observations of underlying truth but contain no explicit term or mechanism to distinguish adversarial corruptions from legitimate knowledge shifts. This is load-bearing for the central claim of correctly trading off attack resistance against adaptability in dynamic settings; without it, the min-cut may accept corrupted contexts when beliefs drift or reject valid updates when beliefs are over-stabilized.

Authors: We agree that the energy function as formulated does not contain an explicit term that directly labels or separates adversarial corruptions from legitimate knowledge shifts. The distinction is instead realized implicitly: the Bayesian memory node maintains a belief state that serves as the reference for consistency; retrievals that deviate from this state (whether transient corruption or persistent shift) raise the associated energy. The exact min-cut then selects the lowest-cost partition, which in practice rejects isolated corruptions while permitting the belief to evolve when shifts persist across multiple updates. We will revise §3 to add an explicit paragraph describing this implicit separation mechanism and its relation to the claimed tradeoff.

Revision: yes

Referee: [§4] §4 (Bayesian Memory Node and Recursive Updates): The recursive Bayesian update rules assume a likelihood that can separate change sources, yet the construction provides no change-source modeling. This directly affects whether the claimed balance is achieved, as noted in the abstract's description of handling temporal volatility.

Authors: The likelihood used in the recursive updates models all observations as noisy relative to the current belief; source separation is not modeled explicitly but emerges from the temporal dynamics. Persistent genuine shifts produce consistent updates that gradually revise the belief state, whereas adversarial corruptions tend to be inconsistent with the evolving belief and are therefore penalized by the subsequent min-cut. We acknowledge that an explicit change-source variable could make the separation more transparent. We will therefore add a short limitations paragraph in §4 discussing this design choice and how the current recursive formulation approximates the desired balance.

Revision: partial
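
The temporal argument in this exchange can be made concrete with a small simulation. The sketch below assumes the recursive update takes the form S = π·L / (π·L + (1−π)·(1−L)), where π is the previous belief and L in (0, 1) is a likelihood-style score for the current evidence. The names `bayes_update` and `run`, the starting beliefs, and the likelihood sequences are all hypothetical; this page does not say how the paper computes its likelihood.

```python
def bayes_update(prior, likelihood):
    """One recursive step: posterior belief from prior belief and likelihood.

    Assumes 0 < prior < 1 and 0 < likelihood < 1, so the denominator is
    never zero.
    """
    numerator = prior * likelihood
    return numerator / (numerator + (1.0 - prior) * (1.0 - likelihood))


def run(prior, likelihoods):
    """Apply the update once per time step and record each belief."""
    beliefs = []
    for likelihood in likelihoods:
        prior = bayes_update(prior, likelihood)
        beliefs.append(prior)
    return beliefs


# One contradicting observation followed by supporting ones (isolated corruption).
transient = run(0.9, [0.2, 0.8, 0.8])
# Contradicting observations at every step (persistent shift).
persistent = run(0.9, [0.2, 0.2, 0.2])
# Same persistent shift, but starting from a near-certain belief.
stubborn = run(0.999, [0.2, 0.2, 0.2])

for name, beliefs in [("transient", transient),
                      ("persistent", persistent),
                      ("stubborn", stubborn)]:
    print(name, [round(b, 3) for b in beliefs])
```

Under these assumed numbers, the belief recovers after a single contradicting step and drops below one half when contradiction persists, which matches the authors' description. The third run illustrates the referee's over-stabilization concern: a belief that starts very close to 1 is still above 0.9 after three contradicting steps.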

Circularity Check

0 steps flagged

No circularity in derivation chain


The paper presents RADAR as a modeling framework that casts reliable context selection as a graph-based energy minimization problem solved exactly by Max-Flow Min-Cut and augments it with recursive Bayesian belief updates via a memory node. These are explicit design choices for balancing attack resistance and adaptability rather than any derivation that reduces to its own inputs by construction. No fitted parameters are renamed as predictions, no self-citations serve as load-bearing uniqueness theorems, and no ansatzes are smuggled via prior work. Experimental claims rest on a novel dynamic dataset and baseline comparisons, rendering the derivation self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the domain assumption that context reliability can be captured by an energy function on a graph and that Bayesian updates will distinguish attacks from genuine shifts; no free parameters or invented entities beyond the Bayesian memory node are mentioned.

axioms (1)

Domain assumption: Reliable context selection in dynamic RAG can be modeled as a graph-based energy minimization problem that is solved exactly by Max-Flow Min-Cut.
This modeling choice is the core of the proposed framework as stated in the abstract.
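
For context, a pairwise binary energy can be minimized exactly by an s-t cut when every pairwise term is submodular. The check below is a short worked example, assuming the pairwise term has the form $\psi_p(y_i, y_j) = M_{ij}\,|y_i - y_j|$ and that the weights $M_{ij}$ are non-negative (this page does not state their sign).

```latex
\begin{aligned}
\psi_p(0,0) + \psi_p(1,1) &\le \psi_p(0,1) + \psi_p(1,0) \\
0 + 0 &\le M_{ij} + M_{ij} \\
0 &\le 2\,M_{ij}
\end{aligned}
```

So the condition holds whenever $M_{ij} \ge 0$. A negative weight would break it, and the exactness guarantee of the cut would no longer follow from this argument.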

invented entities (1)

Bayesian memory node (no independent evidence)
purpose: Recursively updates a belief state instead of archiving raw historical documents to balance stability against attacks with adaptability to genuine shifts.
Introduced in the abstract as the mechanism enabling low-storage dynamic operation.


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

$\min_{y} E(y) = \sum_{i} \psi_u(y_i) + \sum_{i,j} \psi_p(y_i, y_j)$ with $\psi_u(y_i) = y_i F_i + (1 - y_i)\,S_i$ and $\psi_p(y_i, y_j) = M_{ij}\,|y_i - y_j|$; solved exactly via Max-Flow Min-Cut
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · unclear

Bayesian memory node: $S_{\text{old}}^{(t)} = \dfrac{\pi_S^{(t-1)}\,L_S^{(t)}}{\pi_S^{(t-1)}\,L_S^{(t)} + \bigl(1 - \pi_S^{(t-1)}\bigr)\bigl(1 - L_S^{(t)}\bigr)}$
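
Dividing this update by its complement gives an odds form that may be easier to read. The step below is plain algebra on the expression quoted above and introduces no new modeling assumption; reading $L_S^{(t)}$ as a per-step likelihood score is this page's wording, not a verified detail of the paper.

```latex
\begin{aligned}
1 - S_{\text{old}}^{(t)}
  &= \frac{\bigl(1 - \pi_S^{(t-1)}\bigr)\bigl(1 - L_S^{(t)}\bigr)}
          {\pi_S^{(t-1)} L_S^{(t)} + \bigl(1 - \pi_S^{(t-1)}\bigr)\bigl(1 - L_S^{(t)}\bigr)} \\[4pt]
\frac{S_{\text{old}}^{(t)}}{1 - S_{\text{old}}^{(t)}}
  &= \frac{\pi_S^{(t-1)}}{1 - \pi_S^{(t-1)}} \cdot \frac{L_S^{(t)}}{1 - L_S^{(t)}}
\end{aligned}
```

In this form each time step multiplies the prior odds by a single factor, so a likelihood above one half raises the belief, a likelihood below one half lowers it, and a likelihood of exactly one half leaves it unchanged.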

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages · 1 internal anchor

[1] Arora, S., Khan, H., Sun, K., Dong, X. L., Choudhary, S., Moon, S., Zhang, X., Sagar, A., Appini, S. T., Patnaik, K., et al. Stream rag: Instant and accurate spoken dialogue systems with streaming tool usage. In arXiv preprint arXiv:2510.02044.

[2] Clop, C. and Teglia, Y. Backdoored retrievers for prompt injection attacks on retrieval augmented generation of large language models. In arXiv preprint arXiv:2410.14479.

[4] DeepSeek-V3 Technical Report. URL https://arxiv.org/abs/2412.19437. Accessed: 2026-01-27. Elias, P., Feinstein, A., and Shannon, C. A note on the maximum flow through a network. IRE Transactions on Information Theory, 2(4):117–119.

[5] Gong, Y., Chen, Z., Chen, M., Yu, F., Lu, W., Wang, X., Liu, X., and Liu, J. Topic-fliprag: Topic-orientated adversarial opinion manipulation attacks to retrieval-augmented generation models. arXiv preprint arXiv:2502.01386.

[6] Hu, H., Jiang, Z., Lyu, Y., Zhang, J., Liu, Y., and Chow, K.-H. Confundo: Learning to generate robust poison for practical rag systems. In arXiv preprint arXiv:2602.06616.

[7] URL https://cdn.openai.com/gpt-4o-system-card.pdf. Accessed: 2026-01-27. Prompt Security. The hidden parrot: Stealthy prompt injection and poisoning in rag systems via vector database embeddings. GitHub repository.

[8] Reddy, R. G., Dixit, T., Qin, J., Qian, C., Lee, D., Han, J., Small, K., Fan, X., Sarikaya, R., and Ji, H. Winell: wikipedia never-ending updating with llm agents. In arXiv preprint arXiv:2508.03728.

[9] Accessed: 2026-01-10. Shen, Z., Imana, B. Y., Wu, T., Xiang, C., Mittal, P., and Korolova, A. Reliabilityrag: Effective and provably robust defense for rag-based web-search. In Proc. of NeurIPS.

[10] URL https://x.ai/news/grok-4-fast. Accessed: 2026-01-27; API model id: grok-4-fast (and variants grok-4-fast-reasoning / grok-4-fast-non-reasoning). Xiang, C., Wu, T., Zhong, Z., Wagner, D., Chen, D., and Mittal, P. Certifiably robust rag against retrieval corruption. In ICML 2024 Next Generation of AI Safety Workshop.

[11] Xue, J., Zheng, M., Hu, Y., Liu, F., Chen, X., and Lou, Q. Badrag: Identifying vulnerabilities in retrieval augmented generation of large language models. In arXiv preprint arXiv:2406.00083.

[12] Zhu, Y. From static to dynamic: A streaming rag approach to real-time knowledge base. In arXiv preprint arXiv:2508.05662.
[13] A. Overview. This appendix provides supplementary technical details, derivations, and experimental results supporting the RADAR method presented in the main paper. The sections are organized as follows: Appendix B derives the submodularity of the proposed energy function and explains its exact...

[14] $= \Theta(k^2) = \Theta(n^2)$ (42). Therefore, the graph is dense. We compute max flow using the highest-label preflow-push (HLPP) algorithm, which is a push-relabel method that always selects an active vertex of maximum height label and discharges it via a sequence of local push and relabel operations. Unlike augmenting-path methods that repeatedly search for full s-t ...

[15] These results demonstrate that RADAR consistently outperforms baselines in terms of accuracy, particularly in high-ranking evidence positions. Furthermore, RADAR shows superior robustness to evolving attacks, outperforming competing methods in both benign and attack scenarios. Our dynamic setting does not inject year-specific context into queries, though r...

[16] Attacking Pos 1 lets the poisoned document dominate centrality; the new correct evidence is filtered out and the memory node preserves the stale answer. Failure Case 2: Query: “Who won the Nobel Peace Prize?” 2021 (ground truth: Maria Ressa and Dmitry Muratov): Many documents describe the winners unclearly, causing some atomic answers to be generated wit...

[17] This shows that answer volatility is substantial, suggesting the dataset better reflects dynamic retrieval settings rather than a mostly static benchmark. P. Examples of Dynamic Dataset. Our Dynamic Dataset contains 500 QA questions, each associated with several different years. For each year, we retrieved the top 50 relevant documents from Google. Using D...

[18] For his time as president-elect, see the presidential ...", "content": "Timeline of the Barack Obama presidency (2015) - Wikipedia..." }, { "title": "Get Ready: President Obama’s 2015 State of the Union Address", "url": "https://obamawhitehouse.archives.gov/blog/2015/01/11/get-ready-president-obamas-2015-state-union-address", "snippet": "On Tuesday, Janu...

[19] These midterm elections occurred during incumbent Republican president Donald Trump’s first ...", "content": "2018 United States elections - Wikipedia..." }, { "title": "President Donald J. Trump Proclaims January 16, 2018, as Religious Freedom Day", "url": "https://trumpwhitehouse.archives.gov/presidential-actions/president-donald-j-trump-proclaims-janu...
