BiRD: A Bidirectional Ranking Defense Mechanism for Retrieval Augmented Generation

Chao Liang; Chengcai Gao; Qiufeng Wang; Xiaochuan Shi; Zhihong Sun

arxiv: 2605.20123 · v1 · pith:OTTCEHA4new · submitted 2026-05-19 · 💻 cs.CR · cs.IR

Pith reviewed 2026-05-20 03:41 UTC · model grok-4.3

keywords retrieval augmented generation · adversarial defense · poisoning attacks · bidirectional ranking · RAG security · ranking structures · adversarial robustness

The pith

BiRD defends RAG by spotting poisoned documents through unusually strong alignment between their backward rankings and the query's forward ranking.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that poisoned documents in retrieval-augmented generation systems display a consistent ranking pattern that benign documents lack, allowing a lightweight defense to filter them without heavy semantic analysis. This matters because existing defenses either demand high computation or lose effectiveness against strong poisoning attacks, which limits reliable use of RAG. By combining forward ranking for content relevance with backward ranking for context consistency, the approach aims to cut attack success while raising task accuracy and keeping average added latency under one second. If the pattern holds, RAG deployments could become safer in practice without trading performance for security.

Core claim

The central claim is that poisoned documents exhibit significantly stronger alignment between their backward rankings and the query's forward ranking than benign ones do. The authors build BiRD on a dual-signal framework that uses forward ranking to judge semantic relevance and backward ranking to measure ranking context consistency, directly addressing the prior focus on content alone. Experiments across three datasets, three retrievers, three LLMs, and two attack scenarios show this reduces PoisonedRAG attack success by up to 54 percent while lifting task accuracy by up to 56 percent with under one second of extra latency on average.

What carries the argument

The bidirectional ranking defense mechanism that pairs forward ranking for semantic content relevance with backward ranking for ranking context consistency.
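One way to picture the two signals is the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the cosine retriever, the Jaccard overlap used as the alignment measure, the function names `rank_by_cosine` and `backward_forward_alignment`, and the hand-built toy vectors are all hypothetical. The forward ranking orders documents by similarity to the query; the backward ranking treats each retrieved document as a query and checks how much of its own neighborhood coincides with the query's top-k.

```python
import numpy as np

def rank_by_cosine(query_vec, doc_matrix):
    """Return document indices ordered from most to least cosine-similar."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    return np.argsort(-(d @ q), kind="stable")

def backward_forward_alignment(query_vec, doc_matrix, k=4):
    """Jaccard overlap between each top-k document's own nearest neighbors
    (backward ranking) and the rest of the query's top-k (forward ranking).
    Requires k >= 2 so that both sets are non-empty."""
    forward_top = rank_by_cosine(query_vec, doc_matrix)[:k].tolist()
    scores = {}
    for doc_id in forward_top:
        # Forward context: the other documents retrieved for the query.
        others = set(forward_top) - {doc_id}
        # Backward ranking: treat the document itself as the query.
        backward = rank_by_cosine(doc_matrix[doc_id], doc_matrix).tolist()
        neighbors = set([j for j in backward if j != doc_id][:k - 1])
        scores[doc_id] = len(others & neighbors) / len(others | neighbors)
    return scores

# Hand-built toy corpus (illustrative only, not data from the paper).
toy_query = np.array([1.0, 0.0, 0.0])
toy_docs = np.array([
    [1.0, 0.05, 0.0],   # 0: "poisoned", crafted to sit next to the query
    [1.0, 0.0, 0.05],   # 1: "poisoned"
    [1.0, 0.05, 0.05],  # 2: "poisoned"
    [0.8, 0.6, 0.0],    # 3: benign and relevant
    [0.6, 0.8, 0.0],    # 4: benign, same topic as 3
    [0.5, 0.85, 0.1],   # 5: benign, same topic as 3
    [0.0, 0.0, 1.0],    # 6: unrelated
    [0.0, 1.0, 0.0],    # 7: unrelated
])
toy_scores = backward_forward_alignment(toy_query, toy_docs, k=4)
```

On this toy corpus the three crafted documents score 1.0 and the benign document scores 0.2, because the crafted documents cluster around the query and around each other while the benign document's neighbors are mostly other on-topic passages outside the query's top-k. Whether real poisoned documents behave this way is exactly the empirical claim the paper has to support.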

Load-bearing premise

Poisoned documents will reliably show stronger alignment between their backward rankings and the query's forward ranking across varied datasets, retrievers, models, and attacks.

What would settle it

A test on a fresh dataset or attack method where poisoned documents no longer display the claimed stronger backward-forward alignment would falsify the central pattern.
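Such a check could be framed as a two-sample comparison of alignment scores. The sketch below uses a one-sided permutation test on the difference in means; the choice of test, the function name `permutation_p_value`, and the example scores are assumptions made for illustration and do not come from the paper.

```python
import random

def permutation_p_value(poisoned, benign, n_permutations=2000, seed=0):
    """One-sided p-value for the hypothesis mean(poisoned) > mean(benign)."""
    rng = random.Random(seed)
    n = len(poisoned)
    observed = sum(poisoned) / n - sum(benign) / len(benign)
    pooled = list(poisoned) + list(benign)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the group labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff = sum(pooled[:n]) / n - sum(pooled[n:]) / (len(pooled) - n)
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical alignment scores, for illustration only.
separated_p = permutation_p_value(
    [0.90, 0.80, 0.85, 0.95, 0.90, 0.88],
    [0.20, 0.30, 0.25, 0.10, 0.15, 0.30],
)
identical_p = permutation_p_value(
    [0.50, 0.60, 0.40, 0.55],
    [0.50, 0.60, 0.40, 0.55],
)
```

A small p-value on a new dataset or attack would be consistent with the claimed pattern; a large one, as in the second call where both groups have the same scores, would count against it.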

Figures

Figures reproduced from arXiv: 2605.20123 by Chao Liang, Chengcai Gao, Qiufeng Wang, Xiaochuan Shi, Zhihong Sun.

**Figure 2.** Rank Position Poison Frequency Statistical

**Figure 3.** Rank position poison frequency comparison: (Top) different retrievers on HotpotQA; (Middle) Contriever across ...

**Figure 4.** t-SNE visualization of textual embeddings for be...

**Figure 5.** Overview of BiRD method. The process is mainly divided into three stages: forward retrieval, backward retrieval, and ...

**Figure 6.** The variation of ASR and ACC across different datasets in relation to the parameter ...

**Figure 7.** ASR and ACC versus the filtering threshold

**Figure 9.** Comprehensive comparison of rank position poisoned frequency heatmaps across three datasets (rows) and three ...

**Figure 10.** Rank position poison frequency for Topic-Flip attacks: PRO strategy (top row) and CON strategy (bottom row) across ...

Abstract

The growing adoption of Retrieval-Augmented Generation (RAG) has led to a rise in adversarial attacks. Existing defenses, relying on semantic analysis or voting, face a trade-off between high computational cost and limited robustness under strong poisoning attacks. Their fundamental limitation is the exclusive focus on semantic content relevance, while neglecting the retrieval context that is critically defined by ranking structures. To this end, we investigate the bidirectional ranking behavior of poisoned and benign documents, and discover a key discriminative pattern: poisoned documents exhibit significantly stronger alignment between their backward rankings and the query's forward ranking. Capitalizing on this, we propose BiRD, a bidirectional ranking defense mechanism built upon a dual-signal framework that leverages forward ranking to assess semantic content relevance and backward ranking to quantify ranking context consistency. This design directly addresses the fundamental limitation of prior approaches, enabling simultaneous efficiency and robustness. Extensive evaluation across 3 datasets with 3 retrievers and 3 LLMs under 2 attack scenarios validates BiRD's effectiveness. Notably, BiRD reduces the attack success rate of PoisonedRAG by up to 54% while simultaneously improving task accuracy by up to 56%, with average additional latency under 1 second.
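The two headline metrics can be made concrete with a small sketch. It assumes that attack success rate (ASR) is the fraction of responses containing the attacker's target answer and that accuracy (ACC) is the fraction containing the correct answer, both judged by case-insensitive substring match. The paper's exact scoring rule may differ, and the helper names and example strings are hypothetical.

```python
def contains_answer(response: str, answer: str) -> bool:
    """Case-insensitive substring match between an answer and a response."""
    return answer.strip().lower() in response.lower()

def match_rate(responses, answers):
    """Fraction of responses that contain the paired answer string."""
    hits = sum(contains_answer(r, a) for r, a in zip(responses, answers))
    return hits / len(responses)

# Hypothetical model outputs for four poisoned queries.
responses = [
    "The capital is Paris.",
    "It was written by Jane Austen.",
    "The answer is 1969.",
    "I am not sure.",
]
correct_answers = ["Paris", "Charlotte Bronte", "1969", "Everest"]
target_answers = ["Lyon", "Jane Austen", "1959", "K2"]

acc = match_rate(responses, correct_answers)  # task accuracy
asr = match_rate(responses, target_answers)   # attack success rate
```

Here `acc` is 0.5 and `asr` is 0.25. Note that "up to 54%" in the abstract could denote either percentage points or a relative change; the abstract alone does not say which.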

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

BiRD spots a bidirectional ranking alignment pattern in poisoned RAG documents and turns it into a low-overhead dual-signal defense that reports solid gains on tested attacks.

The key takeaway is that this paper identifies a bidirectional ranking alignment pattern that distinguishes poisoned documents from benign ones in RAG systems and builds a simple dual-signal defense around it. Forward ranking handles semantic relevance while backward ranking checks consistency in the ranking context, which directly targets a gap in prior semantic-only or voting defenses. The approach is straightforward and avoids heavy computation, which fits real deployment needs.

They test across three datasets, three retrievers, and three LLMs under two attack scenarios, and the numbers show up to 54% lower attack success on PoisonedRAG plus up to 56% better task accuracy with under one second added latency on average. That combination of robustness and speed is the practical strength here. The evaluation coverage gives a reasonable picture of behavior under the conditions they chose.

Where the case is weaker is in how general the alignment pattern really is. It might be tied closely to the way PoisonedRAG constructs poisoned documents or to the specific retrievers and scoring methods used in the tests. If the signal weakens with adapted attacks or different embedding models, the reported gains could shrink. The summary does not include variance numbers, statistical tests, or ablations that isolate the backward-ranking component, so it is hard to tell how much that signal drives the results versus incidental filtering.

This paper is for researchers and engineers working on security for deployed RAG systems. Anyone looking at lightweight ways to harden retrieval against poisoning would find the core idea and efficiency results useful. It engages honestly with the limitations of existing methods and presents a new angle worth checking. I would send this to peer review.

Referee Report

3 major / 2 minor

Summary. The paper claims to have identified a key discriminative pattern in RAG systems: poisoned documents exhibit significantly stronger alignment between their backward rankings and the query's forward ranking. Building on this observation, it proposes BiRD, a bidirectional ranking defense that uses a dual-signal framework—forward ranking to assess semantic content relevance and backward ranking to quantify ranking context consistency. The approach is evaluated across 3 datasets, 3 retrievers, and 3 LLMs under 2 attack scenarios, reporting up to 54% reduction in PoisonedRAG attack success rate, up to 56% improvement in task accuracy, and average added latency under 1 second.

Significance. If the bidirectional ranking alignment pattern is shown to be a robust, generalizable property of poisoning rather than tied to specific attack constructions or retriever choices, BiRD would represent a meaningful advance by resolving the efficiency-robustness trade-off in prior semantic or voting-based defenses. The multi-configuration evaluation across datasets/retrievers/LLMs is a strength that supports broader applicability claims.

major comments (3)

[Abstract] Abstract: The central performance claims (up to 54% ASR reduction and 56% accuracy improvement) are stated as maxima without identifying the exact dataset/retriever/LLM configuration, reporting variance across runs, or including statistical significance tests; this directly affects whether the data support the claimed effectiveness of the dual-signal mechanism.
[Method] Method (dual-signal framework): The defense rests on the assumption that stronger backward-forward ranking alignment is intrinsic to poisoned documents and a reliable discriminator; without an ablation that isolates or removes the backward-ranking signal, it remains unclear whether reported gains derive from the proposed mechanism or from incidental top-k filtering effects.
[Experiments] Experiments: Results are shown across 3 retrievers, but the evaluation does not test adapted variants of PoisonedRAG or alternative embedding models (dense vs. sparse); this leaves open whether the alignment pattern persists when the attack or retrieval setup changes, which is load-bearing for the generalizability of the defense.
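The variance reporting requested in the first major comment could take the form of a paired percentile bootstrap over queries. The sketch below is one hedged possibility, not something the paper is known to do: the function name `bootstrap_ci`, the resample count, and the per-query outcome vectors are all assumptions chosen for illustration.

```python
import random

def bootstrap_ci(base, defended, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the paired drop in attack success
    (undefended rate minus defended rate), resampling queries."""
    rng = random.Random(seed)
    n = len(base)
    diffs = []
    for _ in range(n_resamples):
        # Resample query indices with replacement, keeping pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        base_rate = sum(base[i] for i in idx) / n
        defended_rate = sum(defended[i] for i in idx) / n
        diffs.append(base_rate - defended_rate)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical per-query outcomes: 1 means the attack succeeded.
undefended = [1] * 70 + [0] * 30   # 70% attack success without a defense
with_defense = [1] * 20 + [0] * 80  # 20% attack success with a defense
ci_low, ci_high = bootstrap_ci(undefended, with_defense)
```

An interval that excludes zero would support the claim that the reduction is not a resampling artifact for that configuration; reporting it per dataset, retriever, and LLM would also show where the quoted maxima sit relative to typical results.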

minor comments (2)

[Method] Notation for forward and backward rankings could be introduced with a small illustrative example early in the method section to improve readability.
[Abstract] The abstract mentions '2 attack scenarios' but does not name them; adding the names (e.g., PoisonedRAG and the second scenario) would aid quick assessment.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. We address each major comment point by point below, indicating where we agree and the revisions we will make to strengthen the manuscript.

Referee: [Abstract] Abstract: The central performance claims (up to 54% ASR reduction and 56% accuracy improvement) are stated as maxima without identifying the exact dataset/retriever/LLM configuration, reporting variance across runs, or including statistical significance tests; this directly affects whether the data support the claimed effectiveness of the dual-signal mechanism.

Authors: We agree that the abstract would benefit from greater specificity. In the revised version, we will update the abstract to identify the exact dataset/retriever/LLM configuration that achieves the reported maxima. We will also ensure the experimental results section reports variance across runs and includes statistical significance tests to more rigorously support the effectiveness of the dual-signal mechanism.

revision: yes

Referee: [Method] Method (dual-signal framework): The defense rests on the assumption that stronger backward-forward ranking alignment is intrinsic to poisoned documents and a reliable discriminator; without an ablation that isolates or removes the backward-ranking signal, it remains unclear whether reported gains derive from the proposed mechanism or from incidental top-k filtering effects.

Authors: This is a fair and important point. To clarify that the gains stem from the bidirectional mechanism rather than incidental top-k effects, we will add an ablation study in the revised manuscript. The ablation will compare the full BiRD dual-signal framework against a forward-ranking-only variant, thereby isolating the contribution of the backward-ranking signal.

revision: yes

Referee: [Experiments] Experiments: Results are shown across 3 retrievers, but the evaluation does not test adapted variants of PoisonedRAG or alternative embedding models (dense vs. sparse); this leaves open whether the alignment pattern persists when the attack or retrieval setup changes, which is load-bearing for the generalizability of the defense.

Authors: We acknowledge the value of broader testing for generalizability. While our evaluation already spans three retrievers under two attack scenarios, we agree that adapted PoisonedRAG variants and explicit dense-versus-sparse comparisons would provide stronger evidence. We will revise the discussion section to explicitly address this limitation and outline it as future work; however, we cannot perform these additional experiments within the current revision timeline.

revision: partial

Circularity Check

0 steps flagged

No significant circularity; defense rests on empirical pattern discovery and direct implementation.

The paper derives BiRD from an empirical observation of bidirectional ranking alignment in poisoned documents, discovered via investigation across datasets and setups, then applies a dual-signal framework using forward and backward rankings. No equations reduce performance claims to fitted parameters by construction, no self-citations form load-bearing premises, and no ansatzes or uniqueness theorems are imported from prior author work. The central claim remains an independent empirical finding applied to defense design, with results reported across multiple retrievers and LLMs without reducing to self-referential inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The approach relies on an empirical discovery of ranking behavior differences and standard retrieval ranking mechanics; no free parameters, domain axioms beyond ordinary IR assumptions, or new invented entities are introduced.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Paper passage: poisoned documents exhibit significantly stronger alignment between their backward rankings and the query's forward ranking

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: S(dq_i) = r_i_cr / (1 - r_i_cc) with Spearman rank correlation

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions,

L. Huang, W. Yu, W. Ma, W. Zhong, Z. Feng, H. Wang, Q. Chen, W. Peng, X. Feng, B. Qin, and T. Liu, “A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions,” ACM Trans. Inf. Syst., vol. 43, no. 2, pp. 42:1–42:55, 2025

work page 2025

[2] [2]

A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly,

Y . Yao, J. Duan, K. Xu, Y . Cai, Z. Sun, and Y . Zhang, “A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly,”High- Confidence Computing, vol. 4, no. 2, p. 100211, 2024

work page 2024

[3] [3]

Retrieval-Augmented Generation for Large Language Models: A Survey,

Y . Gao, Y . Xiong, X. Gao, K. Jia, J. Pan, Y . Bi, Y . Dai, J. Sun, M. Wang, and H. Wang, “Retrieval-Augmented Generation for Large Language Models: A Survey,” 2024

work page 2024

[4] [4]

PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models,

W. Zou, R. Geng, B. Wang, and J. Jia, “PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models,” 2024

work page 2024

[5] [5]

Poisoning Retrieval Corpora by Injecting Adversarial Passages,

Z. Zhong, Z. Huang, A. Wettig, and D. Chen, “Poisoning Retrieval Corpora by Injecting Adversarial Passages,” inProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, H. Bouamor, J. Pino, and K. Bali, Eds. Singapore: Association for Computational Linguistics, 2023, pp. 13 764–13 775

work page 2023

[6] [6]

GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search,

M. Ben-Tov and M. Sharif, “GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search,” 2025

work page 2025

[7] [7]

The Silent Saboteur: Impercepti- ble Adversarial Attacks against Black-Box Retrieval- Augmented Generation Systems,

H. Song, Y .-a. Liu, R. Zhang, J. Guo, J. Lv, M. de Ri- jke, and X. Cheng, “The Silent Saboteur: Impercepti- ble Adversarial Attacks against Black-Box Retrieval- Augmented Generation Systems,” 2025

work page 2025

[8] [8]

Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval- Augmented Generation Models,

Y . Gong, Z. Chen, M. Chen, F. Yu, W. Lu, X. Wang, X. Liu, and J. Liu, “Topic-FlipRAG: Topic-Orientated Adversarial Opinion Manipulation Attacks to Retrieval- Augmented Generation Models,” 2025

work page 2025

[9] G. Bagwe, S. S. Chaturvedi, X. Ma, X. Yuan, K.-C. Wang, and L. Zhang, “Your RAG is Unfair: Exposing Fairness Vulnerabilities in Retrieval-Augmented Generation via Backdoor Attacks,” 2025.

[10] X. Si, M. Zhu, S. Qin, L. Yu, L. Zhang, S. Liu, X. Li, R. Duan, Y. Liu, and X. Jia, “SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG,” 2025.

[11] H. Zhou, K.-H. Lee, Z. Zhan, Y. Chen, Z. Li, Z. Wang, H. Haddadi, and E. Yilmaz, “TrustRAG: Enhancing Robustness and Trustworthiness in Retrieval-Augmented Generation,” 2025.

[12] Z. Shen, B. Imana, T. Wu, C. Xiang, P. Mittal, and A. Korolova, “ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search,” 2025.

[13] C. Xiang, T. Wu, Z. Zhong, D. Wagner, D. Chen, and P. Mittal, “Certifiably Robust RAG against Retrieval Corruption,” 2024.

[14] X. Xian, G. Wang, X. Bi, J. Srinivasa, A. Kundu, C. Fleming, M. Hong, and J. Ding, “On the Vulnerability of Applying Retrieval-Augmented Generation within Knowledge-Intensive Application Domains,” 2025.

[15] F. Wang, X. Wan, R. Sun, J. Chen, and S. O. Arik, “Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models,” in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), W. Che, J. Nabende, E. Shutova, and M. T. Pilehvar, Eds. Vienna, Austria: Associa..., 2025.

[16] Z. Wei, W.-L. Chen, and Y. Meng, “InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales,” in The Thirteenth International Conference on Learning Representations, 2024.

[17] X. Ma, Y. Gong, P. He, H. Zhao, and N. Duan, “Query Rewriting in Retrieval-Augmented Large Language Models,” in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, H. Bouamor, J. Pino, and K. Bali, Eds. Singapore: Association for Computational Linguistics, 2023, pp. 5303–5315.

[18] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, S. Riedel, and D. Kiela, “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” in Advances in Neural Information Processing Systems, vol. 33. Curran Associates, Inc., 2020, pp. 9459–9474.

[19] Y. Zhang, Y. Li, L. Cui, D. Cai, L. Liu, T. Fu, X. Huang, E. Zhao, Y. Zhang, C. Xu, Y. Chen, L. Wang, A. T. Luu, W. Bi, F. Shi, and S. Shi, “Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models,” 2025.

[20] R. Xu, H. Liu, S. Nag, Z. Dai, Y. Xie, X. Tang, C. Luo, Y. Li, J. C. Ho, C. Yang, and Q. He, “SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains,” in Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Te..., 2025.

[21] Y. Zhang, Q. Li, T. Du, X. Zhang, X. Zhao, Z. Feng, and J. Yin, “HijackRAG: Hijacking Attacks against Retrieval-Augmented Large Language Models,” 2024.

[22] Y. Jiao, X. Wang, and K. Yang, “PR-Attack: Coordinated Prompt-RAG Attacks on Retrieval-Augmented Generation in Large Language Models via Bilevel Optimization,” 2025.

[23] J. Liu, “Maximal independent sets in bipartite graphs,” Journal of Graph Theory, vol. 17, no. 4, pp. 495–507. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/jgt.3190170407

[25] Q. Leng, R. Hu, C. Liang, Y. Wang, and J. Chen, “Bidirectional ranking for person re-identification,” in 2013 IEEE International Conference on Multimedia and Expo (ICME), 2013, pp. 1–6.

[26] M. Seo, A. Kembhavi, A. Farhadi, and H. Hajishirzi, “Bidirectional Attention Flow for Machine Comprehension,” 2018.

[27] L. Wang, N. Yang, and F. Wei, “Query2doc: Query Expansion with Large Language Models,” 2023.

[28] D. Qin, S. Gammeter, L. Bossard, T. Quack, and L. van Gool, “Hello neighbor: Accurate object retrieval with k-reciprocal nearest neighbors,” in CVPR 2011, 2011, pp. 777–784.

[29] H. Jegou, H. Harzallah, and C. Schmid, “A contextual dissimilarity measure for accurate and efficient image search,” in 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1–8.

[30] Z. Zhong, L. Zheng, D. Cao, and S. Li, “Re-ranking Person Re-identification with k-reciprocal Encoding,” 2017.

[31] T. Kwiatkowski, J. Palomaki, O. Redfield, M. Collins, A. Parikh, C. Alberti, D. Epstein, I. Polosukhin, J. Devlin, K. Lee, K. Toutanova, L. Jones, M. Kelcey, M.-W. Chang, A. M. Dai, J. Uszkoreit, Q. Le, and S. Petrov, “Natural Questions: A Benchmark for Question Answering Research,” Transactions of the Association for Computational Linguistics, vol. ..., 2019.

[32] P. Bajaj, D. Campos, N. Craswell, L. Deng, J. Gao, X. Liu, R. Majumder, A. McNamara, B. Mitra, T. Nguyen, M. Rosenberg, X. Song, A. Stoica, S. Tiwary, and T. Wang, “MS MARCO: A Human Generated MAchine Reading COmprehension Dataset,” 2018.

[33] Z. Yang, P. Qi, S. Zhang, Y. Bengio, W. Cohen, R. Salakhutdinov, and C. D. Manning, “HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, E. Riloff, D. Chiang, J. Hockenmaier, and J. Tsujii, Eds. Brussels, Belgium: Association for Computationa..., 2018.

[34] G. Izacard, M. Caron, L. Hosseini, S. Riedel, P. Bojanowski, A. Joulin, and E. Grave, “Unsupervised Dense Information Retrieval with Contrastive Learning,” 2022.

[35] L. Xiong, C. Xiong, Y. Li, K.-F. Tang, J. Liu, P. N. Bennett, J. Ahmed, and A. Overwijk, “Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval,” in International Conference on Learning Representations, 2020.

[36] V. Karpukhin, B. Oguz, S. Min, P. Lewis, L. Wu, S. Edunov, D. Chen, and W.-t. Yih, “Dense Passage Retrieval for Open-Domain Question Answering,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), B. Webber, T. Cohn, Y. He, and Y. Liu, Eds. Online: Association for Computational Linguistics, 2020, pp. 6769–6781.

[37] Qwen, A. Yang, B. Yang, B. Zhang, B. Hui, B. Zheng, B. Yu, C. Li, D. Liu, F. Huang, H. Wei, H. Lin, J. Yang, J. Tu, J. Zhang, J. Yang, J. Yang, J. Zhou, J. Lin, K. Dang, K. Lu, K. Bao, K. Yang, L. Yu, M. Li, M. Xue, P. Zhang, Q. Zhu, R. Men, R. Lin, T. Li, T. Tang, T. Xia, X. Ren, X. Ren, Y. Fan, Y. Su, Y. Zhang, Y. Wan, Y. Liu, Z. Cui, Z. Zhang, ... “Qwen2.5 Technical Report,” 2025.

[38] A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. de las Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnier, L. R. Lavaud, M.-A. Lachaux, P. Stock, T. L. Scao, T. Lavril, T. Wang, T. Lacroix, and W. E. Sayed, “Mistral 7B,” 2023.

[39] P. Jaccard, “The distribution of the flora in the alpine zone,” New Phytologist, vol. 11, no. 2, pp. 37–50, 1912. [Online]. Available: https://nph.onlinelibrary.wiley.com/doi/abs/10.1111/j.1469-8137.1912.tb05611.x

[40] E. C. Fieller, H. O. Hartley, and E. S. Pearson, “Tests for Rank Correlation Coefficients. I,” Biometrika, vol. 44, no. 3-4, pp. 470–481, 1957.

[41] W. Webber, A. Moffat, and J. Zobel, “A similarity measure for indefinite rankings,” ACM Transactions on Information Systems, vol. 28, no. 4, pp. 1–38, 2010.

A Appendix

A.1 Formulation of Bidirectional Ranking Defense

To facilitate a clear understanding of the proposed framework, we provide a comprehensive summary of the mathematical notations and var...