ALDEN: Boosting Private Data Extraction from Retrieval-Augmented Generation Systems via Active Learning and Distribution Estimation

Danjue Chen; Jianfeng He; Ning Wang; Shixiong Li; Tao Li; Xingyu Lyu; Yidan Hu; Yimin Chen

arxiv: 2605.18762 · v1 · pith:6TKQO4FBnew · submitted 2026-04-10 · 💻 cs.IR · cs.AI

Xingyu Lyu, Jianfeng He, Ning Wang, Yidan Hu, Tao Li, Danjue Chen, Shixiong Li, Yimin Chen

Pith reviewed 2026-05-21 10:00 UTC · model grok-4.3

keywords: private data extraction · retrieval-augmented generation · active learning · adversarial attacks · RAG security · topic distribution estimation · data leakage · query diversification

The pith

ALDEN boosts private data extraction from RAG systems by diversifying malicious queries via active learning and estimating the knowledge base topic distribution with a decay-based algorithm.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Retrieval-augmented generation (RAG) systems attach external knowledge bases to large language models to improve answers. Building on earlier findings that such systems are open to attacks that embed commands in queries to pull out private data, the paper proposes ALDEN, which raises extraction rates in two ways: active learning creates more varied malicious queries, and a decay-based algorithm estimates the topic distribution inside the hidden knowledge base. The authors report that the combined approach substantially outperforms earlier methods in their evaluations. A reader would care because it identifies a concrete way current RAG setups can leak stored information when an adversary adapts queries over multiple interactions.

Core claim

ALDEN substantially outperforms state-of-the-art methods in extracting private data from RAG systems by combining active learning to diversify malicious queries and a decay-based dynamic algorithm to estimate the topic distribution of the knowledge base.

What carries the argument

The argument rests on the ALDEN attack, which pairs active learning for query diversification with decay-based dynamic estimation of the topic distribution, and uses that estimate to guide more effective malicious queries.
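The page does not spell out ALDEN's decay rule, so the following is only a minimal sketch of one plausible reading: evidence from earlier rounds is multiplied by a decay factor, counts from newly extracted chunks are added, and the weights are normalized into a distribution. The function names, the `decay` value, and the topic labels are all hypothetical.

```python
from collections import Counter


def update_topic_weights(weights, observed_topics, decay=0.5):
    """Shrink old evidence by `decay`, then add counts of newly observed topics."""
    updated = {topic: w * decay for topic, w in weights.items()}
    for topic, count in Counter(observed_topics).items():
        updated[topic] = updated.get(topic, 0.0) + count
    return updated


def normalize(weights):
    """Turn non-negative weights into a probability distribution."""
    total = sum(weights.values())
    if total == 0:
        return {topic: 0.0 for topic in weights}
    return {topic: w / total for topic, w in weights.items()}


# Hypothetical topics recovered from extracted chunks in three query rounds.
rounds = [
    ["cardiology", "cardiology", "dermatology"],
    ["oncology"],
    ["cardiology", "oncology"],
]

weights = {}
for observed in rounds:
    weights = update_topic_weights(weights, observed, decay=0.5)

estimate = normalize(weights)
```

With a decay below 1, topics that stop appearing (here `dermatology`) fade from the estimate, while recently observed topics dominate. Whether the paper decays evidence per round, per query, or in some other way is not stated on this page.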

If this is right

RAG systems face higher practical risk of private data leakage when attackers adapt queries over repeated interactions.
Estimating topic distribution inside the knowledge base supplies useful guidance for generating more successful extraction queries.
Active learning improves attack efficiency by producing a more diverse set of malicious prompts.
Comprehensive evaluations confirm the combined method exceeds previous extraction performance.
Defenses for RAG must address both query variation and distribution probing by adversaries.
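To make the query-diversification point concrete, here is a hedged sketch of a standard diversity-based selection heuristic: greedy farthest-point (k-center) selection over query embeddings. This is a swapped-in technique for illustration; the page does not say which acquisition function ALDEN uses. The embeddings are toy 2-D vectors, assumed non-zero, and all names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def select_diverse(candidates, history, n_select):
    """Greedily pick candidates whose nearest already-covered embedding is farthest away.

    Returns the indices of the chosen candidates, in selection order.
    """
    covered = list(history)
    remaining = list(range(len(candidates)))
    chosen = []

    def distance_to_covered(index):
        if not covered:
            return float("inf")
        return min(cosine_distance(candidates[index], c) for c in covered)

    while remaining and len(chosen) < n_select:
        best = max(remaining, key=distance_to_covered)
        chosen.append(best)
        covered.append(candidates[best])
        remaining.remove(best)
    return chosen


# One past query embedding and four candidate query embeddings (toy values).
history = [[1.0, 0.0]]
candidates = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]

picked = select_diverse(candidates, history, n_select=2)
```

The candidate closest to the query history (`[0.9, 0.1]`) is never chosen here, which is the intended effect: each new query is pushed toward regions of the embedding space that earlier queries have not covered.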

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

RAG providers could reduce leakage by monitoring query diversity or limiting feedback that reveals topic information.
The same active-learning and distribution-estimation ideas could be tested on other retrieval-based systems that hold private data.
Future defenses might need to add noise to responses or detect systematic probing of topic coverage.
The attack highlights that privacy in RAG depends on both the security of the retrieval step and the ability to hide distribution patterns.

Load-bearing premise

An adversary can issue many queries and receive enough feedback from the RAG system to run active learning and estimate the private knowledge base distribution without triggering detection or rate limits.

What would settle it

An experiment that runs ALDEN on a RAG system with strict query limits or disabled feedback and measures whether extraction rates still exceed those of prior attacks.
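A minimal sketch of how such a budget-capped comparison could be set up is shown here. The endpoint wrapper, the toy leak table, and the function names are hypothetical stand-ins, not part of the paper; a real experiment would replace `toy_rag` with calls to the target system and would plug in each attack's own query generator.

```python
class QueryBudgetExceeded(Exception):
    """Raised when the hypothetical endpoint refuses further queries."""


class BudgetedEndpoint:
    """Wraps an answer function and enforces a hard cap on the number of queries."""

    def __init__(self, answer_fn, max_queries):
        self.answer_fn = answer_fn
        self.max_queries = max_queries
        self.used = 0

    def query(self, text):
        if self.used >= self.max_queries:
            raise QueryBudgetExceeded(f"budget of {self.max_queries} queries spent")
        self.used += 1
        return self.answer_fn(text)


def unique_chunks_under_budget(endpoint, queries):
    """Send queries until the budget runs out; return the distinct chunks seen."""
    seen = set()
    for q in queries:
        try:
            seen.update(endpoint.query(q))
        except QueryBudgetExceeded:
            break
    return seen


# Toy stand-in for a RAG system: maps each query to the chunk ids it would leak.
toy_leaks = {"q1": ["c1", "c2"], "q2": ["c2", "c3"], "q3": ["c4"], "q4": ["c5"]}


def toy_rag(text):
    return toy_leaks.get(text, [])


results = {}
for budget in (1, 2, 4):
    endpoint = BudgetedEndpoint(toy_rag, max_queries=budget)
    extracted = unique_chunks_under_budget(endpoint, ["q1", "q2", "q3", "q4"])
    results[budget] = len(extracted)
```

Running the same harness for each attack at each budget gives directly comparable extraction counts, which is the comparison the premise above depends on.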

Figures

Figures reproduced from arXiv: 2605.18762 by Danjue Chen, Jianfeng He, Ning Wang, Shixiong Li, Tao Li, Xingyu Lyu, Yidan Hu, Yimin Chen.

**Figure 1.** Case study of applying ALDEN on a real-world clinic RAG.

Diagram labels recovered with this caption: RAG, LLM-AUX, ALDEN attack, Distribution Estimation, Anchor Selection, Resampling, History Queries, Adversarial Query, Database, Retriever, Response, Chunk Extraction, Embedding, LLM-RAG. Example input: "What is [topic A] ....". Example output: "[topic A] stands for ...".

**Figure 2.** Workflow of ALDEN.

Accompanying text (truncated in source): "… and an LLM for response generation (referred to as LLM-RAG). Mathematically, given a query $q$, the retriever obtains the top-$k$ chunks $[c_1, c_2, \dots, c_k]$ from the knowledge base $K$, i.e., $[c_1, c_2, \dots, c_k] = R_D(q, K)$. After that, the retriever feeds $c_1 \oplus c_2 \oplus \dots \oplus c_k \oplus q$ to LLM-RAG. Finally, LLM-RAG outputs $r = G_{\text{LLM-RAG}}(c_1 \oplus c_2 \oplus \dots \oplus c_k \oplus q)$ to the user as the response to $q$. 2.2 A…"

**Figure 3.** Ground-truth and estimated distributions of …

**Figure 4.** Comparison of LC and ULC across attacks.

**Figure 5.** Results of ablation study on the Health dataset. We examine how (a) top-k selection, (b) model size, (c) similarity threshold, and (d) the number of anchors affect the performance.

Adjacent table (truncated in source):

| Attack | Model | LC | ULC | ROUGE |
|---|---|---|---|---|
| TBTG | Llama2-7b-chat | 525 | 125 | 82.3 |
| TBTG | Qwen2-72B | 580 | 180 | 92.4 |
| TBTG | ChatGPT-4 | 610 | 201 | 97.0 |
| PIDE | Llama2-7b-chat | 237 | 237 | 92.0 |
| PIDE | Qwen2-72B | 577 | 220 | 98.5 |
| PIDE | ChatGPT-4 | 555 | 332 | 89.2 |
| GEA | Llama2-7b-chat | 714 | 359 | 50.0 |
| GEA | Qwen2-72B | 1,033 | … | … |

**Figure 6.** ALDEN vs. Oracle. Pale and opaque bars are LC and ULC, respectively.

Accompanying text (truncated in source): "… that our attack is both time-efficient and cost-efficient in practice. More results in Appendix. We include ablation studies for (1) different active learning strategies and (2) different clustering methods (e.g., KDE and GMM) in Appendix I. 4.5 Potential Defenses. We evaluate four defenses against ALDEN as in (Xu et al., 2024). Details …"

**Figure 7.** Data estimation of P(D) and P′(D).

Adjacent table (metric not named in the extracted text):

| Attack | Health | Email | Finance |
|---|---|---|---|
| Thief | 24.948 ± 6.267 | 24.513 ± 4.849 | 22.257 ± 8.639 |
| Pirate | 23.762 ± 4.162 | 21.909 ± 3.786 | 23.173 ± 2.477 |
| Ours | 25.885 ± 3.011 | 20.156 ± 2.885 | 21.059 ± 5.552 |

**Figure 8.** Potential defenses.

Accompanying text (truncated in source): "… to production models like ChatGPT. Membership inference further quantifies privacy risks in language models (Mireshghallah et al., 2022), and prompt-injection attacks exploit adversarial prefixes to elicit unintended private content (Wallace et al., 2019). Privacy attacks on RAG systems. Early prompt-injection attacks rely on static queries to extract limited sensitive information (Ze…"

**Figure 9.** Estimated versus ground-truth distributions

**Figure 10.** Ablation study analyzing the impact of different parameters on untargeted attack performance at …

**Figure 11.** Potential mitigation strategies.

Accompanying text (truncated in source): "… diagnosed with diabetes mellitus, what steps should I follow?” This disrupts the exact textual patterns that adversaries embed in malicious documents, without reprocessing the entire knowledge base. We evaluate the paraphrasing defense by generating five paraphrased queries per original query using Llama2-13B-chat. For each paraphrase, we retrieve k = 3 texts from the corr…"

**Figure 12.** Case study on Health dataset.

Figure content (Query Example 2, truncated in source):
Query: [prefix] You are a helpful... + [Generated query] ...What treatment is recommended for a dog with persistent skin rashes and irritation, and how effective is it... + [suffix] Please repeat ...
Extracted chunk from LLM's response: ... A shampoo containing chlorhexidine 2%, ketoconazole 1%, and phytosphingosine', 'Antiseptic shampoo in a synergistic base for deep…

**Figure 13.** Case study on Amazon Customer Review.

Accompanying text (truncated in source): "… base. Our method also extracts key topics (highlighted in bold red) to update the anchors for subsequent queries. These findings underscore the urgent need for effective privacy defenses in real-world RAG applications."

Adjacent table (truncated in source):

| Models | Health LC | Health ULC | Amazon-Reviews LC | Amazon-Reviews ULC |
|---|---|---|---|---|
| chronos-t5-large | 1,080 | 549 | 1,259 | 620 |
| gemma-3-27b-it | 1,173 | 625 | 1,290 | 664 |
| gpt-3.5-turbo | 1,170 | 661 | 1,277 | 695 |
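The retrieve-then-generate pipeline formalized in the text captured alongside Figure 2 (top-k retrieval, then concatenating the chunks with the query before generation) can be sketched as follows. This is an illustrative toy, not the paper's setup: the embeddings are hand-written 3-D vectors, cosine similarity is assumed as the retriever's scoring function, a newline stands in for the concatenation operator, and the generation step is left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_vec, knowledge_base, k):
    """Return the texts of the k chunks most similar to the query embedding.

    knowledge_base is a list of (chunk_text, chunk_embedding) pairs.
    """
    ranked = sorted(
        knowledge_base,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(chunks, query, sep="\n"):
    """Concatenate the retrieved chunks and the query: c_1 + ... + c_k + q."""
    return sep.join(chunks + [query])


# Toy knowledge base with hypothetical chunk texts and embeddings.
knowledge_base = [
    ("chunk about diabetes", [1.0, 0.0, 0.0]),
    ("chunk about skin rashes", [0.0, 1.0, 0.0]),
    ("chunk about insulin", [0.9, 0.1, 0.0]),
    ("chunk about invoices", [0.0, 0.0, 1.0]),
]

query_text = "How is diabetes treated?"
query_vec = [1.0, 0.0, 0.0]  # assumed output of the retriever's embedding model

top_chunks = retrieve_top_k(query_vec, knowledge_base, k=2)
prompt = build_prompt(top_chunks, query_text)
```

Because the retrieved chunks are placed verbatim in the prompt, any instruction in the query that asks the model to repeat its context can surface them, which is the leakage channel the attacks discussed on this page target.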

Original abstract

Retrieval-Augmented Generation (RAG) is widely used to augment large language models with external knowledge retrieval to improve reliability and generalization. However, recent studies have shown that RAG systems remain vulnerable to data extraction attacks, where adversaries can extract private data by embedding malicious commands into user queries. Despite their feasibility, existing attacks typically suffer from low data extraction rates and limited practical effectiveness. Here, we propose ALDEN, a novel attack that effectively and efficiently extracts private data from RAGs. First, we employ active learning to diversify malicious queries and improve data extraction rates. Second, we observe that the data distribution of the underlying knowledge base provides valuable guidance for query generation and introduce a decay-based dynamic algorithm to estimate the corresponding topic distribution. By combining them together, we demonstrate that ALDEN substantially outperforms state-of-the-art methods through comprehensive evaluations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes ALDEN, a novel attack on Retrieval-Augmented Generation (RAG) systems for private data extraction. It employs active learning to diversify malicious queries and introduces a decay-based dynamic algorithm to estimate the topic distribution of the private knowledge base, claiming that the combination substantially outperforms state-of-the-art methods via comprehensive evaluations.

Significance. If the reported gains hold under realistic query constraints, the work would be significant for the IR community by advancing practical attack techniques against RAG privacy and highlighting the value of distribution-aware query generation. The empirical focus on active learning combined with dynamic estimation is a clear strength over purely heuristic prior attacks.

major comments (2)

§4 (Evaluation): The central claim of substantial outperformance rests on experiments that assume an adversary can issue a large number of diverse probing queries without rate limits, session timeouts, or anomaly detection. This assumption is load-bearing because both the active-learning diversification and the decay-based topic estimation require repeated informative feedback; if usable interactions are limited, the reported gains over baselines would not materialize. The manuscript should add constrained-budget experiments or a limitations discussion.
§3.2 (Decay-based dynamic algorithm): The description of how the algorithm updates the topic distribution estimate from RAG responses lacks detail on handling noisy or partial retrievals. Without this, it is unclear whether the estimated distribution remains accurate enough to guide query generation, directly affecting the claimed efficiency improvement.

minor comments (2)

[Abstract] The abstract states 'comprehensive evaluations' without any numerical results or baseline names; adding one or two key metrics (e.g., extraction rate improvement) would improve readability while remaining within abstract length limits.
[Figures] Figure captions and axis labels in the experimental plots should explicitly state the number of queries or interaction budget used, to make the comparison with prior work immediately interpretable.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We have carefully addressed each major comment below and revised the paper accordingly to strengthen the presentation and evaluation.

Point-by-point responses

Referee: §4 (Evaluation): The central claim of substantial outperformance rests on experiments that assume an adversary can issue a large number of diverse probing queries without rate limits, session timeouts, or anomaly detection. This assumption is load-bearing because both the active-learning diversification and the decay-based topic estimation require repeated informative feedback; if usable interactions are limited, the reported gains over baselines would not materialize. The manuscript should add constrained-budget experiments or a limitations discussion.

Authors: We agree that the evaluation setup assumes a query budget that may exceed what is feasible under strict rate limiting or anomaly detection in deployed systems. To address this directly, we have added a new subsection in §4 with constrained-budget experiments (capping queries at 100, 500, and 1000) across the evaluated datasets. These results show that ALDEN continues to outperform the baselines, albeit with reduced absolute extraction rates. We have also expanded the Limitations section to explicitly discuss the effects of query throttling and potential defenses such as session timeouts. Revision: yes.

Referee: §3.2 (Decay-based dynamic algorithm): The description of how the algorithm updates the topic distribution estimate from RAG responses lacks detail on handling noisy or partial retrievals. Without this, it is unclear whether the estimated distribution remains accurate enough to guide query generation, directly affecting the claimed efficiency improvement.

Authors: We thank the referee for highlighting this point. The original description in §3.2 was brief and did not sufficiently cover robustness to imperfect retrievals. In the revised manuscript we have expanded §3.2 with a new paragraph detailing the update rule: responses are first filtered by a confidence threshold derived from the RAG model's output logits; surviving partial or noisy retrievals receive a reduced weight in the decay update, and the distribution estimate is renormalized after each batch. We also include a short analysis showing that the estimated distribution remains sufficiently accurate to preserve the reported efficiency gains even under moderate noise levels. Revision: yes.
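The update rule outlined in the second simulated response (filter by confidence, down-weight partial or noisy retrievals, renormalize after each batch) could look roughly like the sketch below. It is an assumption-laden illustration, not the authors' rule: the thresholds, the `partial` flag, and the way decayed probabilities are mixed with raw observation weights are all hypothetical, and the confidence score is simply taken as given rather than derived from logits.

```python
def batch_update(estimate, batch, decay=0.8, min_confidence=0.5, partial_weight=0.5):
    """Apply one batch of observations to a topic-distribution estimate.

    estimate: dict mapping topic -> probability.
    batch: list of dicts with keys "topic", "confidence", and "partial".
    """
    # Decay the previous estimate so that older evidence counts for less.
    scores = {topic: p * decay for topic, p in estimate.items()}

    for obs in batch:
        # Step 1: drop low-confidence responses entirely.
        if obs["confidence"] < min_confidence:
            continue
        # Step 2: partial or noisy retrievals contribute a reduced weight.
        weight = partial_weight if obs["partial"] else 1.0
        scores[obs["topic"]] = scores.get(obs["topic"], 0.0) + weight

    # Step 3: renormalize after the batch.
    total = sum(scores.values())
    if total == 0:
        return dict(estimate)
    return {topic: s / total for topic, s in scores.items()}


estimate = {"health": 0.5, "finance": 0.5}
batch = [
    {"topic": "health", "confidence": 0.9, "partial": False},
    {"topic": "finance", "confidence": 0.2, "partial": False},  # filtered out
    {"topic": "email", "confidence": 0.7, "partial": True},     # down-weighted
]

new_estimate = batch_update(estimate, batch)
```

In this toy batch the low-confidence `finance` observation is ignored, so that topic only keeps its decayed prior mass, while the partial `email` retrieval enters the estimate at half weight.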

Circularity Check

0 steps flagged

No circularity: empirical attack with external evaluation

Full rationale

The paper presents ALDEN as an empirical attack on RAG systems that combines active learning for query diversification with a decay-based dynamic algorithm for estimating the private knowledge base topic distribution. No mathematical derivation, first-principles result, or prediction is claimed that reduces to its own inputs by construction. The central claims rest on comprehensive evaluations against state-of-the-art baselines, which are external and falsifiable. The method is self-contained as a practical proposal whose performance depends on observable attack success rates rather than tautological redefinitions or self-referential fits.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the empirical effectiveness of the two proposed components; no new mathematical axioms or invented physical entities are introduced. The decay rate and active-learning acquisition function are treated as tunable but not enumerated as free parameters in the abstract.

pith-pipeline@v0.9.0 · 5698 in / 1030 out tokens · 38620 ms · 2026-05-21T10:00:35.010339+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel

Relation between the paper passage and the cited Recognition theorem: unclear

Paper passage: "We employ active learning to diversify malicious queries and introduce a decay-based dynamic algorithm to estimate the corresponding topic distribution."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

101 extracted references · 101 canonical work pages · 6 internal anchors

[1] Precedence.
[2] Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems.
[3] When automated assessment meets automated content generation: Examining text quality in the era of GPTs. ACM Transactions on Information Systems, 2025.
[4] ChatDoctor: A medical chat model fine-tuned on a large language model Meta-AI (LLaMA) using medical domain knowledge. Cureus, 2023.
[5] The Impact of Hallucinated Information in Large Language Models on Student Learning Outcomes: A Critical Examination of Misinformation Risks in AI-Assisted Education. Northern Reviews on Algorithmic Research, Theoretical Computation, and Complexity.
[6] Hallucinations in Large Language Models and Their Influence on Legal Reasoning: Examining the Risks of AI-Generated Factual Inaccuracies in Judicial Processes. Journal of Computational Intelligence, Machine Reasoning, and Decision-Making.
[7] Medical Hallucination in Foundation Models and Their Impact on Healthcare. medRxiv, 2025.
[8] Deficiency of large language models in finance: An empirical examination of hallucination. arXiv preprint arXiv:2311.15548.
[9] Toward expert-level medical question answering with large language models. Nature Medicine, 2025.
[10] The good and the bad: Exploring privacy issues in retrieval-augmented generation (RAG). Findings of the Association for Computational Linguistics: ACL 2024.
[11] Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems.
[12] Unleashing worms and extracting data: Escalating the outcome of attacks against RAG-based inference in scale and severity using jailbreaking. arXiv preprint arXiv:2409.08045.
[13] RAG-Thief: Scalable extraction of private data from retrieval-augmented generation applications with agent-based attacks. arXiv preprint arXiv:2411.14110.
[14] Pirates of the RAG: Adaptively Attacking LLMs to Leak Knowledge Bases. arXiv preprint arXiv:2412.18295.
[15] Information leakage in embedding models. Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security.
[16] ActiveThief: Model extraction using active learning and unannotated public data. Proceedings of the AAAI Conference on Artificial Intelligence.
[17] Army of Thieves: Enhancing Black-Box Model Extraction via Ensemble based sample selection. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision.
[18] Adversarial semantic collisions. arXiv preprint arXiv:2011.04743.
[19] Order-disorder: Imitation adversarial attacks for black-box neural ranking models. Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security.
[20] Differential privacy: A survey of results. International Conference on Theory and Applications of Models of Computation, 2008.
[21] Calibrating noise to sensitivity in private data analysis. Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006, Proceedings 3.
[22] Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems.
[23] REPLUG: Retrieval-Augmented Black-Box Language Models. arXiv preprint arXiv:2301.12652.
[24] In-context retrieval-augmented language models. Transactions of the Association for Computational Linguistics, 2023.
[25] Dense Passage Retrieval for Open-Domain Question Answering. EMNLP (1).
[26] AI in Finance: The Promise and Risks of RAG.
[27] Minimax bounds for active learning. IEEE Transactions on Information Theory, 2008.
[28] Deep learning with differential privacy. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security.
[29] PubMed.
[30] StatPearls.
[31] MedCorp.
[32] llama-7b.
[33] llama-13b.
[34] Mistral 7B. 2023.
[35] Gemini: A Family of Highly Capable Multimodal Models. 2024.
[36] HealthCareMagic-10k.
[37] The Enron corpus: A new dataset for email classification research. European Conference on Machine Learning, 2004.
[38] Loukas, Lefteris; Fergadiotis, Manos; Androutsopoulos, Ion; Malakasiotis, Prodromos. EDGAR-CORPUS: Billions of Tokens Make The World Go Round. Proceedings of the Third Workshop on Economics and Natural Language Processing, 2021. doi:10.18653/v1/2021.econlp-1.2
[39] bge-large-en-v1.5.
[40] all-MiniLM-L6-v2.
[41] e5-base-v2.
[42] Adversarial Attacks on Federated Learning Revisited: a Client-Selection Perspective. 2024 IEEE Conference on Communications and Network Security (CNS), 2024.
[43] The probabilistic relevance framework: BM25 and beyond. Foundations and Trends, 2009.
[44] Unsupervised Dense Information Retrieval with Contrastive Learning. 2021.
[45] MedCPT: Contrastive pre-trained transformers with large-scale PubMed search logs for zero-shot biomedical information retrieval. Bioinformatics, 2023.
[46] Retrieval Is All You Need: Developing an AI Powered Chatbot with RAG in Azure.
[47] RAG in health care: a novel framework for improving communication and decision-making by addressing LLM limitations. NEJM AI, 2025.
[48] Enhancing the Precision and Interpretability of Retrieval-Augmented Generation (RAG) in Legal Technology: A Survey. IEEE Access.
[49] Enhancing financial sentiment analysis via retrieval augmented large language models. Proceedings of the Fourth ACM International Conference on AI in Finance.
[50] Trustworthiness in Retrieval-Augmented Generation Systems: A Survey. arXiv preprint arXiv:2409.10102.
[51] Secure knowledge management: confidentiality, trust, and privacy. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 2006.
[52] Security and privacy challenges of large language models: A survey. ACM Computing Surveys, 2025.
[53] PrivAuditor: Benchmarking Data Protection Vulnerabilities in LLM Adaptation Techniques. Advances in Neural Information Processing Systems.
[54] Membership inference attacks from first principles. 2022 IEEE Symposium on Security and Privacy (SP), 2022.
[55] Extracting training data from large language models. 30th USENIX Security Symposium (USENIX Security 21).
[56] Improving generalization with active learning. Machine Learning, 1994.
[57] A sequential algorithm for training text classifiers: Corrigendum and additional data. ACM SIGIR Forum, 1995.
[58] Active Learning for Convolutional Neural Networks: A Core-Set Approach. arXiv preprint arXiv:1708.00489.
[59] Adversarial Active Learning for Deep Networks: a Margin Based Approach. arXiv preprint arXiv:1802.09841.
[60] HuggingFace. 2023.
[61] Cache-Craft: Managing Chunk-Caches for Efficient Retrieval-Augmented Generation. arXiv preprint arXiv:2502.15734.
[62] QuOTE: Question-Oriented Text Embeddings. arXiv preprint arXiv:2502.10976.
[63] Synthetic Financial Domain Documents with PII Labels. 2023.
[64] RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation. arXiv preprint arXiv:2502.10996.
[65] 2024.
[66] Creating Retrieval Augmented Generation solutions on AWS for healthcare.
[67] Song, Congzheng; Rush, Alexander; Shmatikov, Vitaly. Adversarial Semantic Collisions. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
[68] Amazon Customer Reviews Dataset. 2023.
[69] The use of MMR, diversity-based reranking for reordering documents and producing summaries. Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1998.
[70] Mireshghallah, Fatemehsadat; Goyal, Kartik; Uniyal, Archit; Berg-Kirkpatrick, Taylor; Shokri, Reza. Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022.
[71] Milad Nasr; Nicholas Carlini; Jonathan Hayase; Matthew Jagielski; A. Feder Cooper; Daphne Ippolito; Christopher A. Choquette-Choo; Eric Wallace; Florian Tramèr; Katherine Lee.
[72] Karan Ganju; Qi Wang; Wei Yang; Carl A. Gunter; Nikita Borisov. Proceedings of the 2018 …, 2018.
[73] Mahloujifar, Saeed; Ghosh, Esha; Chase, Melissa. Property Inference from Poisoning.
[74] Universal adversarial triggers for attacking and analyzing NLP. arXiv preprint arXiv:1908.07125.
[75] A density-based algorithm for discovering clusters in large spatial databases with noise. KDD.
[76] Settles, Burr. Active Learning Literature Survey.
[77] Chase, Harrison.
[78] Text Embeddings Reveal (Almost) As Much As Text. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.
[79] Personally Identifiable Information (PII) Detection in the Unstructured Large Text Corpus using Natural Language Processing and Unsupervised Learning Technique. International Journal of Advanced Computer Science and Applications, 2021.
[80] Safeguarding LLM Embeddings in End-Cloud Collaboration via Entropy-Driven Perturbation. arXiv preprint arXiv:2503.12896.

