Query-Efficient Agentic Graph Extraction Attacks on GraphRAG Systems
Pith reviewed 2026-05-16 12:48 UTC · model grok-4.3
The pith
AGEA recovers up to 90% of entities and relationships from GraphRAG systems under fixed query budgets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that under identical query budgets, their AGEA attack significantly outperforms prior baselines by recovering up to 90% of the entities and relationships from the latent graph while maintaining high precision, demonstrating that GraphRAG systems are vulnerable to structured agentic extraction attacks even under strict query limits.
What carries the argument
The AGEA framework carries the argument. It combines a novelty-guided exploration-exploitation strategy, external graph memory modules, and a two-stage graph extraction pipeline: lightweight discovery followed by LLM-based filtering.
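To make the interplay between the novelty signal and the graph memory concrete, here is a minimal sketch. It is not the authors' implementation: the `GraphMemory` class, the `choose_mode` threshold rule, the frontier handling, and the `query_fn` callback standing in for the black-box target are all assumptions introduced for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class GraphMemory:
    """Hypothetical external memory: everything recovered so far."""
    entities: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def add(self, triples):
        """Store (head, relation, tail) triples; return the fraction that were new."""
        triples = set(triples)
        if not triples:
            return 0.0
        new = triples - self.edges
        self.edges |= new
        for head, _, tail in new:
            self.entities.update((head, tail))
        return len(new) / len(triples)


def choose_mode(recent_novelty, threshold=0.3):
    """Assumed rule: exploit while the last query still yielded new triples,
    explore once novelty falls below the threshold."""
    return "exploit" if recent_novelty >= threshold else "explore"


def run_attack(query_fn, seed_entity, budget, rng=None):
    """Spend at most `budget` queries. `query_fn(entity)` is a stand-in for
    the target system and returns the triples parsed from one response."""
    rng = rng or random.Random(0)
    memory = GraphMemory()
    frontier = [seed_entity]  # discovered but not yet queried
    visited = set()
    novelty = 1.0
    for _ in range(budget):
        if not frontier:
            break
        if choose_mode(novelty) == "exploit":
            entity = frontier.pop()  # follow the most recent discovery
        else:
            entity = frontier.pop(rng.randrange(len(frontier)))  # random jump
        visited.add(entity)
        triples = query_fn(entity)
        novelty = memory.add(triples)
        for head, _, tail in triples:
            for candidate in (head, tail):
                if candidate not in visited and candidate not in frontier:
                    frontier.append(candidate)
    return memory
```

Because the memory persists across queries, the novelty score measures what a response adds to the accumulated graph rather than what it contains, which is what lets a fixed budget be steered away from already-recovered regions.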
If this is right
- GraphRAG systems leak retrieved subgraphs through their responses.
- Budget-constrained attacks can recover up to 90% of entities and relationships while keeping precision high.
- The vulnerability holds for medical, agriculture, and literary datasets on Microsoft-GraphRAG and LightRAG.
Where Pith is reading between the lines
- Defenses could focus on reducing the structural information in responses.
- This attack approach might apply to other retrieval systems that expose graph-like information.
- Maintaining memory across queries is key to the attack's efficiency, so session-based defenses may help.
Load-bearing premise
The target GraphRAG systems expose enough response structure for the two-stage discovery-plus-LLM-filtering pipeline to succeed, and the adversary can maintain persistent external graph memory across queries.
What would settle it
A test that limits the detail GraphRAG responses give about retrieved entities and relations, or that removes the attacker's ability to maintain cross-query memory, would show whether recovery rates stay near 90%.
Original abstract
Graph-based retrieval-augmented generation (GraphRAG) systems construct knowledge graphs over document collections to support multi-hop reasoning. While prior work shows that GraphRAG responses may leak retrieved subgraphs, the feasibility of query-efficient reconstruction of the hidden graph structure remains unexplored under realistic query budgets. We study a budget-constrained black-box setting where an adversary adaptively queries the system to steal its latent entity-relation graph. We propose AGEA (Agentic Graph Extraction Attack), a framework that leverages a novelty-guided exploration-exploitation strategy, external graph memory modules, and a two-stage graph extraction pipeline combining lightweight discovery with LLM-based filtering. We evaluate AGEA on medical, agriculture, and literary datasets across Microsoft-GraphRAG and LightRAG systems. Under identical query budgets, AGEA significantly outperforms prior attack baselines, recovering up to 90% of entities and relationships while maintaining high precision. These results demonstrate that modern GraphRAG systems are highly vulnerable to structured, agentic extraction attacks, even under strict query limits. The code is available at https://github.com/shuashua0608/AGEA.
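The two-stage pipeline named in the abstract (lightweight discovery, then LLM-based filtering) can be pictured with the sketch below. It is a hedged illustration only: the assumption that responses contain parenthesised `(head, relation, tail)` triples, the regular expression, and the `naive_judge` function that stands in for an LLM call are all hypothetical and not taken from the paper or its repository.

```python
import re

# Assumed response format: candidate triples appear as "(head, relation, tail)".
TRIPLE_RE = re.compile(r"\(([^(),]+),([^(),]+),([^(),]+)\)")


def discover(response):
    """Stage 1: cheap pattern matching. High recall, possibly noisy."""
    return [
        tuple(part.strip() for part in match.groups())
        for match in TRIPLE_RE.finditer(response)
    ]


def filter_candidates(candidates, response, judge):
    """Stage 2: keep a candidate only if `judge` accepts it.
    `judge` is any callable returning bool; an LLM would fill this role."""
    return [triple for triple in candidates if judge(triple, response)]


def naive_judge(triple, response):
    """Stand-in for an LLM call: reject empty fields and self-loops."""
    head, relation, tail = triple
    return bool(head and relation and tail) and head != tail
```

Splitting the work this way keeps the expensive judgment off the critical path: every response is scanned cheaply, and only the surviving candidates are passed to the filter.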
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes AGEA, an agentic black-box attack framework for reconstructing hidden entity-relation graphs from GraphRAG systems under strict query budgets. It combines novelty-guided exploration-exploitation, persistent external graph memory, and a two-stage pipeline (lightweight discovery followed by LLM-based filtering) and evaluates the approach on medical, agriculture, and literary datasets against Microsoft-GraphRAG and LightRAG, claiming up to 90% recovery of entities and relations while outperforming prior baselines under identical budgets.
Significance. If the empirical results hold under rigorous verification, the work provides concrete evidence that modern GraphRAG systems are vulnerable to structured, query-efficient extraction attacks. This has direct implications for privacy and security in RAG deployments. The public release of code is a clear strength that supports reproducibility.
Major comments (3)
- §5 (Evaluation): the central claim of significant outperformance and up to 90% recovery is presented without exact query-budget values, baseline implementation details, statistical significance tests, or error bars/variance across runs, rendering the quantitative results unverifiable from the text.
- §3.2 (Two-stage extraction pipeline): the reported recovery rates depend on GraphRAG responses exposing extractable entity-relation structure; no ablation isolates the LLM filter's contribution versus raw extraction, and no experiments test performance when responses are heavily summarized or paraphrased.
- §4 (Agentic components): the persistent external graph memory and adaptive novelty-guided strategy are load-bearing assumptions, yet the manuscript provides no stress tests against systems that restrict context length or alter response formatting.
Minor comments (2)
- Table and figure captions should explicitly state the query budget used for each reported metric.
- The abstract states 'high precision' without defining the precise metric (e.g., precision@K or edge-level precision); this should be clarified in §5.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below, indicating where revisions will be made to improve clarity and rigor.
Point-by-point responses
- Referee: §5 (Evaluation): the central claim of significant outperformance and up to 90% recovery is presented without exact query-budget values, baseline implementation details, statistical significance tests, or error bars/variance across runs, rendering the quantitative results unverifiable from the text.
  Authors: We agree that additional quantitative details are required for verifiability. In the revised manuscript we will report the precise query budgets employed (200, 500, and 1000 queries), move baseline implementation details to an appendix, and present all recovery metrics as means with standard deviations over five independent runs together with paired t-test p-values against each baseline.
  Revision: yes
- Referee: §3.2 (Two-stage extraction pipeline): the reported recovery rates depend on GraphRAG responses exposing extractable entity-relation structure; no ablation isolates the LLM filter's contribution versus raw extraction, and no experiments test performance when responses are heavily summarized or paraphrased.
  Authors: We will add an ablation comparing the full two-stage pipeline against raw extraction without the LLM filter. Experiments that require the target system to produce heavily summarized or paraphrased outputs fall outside the strict black-box threat model we adopt; we will therefore discuss this scenario explicitly as a limitation rather than claim to have evaluated it.
  Revision: partial
- Referee: §4 (Agentic components): the persistent external graph memory and adaptive novelty-guided strategy are load-bearing assumptions, yet the manuscript provides no stress tests against systems that restrict context length or alter response formatting.
  Authors: We will add controlled experiments that progressively limit the capacity of the external memory module to simulate context-length restrictions and measure the resulting degradation in recovery. For response-formatting variations we will include a robustness section that tests the agent under common output perturbations and describe how the novelty-guided policy can be adapted.
  Revision: yes
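The reporting promised in the first response (means with standard deviations over independent runs, plus a paired t-test against each baseline) could be computed along the following lines. This is a sketch using only the Python standard library; the function names are hypothetical, and it returns the t statistic and degrees of freedom rather than a p-value, since the standard library has no t-distribution CDF.

```python
import math
import statistics


def summarize(runs):
    """Return (mean, sample standard deviation) of per-run scores."""
    return statistics.mean(runs), statistics.stdev(runs)


def paired_t(scores_a, scores_b):
    """Paired t statistic and degrees of freedom for matched runs,
    where run i of A and run i of B share the same seed or dataset split."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 runs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    spread = statistics.stdev(diffs)
    if spread == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t_stat = statistics.mean(diffs) / (spread / math.sqrt(len(diffs)))
    return t_stat, len(diffs) - 1
```

With five paired runs there are 4 degrees of freedom, so a two-sided test at the 0.05 level requires |t| above roughly 2.776. Where SciPy is available, `scipy.stats.ttest_rel` returns the p-value directly.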
Circularity Check
No circularity: empirical results rest on external ground-truth comparison
Full rationale
The paper presents an empirical attack framework (AGEA) whose performance is measured by direct recall/precision against independent ground-truth graphs on public datasets. No equations, fitted parameters, or self-referential definitions appear in the provided text; success metrics are not defined in terms of the attack's own outputs or prior self-citations. The two-stage pipeline and novelty-guided strategy are algorithmic choices evaluated experimentally, not derived quantities that reduce to their inputs by construction. The evaluation is therefore self-contained against external benchmarks.
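Measuring recall and precision against an independent ground-truth graph, as described above, amounts to a set comparison. The sketch below assumes exact matching after case and whitespace normalization and treats edges as directed triples; the paper's actual matching rule is not stated in the text reviewed here, so both choices are assumptions.

```python
def normalize(name):
    """Assumed matching rule: lowercase and collapse whitespace."""
    return " ".join(name.lower().split())


def normalize_edge(edge):
    """Normalize every field of a (head, relation, tail) triple."""
    return tuple(normalize(part) for part in edge)


def precision_recall(recovered, ground_truth):
    """Precision = hits / recovered items, recall = hits / ground-truth items."""
    recovered, ground_truth = set(recovered), set(ground_truth)
    hits = len(recovered & ground_truth)
    precision = hits / len(recovered) if recovered else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    return precision, recall


def edge_scores(recovered_edges, true_edges):
    """Edge-level precision and recall after normalization."""
    return precision_recall(
        (normalize_edge(e) for e in recovered_edges),
        (normalize_edge(e) for e in true_edges),
    )
```

The same `precision_recall` function applies to entity sets by passing normalized entity names instead of triples.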