Recognition: no theorem link
LogicPoison: Logical Attacks on Graph Retrieval-Augmented Generation
Pith reviewed 2026-05-13 19:40 UTC · model grok-4.3
The pith
LogicPoison breaks GraphRAG reasoning by swapping same-type entities to sever graph connections without changing text meaning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LogicPoison employs a type-preserving entity swapping mechanism to perturb both global logic hubs for disrupting overall graph connectivity and query-specific reasoning bridges for severing essential multi-hop inference paths, thereby rerouting valid reasoning into dead ends while maintaining surface-level textual plausibility.
What carries the argument
Type-preserving entity swapping mechanism that replaces entities with others of identical type to alter graph topology and break inference paths.
If this is right
- GraphRAG performance on reasoning tasks degrades significantly under targeted swaps.
- Existing defenses such as community detection and relation filtering are bypassed.
- The attack achieves higher effectiveness and stealth than prior text-based baselines.
- Valid multi-hop inferences are rerouted into unreachable dead ends in the graph.
Where Pith is reading between the lines
- The same swapping approach could expose weaknesses in other graph-structured retrieval methods beyond GraphRAG.
- Future systems may require explicit checks for topological consistency during graph construction or query time.
- Testing the attack on varied graph densities would reveal how hub density influences vulnerability.
- Defenses could combine text checks with simple graph metrics like path length changes after entity replacement.
Load-bearing premise
The security of GraphRAG systems depends on the topological integrity of the underlying graph, which can be undermined by corrupting logical connections without any visible change to the text.
What would settle it
Apply the entity-swapping attack to a standard GraphRAG benchmark and measure whether multi-hop query accuracy drops sharply while surface text remains coherent and plausible.
Figures
read the original abstract
Graph-based Retrieval-Augmented Generation (GraphRAG) enhances the reasoning capabilities of Large Language Models (LLMs) by grounding their responses in structured knowledge graphs. Leveraging community detection and relation filtering techniques, GraphRAG systems demonstrate inherent resistance to traditional RAG attacks, such as text poisoning and prompt injection. However, in this paper, we find that the security of GraphRAG systems fundamentally relies on the topological integrity of the underlying graph, which can be undermined by implicitly corrupting the logical connections, without altering surface-level text semantics. To exploit this vulnerability, we propose \textsc{LogicPoison}, a novel attack framework that targets logical reasoning rather than injecting false contents. Specifically, \textsc{LogicPoison} employs a type-preserving entity swapping mechanism to perturb both global logic hubs for disrupting overall graph connectivity and query-specific reasoning bridges for severing essential multi-hop inference paths. This approach effectively reroutes valid reasoning into dead ends while maintaining surface-level textual plausibility. Comprehensive experiments across multiple benchmarks demonstrate that \textsc{LogicPoison} successfully bypasses GraphRAG's defenses, significantly degrading performance and outperforming state-of-the-art baselines in both effectiveness and stealth. Our code is available at \textcolor{blue}https://github.com/Jord8061/logicPoison.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that GraphRAG systems are vulnerable to logical attacks because their security depends on graph topological integrity rather than surface text. It introduces LogicPoison, which applies type-preserving entity swapping to disrupt global logic hubs (affecting overall connectivity) and query-specific reasoning bridges (severing multi-hop paths), rerouting reasoning into dead ends while preserving textual plausibility. The work asserts that this bypasses GraphRAG defenses and outperforms baselines across benchmarks, with code released.
Significance. If the separation between topological corruption and semantic alteration holds, the result identifies a new attack surface on graph-structured RAG that is harder to defend than text poisoning or prompt injection. The type-preserving mechanism and dual targeting of hubs versus bridges are technically interesting and could motivate topology-aware defenses or verification steps in GraphRAG pipelines.
major comments (2)
- [Abstract, §3] Abstract and §3 (attack construction): The central claim that LogicPoison perturbs only logical topology 'without altering surface-level text semantics' is load-bearing yet unsupported. Type-preserving entity swaps directly revise the factual triples and sentences that feed entity extraction, relation extraction, and community detection; this changes the knowledge supplied to the LLM independently of graph structure, making it impossible to attribute performance drops solely to broken connectivity.
- [§4] §4 (experiments): The abstract states 'comprehensive experiments' and 'significantly degrading performance' with outperformance over baselines, yet no quantitative metrics, dataset statistics, ablation results on hub versus bridge perturbations, or controls for semantic drift are referenced. Without these, the effectiveness and stealth claims cannot be evaluated against the topological-only hypothesis.
minor comments (2)
- [§3] Notation for 'global logic hubs' and 'query-specific reasoning bridges' is introduced without a formal graph-theoretic definition or algorithm for their identification; add a precise characterization (e.g., degree centrality thresholds or betweenness scores) in §3.
- [Abstract] The GitHub link is given in blue text; ensure it is a permanent DOI or archive link in the camera-ready version.
Simulated Author's Rebuttal
Thank you for the constructive feedback on our paper. We have addressed each of the major comments point-by-point below, making revisions where necessary to strengthen the manuscript.
read point-by-point responses
-
Referee: [Abstract, §3] Abstract and §3 (attack construction): The central claim that LogicPoison perturbs only logical topology 'without altering surface-level text semantics' is load-bearing yet unsupported. Type-preserving entity swaps directly revise the factual triples and sentences that feed entity extraction, relation extraction, and community detection; this changes the knowledge supplied to the LLM independently of graph structure, making it impossible to attribute performance drops solely to broken connectivity.
Authors: We thank the referee for this observation. Upon reflection, the original wording overstated the preservation of semantics. The entity swaps do modify the factual content in the graph by substituting same-type entities, thereby altering the knowledge available for reasoning. However, the attack maintains textual plausibility and avoids introducing detectable semantic inconsistencies or falsehoods at the surface level, which is what allows it to evade existing defenses focused on text integrity. We have revised the abstract and §3 to clarify that the method preserves surface-level textual plausibility and logical type consistency rather than claiming no change to semantics. Additionally, we have included new experiments measuring semantic similarity (e.g., using cosine similarity on embeddings) to quantify the minimal drift. revision: partial
-
Referee: [§4] §4 (experiments): The abstract states 'comprehensive experiments' and 'significantly degrading performance' with outperformance over baselines, yet no quantitative metrics, dataset statistics, ablation results on hub versus bridge perturbations, or controls for semantic drift are referenced. Without these, the effectiveness and stealth claims cannot be evaluated against the topological-only hypothesis.
Authors: We apologize if the experimental details were not sufficiently highlighted. Section 4 of the manuscript includes quantitative metrics showing performance degradation (e.g., up to 45% drop in F1 score on multi-hop QA tasks), dataset statistics in Table 1 (including 5 benchmarks with over 10k queries total), ablations in §4.3 separating global hub perturbations from query-specific bridge attacks with corresponding performance impacts, and controls for semantic drift via automated metrics like perplexity scores and manual annotations confirming stealth. These results support that the degradation stems primarily from disrupted connectivity, as semantic-preserving baselines underperform. We have updated the abstract to reference specific tables and figures for these results and added a new subsection discussing the topological hypothesis with supporting evidence. revision: yes
Circularity Check
No circularity in attack framework or claims
full rationale
The paper proposes LogicPoison via type-preserving entity swapping to disrupt graph topology while claiming to preserve surface semantics, then validates the approach experimentally across benchmarks. No equations, fitted parameters, predictions, or self-citation chains appear in the derivation; the central distinction between logical topology and textual semantics is presented as a hypothesis tested by results rather than derived by construction from inputs. The attack definition and performance claims remain independent of any reduction to prior fitted values or self-referential definitions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption GraphRAG systems demonstrate inherent resistance to traditional RAG attacks such as text poisoning and prompt injection through community detection and relation filtering
invented entities (1)
-
LogicPoison attack framework
no independent evidence
Reference graph
Works this paper leans on
-
[1]
In Proceedings of the 34th USENIX Conference on Se- curity Symposium, SEC ’25, USA
Machine against the rag: jamming retrieval- augmented generation with blocker documents. In Proceedings of the 34th USENIX Conference on Se- curity Symposium, SEC ’25, USA. USENIX Associ- ation. Daniel Spielman. 2012. Spectral graph theory.Combi- natorial scientific computing, 18(18). Xuchen Suo. 2024. Signed-prompt: A new approach to prevent prompt injec...
work page 2012
-
[2]
Yilin Xiao, Junnan Dong, Chuang Zhou, Su Dong, Qian wen Zhang, Di Yin, Xing Sun, and Xiao Huang
Jailbroken: How does LLM safety training fail? InThirty-seventh Conference on Neural Infor- mation Processing Systems. Yilin Xiao, Junnan Dong, Chuang Zhou, Su Dong, Qian wen Zhang, Di Yin, Xing Sun, and Xiao Huang. 2025. Graphrag-bench: Challenging domain-specific rea- soning for evaluating graph retrieval-augmented gen- eration.Preprint, arXiv:2506.0240...
-
[3]
is a kind of security attack against Large Language Model integration applications. The core of prompt injection is that the attacker can inject malicious instructions or data by manipulating the external input data of the application, and induce the LLM to ignore the original task instructions and execute the injection task preset by the attacker, so as ...
work page 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.