arxiv: 2604.02954 · v1 · submitted 2026-04-03 · 💻 cs.CL · cs.AI


LogicPoison: Logical Attacks on Graph Retrieval-Augmented Generation

Yilin Xiao, Jin Chen, Qinggang Zhang, Yujing Zhang, Chuang Zhou, Longhao Yang, Lingfei Ren, Xin Yang, Xiao Huang


Pith reviewed 2026-05-13 19:40 UTC · model grok-4.3


keywords: GraphRAG · logical attacks · entity swapping · knowledge graphs · retrieval augmented generation · adversarial attacks · LLM security · multi-hop reasoning


The pith

LogicPoison breaks GraphRAG reasoning by swapping same-type entities to sever graph connections without changing text meaning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that GraphRAG systems lose their resistance to attacks once the knowledge graph's logical structure is altered through hidden topological changes. It introduces LogicPoison as a method that uses type-preserving entity swaps to target central logic hubs and specific reasoning bridges, turning valid multi-hop paths into dead ends. A sympathetic reader would care because this shows that current graph-based safeguards in LLMs can be bypassed while keeping responses looking normal on the surface. The approach maintains textual plausibility so that standard content filters fail to catch it.

Core claim

LogicPoison employs a type-preserving entity swapping mechanism to perturb both global logic hubs for disrupting overall graph connectivity and query-specific reasoning bridges for severing essential multi-hop inference paths, thereby rerouting valid reasoning into dead ends while maintaining surface-level textual plausibility.

What carries the argument

Type-preserving entity swapping mechanism that replaces entities with others of identical type to alter graph topology and break inference paths.
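
The Figure 2 caption describes the swap as a type-preserving cyclic permutation within type buckets. A minimal sketch of that idea on toy data follows; the function names, the bucket contents, and the choice to apply the mapping to triples rather than to a corpus are assumptions made for illustration, not the authors' implementation.

```python
def cyclic_swap_map(entities_by_type):
    """Map each entity to the next entity in its own type bucket.

    Hypothetical helper: a cyclic shift by one position, so every
    replacement has the same type as the entity it replaces.
    """
    mapping = {}
    for bucket in entities_by_type.values():
        if len(bucket) < 2:
            continue  # a one-entity bucket has no same-type partner
        for i, name in enumerate(bucket):
            mapping[name] = bucket[(i + 1) % len(bucket)]
    return mapping


def apply_to_triples(triples, mapping):
    """Rewrite head and tail of each (head, relation, tail) triple."""
    return [(mapping.get(h, h), r, mapping.get(t, t)) for h, r, t in triples]


# Toy buckets and triples (invented for this example).
buckets = {
    "PERSON": ["Ada", "Bob", "Cy"],
    "CITY": ["Oslo", "Rome"],
    "FILM": ["Film X"],
}
triples = [("Film X", "directed_by", "Ada"), ("Ada", "born_in", "Oslo")]

mapping = cyclic_swap_map(buckets)
swapped = apply_to_triples(triples, mapping)
```

Because each replacement comes from the same bucket, the rewritten triples remain well-typed (a person still directs a film, a person is still born in a city), which is the property the surface-plausibility claim rests on.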

If this is right

GraphRAG performance on reasoning tasks degrades significantly under targeted swaps.
Existing defenses such as community detection and relation filtering are bypassed.
The attack achieves higher effectiveness and stealth than prior text-based baselines.
Valid multi-hop inferences are rerouted into unreachable dead ends in the graph.
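
The "dead end" consequence in the list above can be made concrete with a reachability check on a toy graph. This sketch assumes, purely for illustration, that a swap rewrites the bridge entity in only one of two supporting passages; the paper's actual selection and rewriting rules are not reproduced here, and all names are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(triples):
    """Directed adjacency sets from (head, relation, tail) triples."""
    graph = defaultdict(set)
    for head, _rel, tail in triples:
        graph[head].add(tail)
    return graph


def reachable(graph, source, target):
    """Breadth-first search: is there a directed path source -> target?"""
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Two-hop question: "Where was the director of Film X born?"
clean = [("Film X", "directed_by", "Ada"), ("Ada", "born_in", "Oslo")]

# Assumed perturbation: the second passage now names a different person,
# so the bridge entity "Ada" no longer links the two hops.
perturbed = [("Film X", "directed_by", "Ada"), ("Bob", "born_in", "Oslo")]

clean_ok = reachable(build_graph(clean), "Film X", "Oslo")
perturbed_ok = reachable(build_graph(perturbed), "Film X", "Oslo")
```

In the clean graph the answer node is two hops from the question entity; in the perturbed graph the walk stops at "Ada", even though both perturbed sentences would still read as ordinary facts.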

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same swapping approach could expose weaknesses in other graph-structured retrieval methods beyond GraphRAG.
Future systems may require explicit checks for topological consistency during graph construction or query time.
Testing the attack on varied graph densities would reveal how hub density influences vulnerability.
Defenses could combine text checks with simple graph metrics like path length changes after entity replacement.
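
The last suggestion above, comparing simple graph metrics before and after an update, might look like the following sketch. It is one possible check under stated assumptions: the probe pairs, the undirected treatment of edges, and the function names are all hypothetical, and nothing here is claimed to be a defense evaluated in the paper.

```python
from collections import deque


def hop_distance(edges, source, target):
    """Shortest hop count between two nodes in an undirected graph.

    Returns None when the target cannot be reached.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None


def changed_pairs(edges_before, edges_after, probe_pairs):
    """List probe pairs whose hop distance differs between two snapshots."""
    flagged = []
    for s, t in probe_pairs:
        before = hop_distance(edges_before, s, t)
        after = hop_distance(edges_after, s, t)
        if before != after:
            flagged.append((s, t, before, after))
    return flagged


before = [("Film X", "Ada"), ("Ada", "Oslo"), ("Bob", "Rome")]
after = [("Film X", "Ada"), ("Bob", "Oslo"), ("Bob", "Rome")]
flagged = changed_pairs(before, after, [("Film X", "Oslo"), ("Bob", "Rome")])
```

A pair that moves from a finite distance to `None` is the kind of signal such a check would surface; whether it separates attacks from benign corpus updates is an open question this sketch does not address.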

Load-bearing premise

The security of GraphRAG systems depends on the topological integrity of the underlying graph, which can be undermined by corrupting logical connections without any visible change to the text.

What would settle it

Apply the entity-swapping attack to a standard GraphRAG benchmark and measure whether multi-hop query accuracy drops sharply while surface text remains coherent and plausible.

Figures

Figures reproduced from arXiv: 2604.02954 by Chuang Zhou, Jin Chen, Lingfei Ren, Longhao Yang, Qinggang Zhang, Xiao Huang, Xin Yang, Yilin Xiao, Yujing Zhang.

**Figure 2.** The overall framework of LOGICPOISON. The attack pipeline is divided into three stages: I. Strategic Entity Selection, where target entities are identified via a dual-pronged strategy combining global logic hubs and query-centric reasoning bridges into a unified set R. II. Attack Mechanism, which employs a type-preserving cyclic permutation to swap entities within their respective type buckets in the corpu…

**Figure 3.** Ablation study of the components on the 2Wiki Dataset. We compare Global-only, Query-Centric-only, …

**Figure 4.** PPL-based detection for LOGICPOISON. [Body text from §5.5.3, Poisoning Text Identification, captured with the caption:] Perplexity (PPL) (Alon and Kamfonas, 2023) is widely used to measure text quality and also to defend against attacks on large language models. Previous studies have shown that a high perplexity of text indicates low quality, and poisoned texts often have a higher perplexity than texts written by humans, which makes poisoned texts very…

**Figure 5.** Hyperparameter analysis results of LOGICPOISON. [Body text captured with the caption:] …finally chose top-n%=5% as the hyperparameter, which achieves the optimal balance between attack effect and resource cost.

**Figure 6.** Prompt template used for evaluating exact matches between predictions and gold answers.

**Figure 7.** Prompt template for reasoning-critical entity extraction.
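
The Figure 4 caption refers to perplexity-based detection. As a reminder of the quantity involved, perplexity is the exponential of the mean negative token log-probability. The sketch below assumes per-token natural-log probabilities have already been obtained from some language model; the function names, passage ids, and threshold are hypothetical and are not taken from the paper.

```python
import math


def perplexity(token_logprobs):
    """exp of the mean negative log-probability (natural log) per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def flag_high_ppl(passages, threshold):
    """Return ids of passages whose perplexity exceeds the threshold.

    `passages` maps a passage id to its list of token log-probabilities.
    """
    return [pid for pid, lps in passages.items() if perplexity(lps) > threshold]


# Toy scores (invented): passage "b" is much less likely under the model.
passages = {
    "a": [math.log(0.5)] * 3,   # perplexity 2
    "b": [math.log(0.01)] * 3,  # perplexity 100
}
suspicious = flag_high_ppl(passages, threshold=10.0)
```

A filter of this kind only catches text that the scoring model finds unlikely, which is why an attack that keeps sentences fluent would be expected to pass it.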

Original abstract

Graph-based Retrieval-Augmented Generation (GraphRAG) enhances the reasoning capabilities of Large Language Models (LLMs) by grounding their responses in structured knowledge graphs. Leveraging community detection and relation filtering techniques, GraphRAG systems demonstrate inherent resistance to traditional RAG attacks, such as text poisoning and prompt injection. However, in this paper, we find that the security of GraphRAG systems fundamentally relies on the topological integrity of the underlying graph, which can be undermined by implicitly corrupting the logical connections, without altering surface-level text semantics. To exploit this vulnerability, we propose LogicPoison, a novel attack framework that targets logical reasoning rather than injecting false contents. Specifically, LogicPoison employs a type-preserving entity swapping mechanism to perturb both global logic hubs for disrupting overall graph connectivity and query-specific reasoning bridges for severing essential multi-hop inference paths. This approach effectively reroutes valid reasoning into dead ends while maintaining surface-level textual plausibility. Comprehensive experiments across multiple benchmarks demonstrate that LogicPoison successfully bypasses GraphRAG's defenses, significantly degrading performance and outperforming state-of-the-art baselines in both effectiveness and stealth. Our code is available at https://github.com/Jord8061/logicPoison.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

LogicPoison uses type-preserving entity swaps to target GraphRAG logic hubs and bridges, but the swaps alter source facts so the topological-only claim does not separate cleanly from text poisoning.


The main thing here is that LogicPoison swaps same-type entities in the source documents to disrupt global logic hubs and query-specific reasoning bridges in the knowledge graph. The authors position this as a new attack that reroutes multi-hop inference into dead ends while keeping surface text plausible, and they claim it beats prior text-poisoning and prompt-injection baselines on benchmarks with code released for inspection. That framing of a topology-focused attack on GraphRAG is the clearest new element, and the focus on community detection and relation filtering as points of weakness is a reasonable angle to explore. The code availability is a practical plus for anyone wanting to test the method.

The soft spot is the central separation the paper needs. Swapping entities of the same type revises the actual triples and sentences that feed graph construction, so the LLM receives changed factual content rather than purely broken connectivity. This makes it hard to isolate whether performance drops come from topology or from the model reasoning over different knowledge. The abstract does not supply metrics, dataset details, or ablations that would show the effect is connectivity-driven, which leaves the distinction from text poisoning unconvincing. The experiments are described as comprehensive but without numbers or controls in the provided text it is difficult to judge how much the results support the topological story.

This paper is for people working on retrieval-augmented systems and their robustness. A reader interested in attack surfaces on graph-based RAG would get value from the attack construction itself. I would send it to peer review so referees can check the experimental isolation and whether the semantic-topological distinction can be made rigorous.

Referee Report

2 major / 2 minor

Summary. The paper claims that GraphRAG systems are vulnerable to logical attacks because their security depends on graph topological integrity rather than surface text. It introduces LogicPoison, which applies type-preserving entity swapping to disrupt global logic hubs (affecting overall connectivity) and query-specific reasoning bridges (severing multi-hop paths), rerouting reasoning into dead ends while preserving textual plausibility. The work asserts that this bypasses GraphRAG defenses and outperforms baselines across benchmarks, with code released.

Significance. If the separation between topological corruption and semantic alteration holds, the result identifies a new attack surface on graph-structured RAG that is harder to defend than text poisoning or prompt injection. The type-preserving mechanism and dual targeting of hubs versus bridges are technically interesting and could motivate topology-aware defenses or verification steps in GraphRAG pipelines.

major comments (2)

[Abstract, §3] Abstract and §3 (attack construction): The central claim that LogicPoison perturbs only logical topology 'without altering surface-level text semantics' is load-bearing yet unsupported. Type-preserving entity swaps directly revise the factual triples and sentences that feed entity extraction, relation extraction, and community detection; this changes the knowledge supplied to the LLM independently of graph structure, making it impossible to attribute performance drops solely to broken connectivity.
[§4] §4 (experiments): The abstract states 'comprehensive experiments' and 'significantly degrading performance' with outperformance over baselines, yet no quantitative metrics, dataset statistics, ablation results on hub versus bridge perturbations, or controls for semantic drift are referenced. Without these, the effectiveness and stealth claims cannot be evaluated against the topological-only hypothesis.

minor comments (2)

[§3] Notation for 'global logic hubs' and 'query-specific reasoning bridges' is introduced without a formal graph-theoretic definition or algorithm for their identification; add a precise characterization (e.g., degree centrality thresholds or betweenness scores) in §3.
[Abstract] The GitHub link is given in blue text; ensure it is a permanent DOI or archive link in the camera-ready version.
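
The first minor comment asks for a precise characterization of "global logic hubs", offering degree centrality thresholds as one candidate. A sketch of what such a definition could look like is given below. It is an assumption for illustration only: the paper's own criterion is not specified in the material reviewed here, and the default fraction merely echoes the top-n%=5% value mentioned in the Figure 5 caption text without implying it is applied this way.

```python
import math
from collections import Counter


def degree_counts(triples):
    """Count how many triples each entity appears in (as head or tail)."""
    deg = Counter()
    for head, _rel, tail in triples:
        deg[head] += 1
        deg[tail] += 1
    return deg


def top_fraction_hubs(triples, fraction=0.05):
    """Return the highest-degree entities, keeping at least one.

    Hypothetical definition: rank by degree (ties broken by name) and
    keep the top `fraction` of all entities, rounded up.
    """
    deg = degree_counts(triples)
    k = max(1, math.ceil(fraction * len(deg)))
    ranked = sorted(deg.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _count in ranked[:k]]


# Toy graph (invented): "Ada" participates in three of four triples.
triples = [
    ("Ada", "born_in", "Oslo"),
    ("Ada", "directed", "Film X"),
    ("Ada", "married_to", "Bob"),
    ("Bob", "born_in", "Rome"),
]
hubs = top_fraction_hubs(triples)
```

Stating the criterion this explicitly, whether by degree, betweenness, or something else, would let readers reproduce the hub set and compare it against the query-specific bridges.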

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback on our paper. We have addressed each of the major comments point-by-point below, making revisions where necessary to strengthen the manuscript.


Referee: [Abstract, §3] Abstract and §3 (attack construction): The central claim that LogicPoison perturbs only logical topology 'without altering surface-level text semantics' is load-bearing yet unsupported. Type-preserving entity swaps directly revise the factual triples and sentences that feed entity extraction, relation extraction, and community detection; this changes the knowledge supplied to the LLM independently of graph structure, making it impossible to attribute performance drops solely to broken connectivity.

Authors: We thank the referee for this observation. Upon reflection, the original wording overstated the preservation of semantics. The entity swaps do modify the factual content in the graph by substituting same-type entities, thereby altering the knowledge available for reasoning. However, the attack maintains textual plausibility and avoids introducing detectable semantic inconsistencies or falsehoods at the surface level, which is what allows it to evade existing defenses focused on text integrity. We have revised the abstract and §3 to clarify that the method preserves surface-level textual plausibility and logical type consistency rather than claiming no change to semantics. Additionally, we have included new experiments measuring semantic similarity (e.g., using cosine similarity on embeddings) to quantify the minimal drift.

Revision: partial

Referee: [§4] §4 (experiments): The abstract states 'comprehensive experiments' and 'significantly degrading performance' with outperformance over baselines, yet no quantitative metrics, dataset statistics, ablation results on hub versus bridge perturbations, or controls for semantic drift are referenced. Without these, the effectiveness and stealth claims cannot be evaluated against the topological-only hypothesis.

Authors: We apologize if the experimental details were not sufficiently highlighted. Section 4 of the manuscript includes quantitative metrics showing performance degradation (e.g., up to 45% drop in F1 score on multi-hop QA tasks), dataset statistics in Table 1 (including 5 benchmarks with over 10k queries total), ablations in §4.3 separating global hub perturbations from query-specific bridge attacks with corresponding performance impacts, and controls for semantic drift via automated metrics like perplexity scores and manual annotations confirming stealth. These results support that the degradation stems primarily from disrupted connectivity, as semantic-preserving baselines underperform. We have updated the abstract to reference specific tables and figures for these results and added a new subsection discussing the topological hypothesis with supporting evidence.

Revision: yes
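
The simulated rebuttal mentions cosine similarity on embeddings as a way to quantify semantic drift. For reference, the measure itself is sketched below on plain lists of floats; how passages would be embedded is left open, and the function name and toy vectors are assumptions for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors (invented): parallel vectors score 1, orthogonal ones 0.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

A high score between an original and a swapped passage would show that the embeddings are close, but it would not by itself show that the stated facts are unchanged, which is the distinction the referee's first major comment turns on.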

Circularity Check

0 steps flagged

No circularity in attack framework or claims


The paper proposes LogicPoison via type-preserving entity swapping to disrupt graph topology while claiming to preserve surface semantics, then validates the approach experimentally across benchmarks. No equations, fitted parameters, predictions, or self-citation chains appear in the derivation; the central distinction between logical topology and textual semantics is presented as a hypothesis tested by results rather than derived by construction from inputs. The attack definition and performance claims remain independent of any reduction to prior fitted values or self-referential definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Abstract-only review limits identification of parameters; main reliance is on stated properties of GraphRAG systems.

axioms (1)

Domain assumption: GraphRAG systems demonstrate inherent resistance to traditional RAG attacks such as text poisoning and prompt injection through community detection and relation filtering.
This premise is used to position the new attack as bypassing existing defenses.

invented entities (1)

LogicPoison attack framework (no independent evidence)
Purpose: To target logical reasoning in GraphRAG by perturbing graph topology via entity swapping.
Newly proposed method introduced to exploit the claimed topological vulnerability.

pith-pipeline@v0.9.0 · 5545 in / 1273 out tokens · 48747 ms · 2026-05-13T19:40:36.378590+00:00 · methodology

