Universal Graph Backdoor Defense: A Feature-based Homophily Perspective

Chen Chen; Fan Li; Mengting Pan; Xiaoyang Wang

arxiv: 2605.16815 · v1 · pith:NP427IBVnew · submitted 2026-05-16 · 💻 cs.CR · cs.LG

Mengting Pan, Fan Li, Chen Chen, Xiaoyang Wang

Pith reviewed 2026-05-19 21:07 UTC · model grok-4.3

keywords graph neural networks, backdoor attacks, graph backdoor defense, feature-based homophily, universal defense, robust training, node-level detection

The pith

Backdoors from any graph attack type reduce local feature similarity between nodes and their neighbors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Graph neural networks face backdoor attacks that insert hidden triggers, either as connected subgraphs or as altered node features that leave the graph structure intact. Current defenses target only the subgraph case and collapse when topology is preserved. The paper demonstrates that both attack families produce nodes whose features match their local neighborhood less closely than clean nodes do. This shared drop in feature-based homophily supplies a detection signal that does not depend on trigger shape. A neighbor-aware reconstruction loss flags the discrepant nodes, after which a robust training step removes the backdoor influence while limiting damage to clean performance.

Core claim

Regardless of whether the trigger is a subgraph or a set of feature perturbations, the resulting backdoored nodes exhibit measurably lower feature-based homophily with their immediate neighbors. Theoretical analysis and experiments establish that this local feature inconsistency is a common signature of graph backdoor attacks. The signature is captured by a neighbor-aware reconstruction loss that reconstructs each node from its neighborhood; nodes with high reconstruction error are treated as potential backdoors. A subsequent robust training procedure then minimizes the effect of any remaining trigger while preserving accuracy on clean data.
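
To make the homophily signature concrete, here is a minimal sketch of a per-node feature-based homophily score. It assumes the score is the mean cosine similarity between a node and its direct neighbors; the paper's exact definition is not given on this page, so `feature_homophily`, the toy graph, and the feature values are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def feature_homophily(features, neighbors, node):
    """Mean cosine similarity between `node` and its neighbors.

    Assumed definition for illustration only; the paper's measure may differ.
    """
    nbrs = neighbors[node]
    if not nbrs:
        return 0.0
    return sum(cosine(features[node], features[n]) for n in nbrs) / len(nbrs)

# Toy graph: nodes 0-2 have similar features; node 3 stands in for a node
# whose features were perturbed by a feature-level trigger.
features = {
    0: [1.0, 0.9, 0.0],
    1: [0.9, 1.0, 0.1],
    2: [1.0, 1.0, 0.0],
    3: [0.1, 0.0, 1.0],
}
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}

scores = {v: feature_homophily(features, neighbors, v) for v in features}
for v in sorted(scores):
    print(v, round(scores[v], 3))
```

In this toy setting node 3 receives the lowest score, which is the kind of gap the core claim describes. Note that the neighbors of node 3 also score lower than node 0, because the perturbed node sits in their neighborhoods.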

What carries the argument

Neighbor-aware reconstruction loss that quantifies the discrepancy between a node's features and the aggregated features of its neighbors, used to surface nodes with abnormally low local feature consistency.
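
The detection idea can be sketched without any learned model. The example below is a simplified stand-in, not the paper's loss: it "reconstructs" each node as the plain mean of its neighbors' features and flags the nodes with the largest squared error. The function names, the mean aggregator, and the `ratio` parameter are assumptions made for illustration.

```python
def neighbor_mean(features, neighbors, node):
    """Average the feature vectors of a node's neighbors (zeros if isolated)."""
    nbrs = neighbors[node]
    dim = len(features[node])
    if not nbrs:
        return [0.0] * dim
    return [sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)]

def reconstruction_error(features, neighbors, node):
    """Squared distance between a node's features and its neighbor mean."""
    recon = neighbor_mean(features, neighbors, node)
    return sum((x - r) ** 2 for x, r in zip(features[node], recon))

def flag_suspicious(features, neighbors, ratio):
    """Flag the top `ratio` fraction of nodes by reconstruction error.

    `ratio` is a hypothetical knob for this sketch, not a parameter
    taken from the paper.
    """
    errors = {v: reconstruction_error(features, neighbors, v) for v in features}
    k = max(1, int(round(ratio * len(errors))))
    ranked = sorted(errors, key=errors.get, reverse=True)
    return set(ranked[:k]), errors

# Toy graph: node 3 has features that disagree with its neighborhood.
features = {
    0: [1.0, 0.9, 0.0],
    1: [0.9, 1.0, 0.1],
    2: [1.0, 1.0, 0.0],
    3: [0.1, 0.0, 1.0],
}
neighbors = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}

flagged, errors = flag_suspicious(features, neighbors, ratio=0.25)
print(flagged)
```

A fixed top-fraction cutoff is the simplest possible decision rule. It always flags some nodes, even on a clean graph, which is one way the false-positive concern raised under "Load-bearing premise" can arise.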

If this is right

The same homophily discrepancy appears under both subgraph-based and feature-only triggers, so a single detection mechanism covers both families.
Detection followed by robust retraining simultaneously lowers attack success rate and keeps clean accuracy competitive.
The approach does not require prior knowledge of trigger topology or trigger features.
The method operates at the node level and therefore scales to graphs of varying size without retraining the entire model from scratch.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Homophily deviation might serve as a general anomaly detector for other graph manipulations such as label poisoning or structural evasion.
Integrating the reconstruction term directly into the GNN training objective could yield an end-to-end defense that does not require a separate detection stage.
The same local consistency check could be applied to dynamic or temporal graphs to spot drifting or injected nodes over time.

Load-bearing premise

The assumption that the reconstruction loss can separate backdoored nodes from clean ones without generating so many false positives that the later robust training step cannot correct them.

What would settle it

A controlled test in which backdoored nodes are deliberately constructed to retain high feature similarity with their neighbors; if the reconstruction loss then fails to flag them and attack success rate stays high, the claimed universal signature does not hold.

Figures

Figures reproduced from arXiv: 2605.16815 by Chen Chen, Fan Li, Mengting Pan, Xiaoyang Wang.

**Figure 2.** Framework of CoGBD.

… exhibit substantially lower feature-based homophily than clean nodes, reflecting a clear homophily discrepancy between backdoors and clean nodes. For poisoned target nodes, this gap is particularly pronounced under feature-based attacks such as SPEAR, where attribute-level triggers directly disrupt local feature–neighborhood alignment (e.g., on OGB-arxiv, target nodes show an approximat…

**Figure 3.** Sensitivity analysis of 𝛼 and 𝛽.

**Figure 4.** Sensitivity analysis of 𝜆.

… removing certain components remains effective for specific attacks (e.g., “w/o Lnode” on GTA, DPGBA, and SPEAR), as these attacks are more sensitive to neighborhood-level or cross-level inconsistencies, this behavior does not generalize to UGBA. This indicates that jointly modeling node-level, neighborhood-level, and feature-based homophily reconstruction signals is essential fo…

**Figure 5.** Sensitivity analysis of weights: 𝛼 and 𝛽.

**Figure 6.** Sensitivity analysis of 𝜏.

… suspicious nodes, amplifying the impact of false positives and introducing training noise, which again degrades robustness (e.g., 6.07% ASR on UGBA at 𝜏 = 1.0). Overall, moderate values of 𝜏 provide the best balance between robustness and accuracy. In our experiments, 𝜏 ∈ [0.4, 0.6] consistently achieves low ASR while preserving high clean accuracy across different attack settin…

Original abstract

Graph neural networks (GNNs) have achieved remarkable success in relational learning. However, their vulnerability to graph backdoor attacks (GBAs) poses a significant barrier to broader adoption in high-stakes applications. Despite recent advances in graph backdoor defense (GBD), existing methods primarily focus on subgraph-based GBAs, relying on the assumption that poisoned target nodes are explicitly connected to subgraph triggers. Our empirical results reveal that such structure-centric approaches fail to defend against emerging feature-based GBAs that preserve graph topology. Therefore, in this paper, we study a novel problem of universal graph backdoor defense. First, we investigate the shared effects of both attack types from a feature-based homophily perspective, which characterizes local feature consistency between nodes and their neighborhoods. Thorough theoretical and empirical analyses demonstrate that, regardless of trigger mechanisms, backdoors induced by GBAs exhibit lower feature-based homophily than clean nodes, indicating a discrepancy in local feature similarity. Motivated by this insight, we propose to leverage node-level local feature consistency, modeled by a neighbor-aware reconstruction loss, to distinguish backdoors from clean nodes. Then, a robust training strategy is developed to eliminate trigger effects while reducing noise induced by detection uncertainty. Extensive experiments demonstrate that our framework significantly degrades the attack success rate and maintains competitive clean accuracy under both subgraph-based and feature-based attacks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a universal defense against graph backdoor attacks (GBAs) on GNNs, covering both subgraph-based and feature-based triggers. It claims that backdoored nodes exhibit lower feature-based homophily (local feature consistency with neighborhoods) than clean nodes regardless of trigger mechanism, supported by theoretical and empirical analyses. This motivates a neighbor-aware reconstruction loss for distinguishing backdoors, combined with a robust training strategy to mitigate trigger effects and detection noise. Experiments show degraded attack success rates while preserving clean accuracy.

Significance. If the homophily discrepancy holds across trigger types, the work meaningfully extends graph backdoor defense beyond structure-centric methods to address topology-preserving feature-based attacks. The integration of theory-driven insight with a practical detection-plus-robust-training pipeline is a strength, and the focus on a novel universal setting adds value if the core observation proves robust.

major comments (2)

§3 (Theoretical Analysis): The derivation that backdoors exhibit lower feature-based homophily 'regardless of trigger mechanisms' does not appear to address adaptive attackers who optimize trigger features (e.g., via gradient steps or search) to minimize deviation from neighborhood feature statistics while still achieving target misclassification. This is load-bearing for the central claim, as such optimization could close the homophily gap and render the neighbor-aware reconstruction loss ineffective at separation.
Experiments section: No evaluation is reported against adaptive feature-based GBAs explicitly designed to preserve local feature consistency. Without such tests, it remains unclear whether the reconstruction loss and robust training maintain reliable detection under the strongest version of the threat model assumed by the universality claim.

minor comments (2)

Abstract: The phrasing 'thorough theoretical and empirical analyses' could briefly reference the key modeling assumptions (e.g., how homophily is quantified) to improve immediate clarity for readers.
§4.2 (Robust Training): Additional detail on the exact form of the combined loss (weighting between reconstruction and classification terms, or handling of uncertain detections) would aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and detailed review. The comments highlight important considerations for strengthening the universality claim. We address each major comment below and indicate the revisions we will make to the manuscript.

Referee: §3 (Theoretical Analysis): The derivation that backdoors exhibit lower feature-based homophily 'regardless of trigger mechanisms' does not appear to address adaptive attackers who optimize trigger features (e.g., via gradient steps or search) to minimize deviation from neighborhood feature statistics while still achieving target misclassification. This is load-bearing for the central claim, as such optimization could close the homophily gap and render the neighbor-aware reconstruction loss ineffective at separation.

Authors: We agree that the theoretical analysis in Section 3 focuses on the homophily discrepancy arising from standard trigger injection mechanisms and does not explicitly derive bounds under an adaptive attacker who directly optimizes trigger features to minimize deviation from neighborhood statistics. The core derivation relies on the necessity of feature perturbation to achieve misclassification, which inherently introduces some local inconsistency; however, we acknowledge that a fully adaptive optimization could narrow this gap. In the revised manuscript we will add a dedicated paragraph in §3 discussing this adaptive threat model, including a brief analysis showing that perfect preservation of local feature statistics while inducing reliable target misclassification remains constrained by the GNN's message-passing dynamics. We will also note this as a limitation of the current theoretical guarantee. revision: partial

Referee: Experiments section: No evaluation is reported against adaptive feature-based GBAs explicitly designed to preserve local feature consistency. Without such tests, it remains unclear whether the reconstruction loss and robust training maintain reliable detection under the strongest version of the threat model assumed by the universality claim.

Authors: We concur that explicit evaluation against adaptive feature-based attacks is necessary to support the universality claim. In the revised version we will include a new subsection in the Experiments section that evaluates our defense against adaptive feature-based GBAs. These attacks are implemented by performing gradient-based optimization on trigger features to maximize local feature consistency (measured by cosine similarity to neighborhood statistics) subject to maintaining a target attack success rate above 80%. Preliminary results indicate that while detection precision drops modestly compared with non-adaptive cases, the neighbor-aware reconstruction loss combined with robust training still reduces attack success rates by more than 65% on average across the evaluated datasets, with negligible impact on clean accuracy. Full experimental details, hyperparameters, and additional ablation studies will be added. revision: yes
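
The adaptive attack described in this response can be illustrated with a deliberately reduced sketch. It performs plain gradient ascent on the cosine similarity between a trigger feature vector and a neighborhood mean, and it omits the attack-success constraint entirely, since that would require a trained GNN. All names, the step size, and the vectors are hypothetical.

```python
import math

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def cosine(u, v):
    """Cosine similarity; both vectors are assumed to be non-zero."""
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))

def cosine_grad(x, m):
    """Gradient of cos(x, m) with respect to x (closed form)."""
    nx, nm = norm(x), norm(m)
    dot = sum(a * b for a, b in zip(x, m))
    return [mi / (nx * nm) - dot * xi / (nx ** 3 * nm) for xi, mi in zip(x, m)]

def adapt_trigger(x, m, steps=50, lr=0.1):
    """Move trigger features x toward the direction of the neighbor mean m.

    Sketch only: a real adaptive attack would also constrain the update
    so that the backdoor still fires, which is not modeled here.
    """
    x = list(x)
    for _ in range(steps):
        g = cosine_grad(x, m)
        x = [xi + lr * gi for xi, gi in zip(x, g)]
    return x

trigger = [0.1, 0.0, 1.0]          # hypothetical trigger features
neighbor_mean = [0.95, 1.0, 0.05]  # hypothetical neighborhood statistic

before = cosine(trigger, neighbor_mean)
after = cosine(adapt_trigger(trigger, neighbor_mean), neighbor_mean)
print(round(before, 3), round(after, 3))
```

Without the constraint, the similarity can be pushed arbitrarily close to 1, at which point the trigger carries no signal. The open question raised by the referee is how much similarity an attacker can recover while the backdoor remains effective.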

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

The paper presents the lower feature-based homophily property as the output of separate theoretical and empirical analyses on both subgraph-based and feature-based attacks. This observation then motivates the design of the neighbor-aware reconstruction loss and robust training strategy. No equations or claims reduce the central result to a fitted parameter, self-citation chain, or definitional equivalence. Experiments are described as providing independent validation on attack success rate and clean accuracy, satisfying the criteria for a non-circular, externally falsifiable derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on abstract: the approach rests on the empirical observation of homophily discrepancy and the modeling choice of reconstruction loss; no explicit free parameters, axioms, or invented entities are named in the provided text.

pith-pipeline@v0.9.0 · 5773 in / 1198 out tokens · 47549 ms · 2026-05-19T21:07:25.367222+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Theorem 1. ... E_{v∼VB}[Hfeat(v)] < E_{v∼VC}[Hfeat(v)]. ... feature-based homophily of backdoored nodes is lower than that of clean nodes

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative · echoes

echoes: This paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

Lrec = ∥X′T − X̂∥²_F + α∥M − M̂∥²_F + β∥X′T − M̂∥²_F ... neighbor-aware reconstruction loss
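
The quoted loss is a weighted sum of three squared Frobenius norms, which can be evaluated directly. The sketch below follows the formula as printed; what each matrix represents is not defined on this page, so the argument names (`x_t` for X′T, `x_hat`, `m`, `m_hat`) and the toy values are assumptions.

```python
def frobenius_sq(a, b):
    """Squared Frobenius norm of (a - b) for two same-shape nested lists."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def reconstruction_loss(x_t, x_hat, m, m_hat, alpha, beta):
    """Three-term loss in the shape of the quoted L_rec.

    term 1: x_t vs x_hat; term 2: m vs m_hat (weight alpha);
    term 3: x_t vs m_hat (weight beta).
    """
    return (
        frobenius_sq(x_t, x_hat)
        + alpha * frobenius_sq(m, m_hat)
        + beta * frobenius_sq(x_t, m_hat)
    )

# Toy 2x2 matrices chosen so each term is easy to check by hand.
x_t = [[1.0, 0.0], [0.0, 1.0]]
x_hat = [[1.0, 0.0], [0.0, 0.0]]
m = [[0.5, 0.5], [0.5, 0.5]]
m_hat = [[0.5, 0.5], [0.0, 0.5]]

loss = reconstruction_loss(x_t, x_hat, m, m_hat, alpha=0.5, beta=2.0)
print(loss)
```

With these values the three terms are 1.0, 0.5 × 0.25, and 2.0 × 0.75, so the weights 𝛼 and 𝛽 directly set how much the neighborhood and cross terms contribute relative to the first term.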

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

55 extracted references · 55 canonical work pages · 2 internal anchors

[1] Peter W Battaglia, Jessica B Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, et al. 2018. Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261 (2018).
[2] Pietro Bongini, Monica Bianchini, and Franco Scarselli. 2021. Molecular generative graph neural networks for drug discovery. Neurocomputing 450 (2021), 242–252.
[3] Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. 2017. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526 (2017).
[4, 5] Yang Chen, Zhonglin Ye, Haixing Zhao, Ying Wang, and Subrata Kumar Sarker. Feature-Based Graph Backdoor Attack in the Node Classification Task. Int. J. Intell. Syst. 2023 (Jan. 2023), 13 pages.
[6] Enyan Dai, Minhua Lin, Xiang Zhang, and Suhang Wang. 2023. Unnoticeable backdoor attacks on graph neural networks. In Proceedings of the ACM Web Conference 2023. 2263–2273.
[7] Kaize Ding, Jundong Li, Rohit Bhanushali, and Huan Liu. 2019. Deep Anomaly Detection on Attributed Networks. In Proceedings of the 2019 SIAM International Conference on Data Mining (SDM). 594–602.
[8] Yuanhao Ding, Yang Liu, Yugang Ji, Weigao Wen, Qing He, and Xiang Ao. 2025. SPEAR: A Structure-Preserving Manipulation Method for Graph Backdoor Attacks. In Proceedings of the ACM on Web Conference 2025 (WWW ’25). 1237–1247.
[9, 10] Wenqi Fan, Yao Ma, Qing Li, Yuan He, Eric Zhao, Jiliang Tang, and Dawei Yin. Graph neural networks for social recommendation. In The world wide web conference. 417–426.
[11] Tianyu Gu, Kang Liu, Brendan Dolan-Gavitt, and Siddharth Garg. 2019. BadNets: Evaluating Backdooring Attacks on Deep Neural Networks. IEEE Access 7 (2019), 47230–47244.
[12] Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. Advances in neural information processing systems 30 (2017).
[13] Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open graph benchmark: Datasets for machine learning on graphs. Advances in neural information processing systems 33 (2020), 22118–22133.
[14] Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net.
[15] Sanjay Kumar, Abhishek Mallik, Anavi Khetarpal, and B.S. Panda. 2022. Influence maximization in social networks using graph embedding and graph neural network. Information Sciences 607 (2022), 1617–1636.
[16] Fan Li, Xiaoyang Wang, Dawei Cheng, Wenjie Zhang, Chen Chen, Ying Zhang, and Xuemin Lin. 2025. Tcgu: Data-centric graph unlearning based on transferable condensation. IEEE Transactions on Knowledge and Data Engineering 38, 2 (2025), 1334–1348.
[17] Fan Li, Zhiyu Xu, Dawei Cheng, and Xiaoyang Wang. 2024. AdaRisk: risk-adaptive deep reinforcement learning for vulnerable nodes detection. IEEE Transactions on Knowledge and Data Engineering 36, 11 (2024), 5576–5590.
[18] Jiangtong Li, Dungy Liu, Dawei Cheng, and Changchun Jiang. 2024. Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool. arXiv preprint arXiv:2412.17213 (2024).
[19] Yiming Li, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. 2022. Backdoor learning: A survey. IEEE transactions on neural networks and learning systems 35, 1 (2022), 5–22.
[20] Yige Li, Xixiang Lyu, Nodens Koren, Lingjuan Lyu, Bo Li, and Xingjun Ma. 2021. Anti-backdoor learning: Training clean models on poisoned data. Advances in Neural Information Processing Systems 34 (2021), 14900–14912.
[21] Yixin Liu, Yizhen Zheng, Daokun Zhang, Vincent CS Lee, and Shirui Pan. 2023. Beyond smoothing: Unsupervised graph representation learning with edge heterophily discriminating. In Proceedings of the AAAI conference on artificial intelligence, Vol. 37. 4516–4524.
[22, 23] Fanchao Qi, Yangyi Chen, Mukai Li, Yuan Yao, Zhiyuan Liu, and Maosong Sun. Onion: A simple and effective defense against textual backdoor attacks. In Proceedings of the 2021 conference on empirical methods in natural language processing. 9558–9566.
[24] Fanchao Qi, Mukai Li, Yangyi Chen, Zhengyan Zhang, Zhiyuan Liu, Yasheng Wang, and Maosong Sun. 2021. Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger. In Annual Meeting of the Association for Computational Linguistics. https://api.semanticscholar.org/CorpusID:235196099
[25] Pedro Quesado, Luis HM Torres, Bernardete Ribeiro, and Joel P Arrais. 2024. A hybrid gnn approach for improved molecular property prediction. Journal of Computational Biology 31, 11 (2024), 1146–1157.
[26] Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi-Rad. 2008. Collective Classification in Network Data. AI Magazine 29, 3 (Sep. 2008), 93.
[27] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2018. Graph attention networks. In ICLR.
[28] Binghui Wang, Jinyuan Jia, Xiaoyu Cao, and Neil Zhenqiang Gong. 2021. Certified robustness of graph neural networks against adversarial structural perturbation. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 1645–1653.
[29] Kaiyang Wang, Huaxin Deng, Yijia Xu, Zhonglin Liu, and Yong Fang. 2024. Multi-target label backdoor attacks on graph neural networks. Pattern Recognition 152 (2024), 110449.
[31] Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S Yu. 2020. A comprehensive survey on graph neural networks. IEEE transactions on neural networks and learning systems 32, 1 (2020), 4–24.
[32] Zhaohan Xi, Ren Pang, Shouling Ji, and Ting Wang. 2021. Graph backdoor. In 30th USENIX security symposium (USENIX Security 21). 1523–1540.
[33] Hui Xia, Xiangwei Zhao, Rui Zhang, Shuo Xu, and Luming Wang. 2025. Clean-label graph backdoor attack in the node classification task. In Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence and Thirty-Seventh Conference on Innovative Applications of Artificial Intelligence and Fifteenth Symposium on Educational Advances in Artifici...
[34] Jing Xu and Stjepan Picek. 2022. Poster: Clean-label Backdoor Attack on Graph Neural Networks. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security (Los Angeles, CA, USA) (CCS ’22). Association for Computing Machinery, New York, NY, USA, 3491–3493.
[35] Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, and Viktor Prasanna. 2019. Graphsaint: Graph sampling based inductive learning method. arXiv preprint arXiv:1907.04931 (2019).
[36] Xiang Zhang and Marinka Zitnik. 2020. Gnnguard: Defending graph neural networks against adversarial attacks. Advances in neural information processing systems 33 (2020), 9263–9275.
[37] Zaixi Zhang, Jinyuan Jia, Binghui Wang, and Neil Zhenqiang Gong. 2021. Backdoor attacks to graph neural networks. In Proceedings of the 26th ACM symposium on access control models and technologies. 15–26.
[38] Zhiwei Zhang, Minhua Lin, Enyan Dai, and Suhang Wang. 2024. Rethinking graph backdoor attacks: A distribution-preserving perspective. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 4386–4397.
[39, 40] Zhiwei Zhang, Minhua Lin, Junjie Xu, Zongyu Wu, Enyan Dai, and Suhang Wang. Robustness Inspired Graph Backdoor Defense. In International Conference on Learning Representations, Y. Yue, A. Garg, N. Peng, F. Sha, and R. Yu (Eds.), Vol. 2025. 1958–1984.
[41, 42] Haibin Zheng, Haiyang Xiong, Jinyin Chen, Haonan Ma, and Guohan Huang. Motif-Backdoor: Rethinking the Backdoor Attack on Graph Neural Networks via Motifs. IEEE Transactions on Computational Social Systems 11, 2 (2024), 2479–2493.
[43] Jie Zhou, Ganqu Cui, Shengding Hu, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, and Maosong Sun. 2020. Graph neural networks: A review of methods and applications. AI open 1 (2020), 57–81.
[44] Dingyuan Zhu, Ziwei Zhang, Peng Cui, and Wenwu Zhu. 2019. Robust Graph Convolutional Networks Against Adversarial Attacks. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’19). 1399–1407.
[45] Xiaoqian Zhu, Xiang Ao, Zidi Qin, Yanpeng Chang, Yang Liu, Qing He, and Jianping Li. 2021. Intelligent financial fraud detection practices in post-pandemic era. The Innovation 2, 4 (2021), 100176.

Appendix A Detailed Proofs

Setup. For analytical clarity, we consider an 𝐿-layer linear GNN with normalized adjacency matrix Ā. Let H^(𝑙) denote the nod...

GTA. GTA is an early graph backdoor attack that introduces adaptive, sample-specific subgraph triggers via a trigger generator optimized to minimize the backdoor attack loss.

UGBA. UGBA improves attack efficiency by selecting representative target nodes through clustering. It further employs a similarity-constrained trigger generator that enforces feature similarity between trigger nodes and their attached target nodes, enhancing attack stealthiness.

DPGBA. DPGBA advances subgraph-based attacks by generating in-distribution triggers via adversarial learning, making trigger nodes harder to distinguish from clean ones.

SPEAR. SPEAR first identifies critical feature dimensions via a global importance-driven selection strategy, and then injects crafted feature-level triggers to maximize the attack success rate while preserving the original graph topology.

C.2 Defense Methods

We select three representative defense methods that are specifically designed for graph backdoor a...

Prune. Prune removes edges that connect low-similarity node pairs, based on the assumption that such edges are more likely to be introduced by subgraph triggers.

OD. OD employs a commonly used outlier detector, DOMINANT [6], to identify out-of-distribution nodes and removes the edges associated with detected anomalies.

RIGBD. RIGBD first identifies poisoned target nodes by computing prediction variance over 𝐾 inference runs. It then estimates the target label and suppresses the confidence of suspicious nodes toward the predicted target class to mitigate the backdoor effect. Following [36], we also include a strong baseline that aims to learn a clean model directly from...

ABL. ABL is motivated by the observation that backdoor patterns are learned significantly faster than clean patterns during training, and that stronger attacks lead to faster convergence on poisoned data. Based on this, ABL proposes a two-stage anti-backdoor learning scheme that employs local gradient ascent (LGA) to first isolate backdoor samples at an ea...

RS. RS was originally proposed to defend against adversarial structural perturbations. The core idea is to construct a smoothed classifier by randomly dropping edges and aggregating predictions over multiple randomized graph instances. Following [36], we adopt this method as a baseline and set the edge drop ratio to be 0.5 to balance defense effective...

GNNGuard. GNNGuard defends against adversarial structural perturbations by leveraging node similarity to reweight and prune edges. By dynamically adjusting edge importance during message passing, it suppresses the influence of adversarial connections and enables more robust propagation.

RobustGCN. RobustGCN improves the robustness of GCNs against adversarial attacks by modeling node representations as Gaussian distributions rather than deterministic vectors. Adversarial perturbations are absorbed into the variances of these distributions, thereby reducing their impact on the learned representations. In addition, RobustGCN introduces...
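
As an illustration of the Prune baseline described above, the following sketch drops every edge whose endpoints have cosine similarity below a threshold. The similarity measure, the threshold value, and the function names are assumptions for this example; the baseline's actual settings are not stated on this page.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def prune_low_similarity_edges(edges, features, threshold):
    """Keep only edges whose endpoint features are at least `threshold` similar."""
    return [
        (u, v) for (u, v) in edges
        if cosine(features[u], features[v]) >= threshold
    ]

# Toy graph: node 3 plays the role of an attached trigger node whose
# features differ from the rest of the graph.
features = {
    0: [1.0, 0.9, 0.0],
    1: [0.9, 1.0, 0.1],
    2: [1.0, 1.0, 0.0],
    3: [0.1, 0.0, 1.0],
}
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]

kept = prune_low_similarity_edges(edges, features, threshold=0.5)
print(kept)
```

Because this rule only removes edges, it has nothing to act on when a feature-based attack leaves the topology untouched and the poisoned node's original edges are all that exist, which is consistent with the page's point that structure-centric defenses struggle in that setting.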

[1] [1]

Peter W Battaglia, Jessica B Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, et al. 2018. Relational inductive biases, deep learning, and graph networks.arXiv preprint arXiv:1806.01261(2018)

work page internal anchor Pith review Pith/arXiv arXiv 2018

[2] [2]

Pietro Bongini, Monica Bianchini, and Franco Scarselli. 2021. Molecular gen- erative graph neural networks for drug discovery.Neurocomputing450 (2021), 242–252

work page 2021

[3] [3]

Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song. 2017. Targeted backdoor attacks on deep learning systems using data poisoning.arXiv preprint arXiv:1712.05526(2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017

[4] [4]

Yang Chen, Zhonglin Ye, Haixing Zhao, Ying Wang, and Subrata Kumar Sarker

work page

[5] [5]

Feature-Based Graph Backdoor Attack in the Node Classification Task.Int. J. Intell. Syst.2023 (Jan. 2023), 13 pages

work page 2023

[6] [6]

Enyan Dai, Minhua Lin, Xiang Zhang, and Suhang Wang. 2023. Unnoticeable backdoor attacks on graph neural networks. InProceedings of the ACM Web Conference 2023. 2263–2273

work page 2023

[7] [7]

Kaize Ding, Jundong Li, Rohit Bhanushali, and Huan Liu. 2019. Deep Anomaly Detection on Attributed Networks. InProceedings of the 2019 SIAM International Conference on Data Mining (SDM). 594–602

work page 2019

[8] [8]

Yuanhao Ding, Yang Liu, Yugang Ji, Weigao Wen, Qing He, and Xiang Ao. 2025. SPEAR: A Structure-Preserving Manipulation Method for Graph Backdoor At- tacks. InProceedings of the ACM on Web Conference 2025 (WWW ’25). 1237–1247

work page 2025

[9] [9]

Wenqi Fan, Yao Ma, Qing Li, Yuan He, Eric Zhao, Jiliang Tang, and Dawei Yin

work page

[10] [10]

InThe world wide web conference

Graph neural networks for social recommendation. InThe world wide web conference. 417–426

work page

[11] [11]

Tianyu Gu, Kang Liu, Brendan Dolan-Gavitt, and Siddharth Garg. 2019. BadNets: Evaluating Backdooring Attacks on Deep Neural Networks.IEEE Access7 (2019), 47230–47244

work page 2019

[12] [12]

Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs.Advances in neural information processing systems30 (2017)

work page 2017

[13] [13]

Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. 2020. Open graph benchmark: Datasets for machine learning on graphs.Advances in neural information processing systems 33 (2020), 22118–22133

Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net

Sanjay Kumar, Abhishek Mallik, Anavi Khetarpal, and B.S. Panda. 2022. Influence maximization in social networks using graph embedding and graph neural network. Information Sciences 607 (2022), 1617–1636

Fan Li, Xiaoyang Wang, Dawei Cheng, Wenjie Zhang, Chen Chen, Ying Zhang, and Xuemin Lin. 2025. TCGU: Data-centric graph unlearning based on transferable condensation. IEEE Transactions on Knowledge and Data Engineering 38, 2 (2025), 1334–1348

Fan Li, Zhiyu Xu, Dawei Cheng, and Xiaoyang Wang. 2024. AdaRisk: risk-adaptive deep reinforcement learning for vulnerable nodes detection. IEEE Transactions on Knowledge and Data Engineering 36, 11 (2024), 5576–5590

Jiangtong Li, Dungy Liu, Dawei Cheng, and Changchun Jiang. 2024. Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool. arXiv preprint arXiv:2412.17213 (2024)

Yiming Li, Yong Jiang, Zhifeng Li, and Shu-Tao Xia. 2022. Backdoor learning: A survey. IEEE transactions on neural networks and learning systems 35, 1 (2022), 5–22

Yige Li, Xixiang Lyu, Nodens Koren, Lingjuan Lyu, Bo Li, and Xingjun Ma. 2021. Anti-backdoor learning: Training clean models on poisoned data. Advances in Neural Information Processing Systems 34 (2021), 14900–14912

Yixin Liu, Yizhen Zheng, Daokun Zhang, Vincent CS Lee, and Shirui Pan. 2023. Beyond smoothing: Unsupervised graph representation learning with edge heterophily discriminating. In Proceedings of the AAAI conference on artificial intelligence, Vol. 37. 4516–4524

Fanchao Qi, Yangyi Chen, Mukai Li, Yuan Yao, Zhiyuan Liu, and Maosong Sun. 2021. Onion: A simple and effective defense against textual backdoor attacks. In Proceedings of the 2021 conference on empirical methods in natural language processing. 9558–9566

Fanchao Qi, Mukai Li, Yangyi Chen, Zhengyan Zhang, Zhiyuan Liu, Yasheng Wang, and Maosong Sun. 2021. Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger. In Annual Meeting of the Association for Computational Linguistics. https://api.semanticscholar.org/CorpusID:235196099

Pedro Quesado, Luis HM Torres, Bernardete Ribeiro, and Joel P Arrais. 2024. A hybrid gnn approach for improved molecular property prediction. Journal of Computational Biology 31, 11 (2024), 1146–1157

Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Gallagher, and Tina Eliassi-Rad. 2008. Collective Classification in Network Data. AI Magazine 29, 3 (Sep. 2008), 93

Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2018. Graph attention networks. In ICLR

Binghui Wang, Jinyuan Jia, Xiaoyu Cao, and Neil Zhenqiang Gong. 2021. Certified robustness of graph neural networks against adversarial structural perturbation. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 1645–1653

Kaiyang Wang, Huaxin Deng, Yijia Xu, Zhonglin Liu, and Yong Fang. 2024. Multi-target label backdoor attacks on graph neural networks. Pattern Recognition 152 (2024), 110449

Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S Yu. 2020. A comprehensive survey on graph neural networks. IEEE transactions on neural networks and learning systems 32, 1 (2020), 4–24

Zhaohan Xi, Ren Pang, Shouling Ji, and Ting Wang. 2021. Graph backdoor. In 30th USENIX security symposium (USENIX Security 21). 1523–1540

Hui Xia, Xiangwei Zhao, Rui Zhang, Shuo Xu, and Luming Wang. 2025. Clean-label graph backdoor attack in the node classification task. In Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence and Thirty-Seventh Conference on Innovative Applications of Artificial Intelligence and Fifteenth Symposium on Educational Advances in Artifici...

Jing Xu and Stjepan Picek. 2022. Poster: Clean-label Backdoor Attack on Graph Neural Networks. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security (Los Angeles, CA, USA) (CCS ’22). Association for Computing Machinery, New York, NY, USA, 3491–3493

Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, and Viktor Prasanna. 2019. Graphsaint: Graph sampling based inductive learning method. arXiv preprint arXiv:1907.04931 (2019)

Xiang Zhang and Marinka Zitnik. 2020. Gnnguard: Defending graph neural networks against adversarial attacks. Advances in neural information processing systems 33 (2020), 9263–9275

Zaixi Zhang, Jinyuan Jia, Binghui Wang, and Neil Zhenqiang Gong. 2021. Backdoor attacks to graph neural networks. In Proceedings of the 26th ACM symposium on access control models and technologies. 15–26

Zhiwei Zhang, Minhua Lin, Enyan Dai, and Suhang Wang. 2024. Rethinking graph backdoor attacks: A distribution-preserving perspective. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 4386–4397

Zhiwei Zhang, Minhua Lin, Junjie Xu, Zongyu Wu, Enyan Dai, and Suhang Wang. 2025. Robustness Inspired Graph Backdoor Defense. In International Conference on Learning Representations, Y. Yue, A. Garg, N. Peng, F. Sha, and R. Yu (Eds.), Vol. 2025. 1958–1984

Haibin Zheng, Haiyang Xiong, Jinyin Chen, Haonan Ma, and Guohan Huang. 2024. Motif-Backdoor: Rethinking the Backdoor Attack on Graph Neural Networks via Motifs. IEEE Transactions on Computational Social Systems 11, 2 (2024), 2479–2493

Jie Zhou, Ganqu Cui, Shengding Hu, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, and Maosong Sun. 2020. Graph neural networks: A review of methods and applications. AI Open 1 (2020), 57–81

Dingyuan Zhu, Ziwei Zhang, Peng Cui, and Wenwu Zhu. 2019. Robust Graph Convolutional Networks Against Adversarial Attacks. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’19). 1399–1407

Xiaoqian Zhu, Xiang Ao, Zidi Qin, Yanpeng Chang, Yang Liu, Qing He, and Jianping Li. 2021. Intelligent financial fraud detection practices in post-pandemic era. The Innovation 2, 4 (2021), 100176

Appendix A Detailed Proofs

Setup. For analytical clarity, we consider an $L$-layer linear GNN with normalized adjacency matrix $\bar{A}$. Let $H^{(l)}$ denote the nod...

GTA. GTA is an early graph backdoor attack that introduces adaptive, sample-specific subgraph triggers via a trigger generator optimized to minimize the backdoor attack loss.

UGBA. UGBA improves attack efficiency by selecting representative target nodes through clustering. It further employs a similarity-constrained trigger generator that enforces feature similarity between trigger nodes and their attached target nodes, enhancing attack stealthiness.

DPGBA. DPGBA advances subgraph-based attacks by generating in-distribution triggers via adversarial learning, making trigger nodes harder to distinguish from clean ones.

SPEAR. SPEAR first identifies critical feature dimensions via a global importance-driven selection strategy, and then injects crafted feature-level triggers to maximize the attack success rate while preserving the original graph topology.

C.2 Defense Methods

We select three representative defense methods that are specifically designed for graph backdoor a...

Prune. Prune removes edges that connect low-similarity node pairs, based on the assumption that such edges are more likely to have been introduced by subgraph triggers.
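The edge-filtering idea behind Prune can be sketched as follows. This is a minimal illustration, not the baseline's actual implementation: the use of cosine similarity, the threshold value, and the helper name `prune_low_similarity_edges` are all assumptions made for the example.

```python
import numpy as np

def prune_low_similarity_edges(features, edges, threshold=0.2):
    """Keep edges whose endpoint features have cosine similarity >= threshold.

    Cosine similarity and the threshold are illustrative assumptions.
    """
    # Normalize each feature vector, guarding against all-zero rows.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    # Row-wise dot product between the two endpoints of every edge.
    sims = np.einsum("ij,ij->i", unit[edges[:, 0]], unit[edges[:, 1]])
    keep = sims >= threshold
    return edges[keep], sims

# Toy graph: nodes 0 and 1 have similar features, node 2 is dissimilar.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
edges = np.array([[0, 1], [1, 2], [0, 2]])
kept_edges, sims = prune_low_similarity_edges(features, edges, threshold=0.5)
```

In this toy graph only the edge between the two similar nodes survives. A feature-level trigger that leaves the topology untouched gives such a filter no suspicious edge to remove.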

OD. OD employs a commonly used outlier detector, DOMINANT [6], to identify out-of-distribution nodes and removes the edges associated with detected anomalies.

RIGBD. RIGBD first identifies poisoned target nodes by computing prediction variance over 𝐾 inference runs. It then estimates the target label and suppresses the confidence of suspicious nodes toward the predicted target class to mitigate the backdoor effect.

Following [36], we also include a strong baseline that aims to learn a clean model directly from...

ABL. ABL is motivated by the observation that backdoor patterns are learned significantly faster than clean patterns during training, and that stronger attacks lead to faster convergence on poisoned data. Based on this, ABL proposes a two-stage anti-backdoor learning scheme that employs local gradient ascent (LGA) to first isolate backdoor samples at an ea...

RS. RS was originally proposed to defend against adversarial structural perturbations. The core idea is to construct a smoothed classifier by randomly dropping edges and aggregating predictions over multiple randomized graph instances. Following [36], we adopt this method as a baseline and set the edge drop ratio to 0.5 to balance defense effective...
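The edge-dropping and vote-aggregation steps described for RS can be sketched as below. This is a hedged illustration only: `toy_predict` is a hypothetical stand-in for a trained GNN, and the majority vote, the number of samples, and the function names are assumptions rather than the baseline's actual code.

```python
import numpy as np

def toy_predict(features, edges):
    """Hypothetical stand-in for a trained GNN: mean-aggregate, then argmax."""
    agg = features.copy()
    deg = np.ones(len(features))  # each node counts itself once
    for u, v in edges:
        agg[u] += features[v]
        agg[v] += features[u]
        deg[u] += 1
        deg[v] += 1
    return (agg / deg[:, None]).argmax(axis=1)

def smoothed_predict(predict_fn, features, edges, num_classes,
                     drop_ratio=0.5, num_samples=20, seed=0):
    """Majority vote over predictions on randomly edge-dropped graphs."""
    rng = np.random.default_rng(seed)
    votes = np.zeros((features.shape[0], num_classes), dtype=int)
    for _ in range(num_samples):
        # Each edge is dropped independently with probability drop_ratio.
        keep = rng.random(len(edges)) >= drop_ratio
        preds = predict_fn(features, edges[keep])
        votes[np.arange(len(preds)), preds] += 1
    return votes.argmax(axis=1), votes

features = np.array([[3.0, 0.0], [3.0, 0.0], [0.0, 1.0]])
edges = np.array([[0, 1], [1, 2], [0, 2]])
labels, votes = smoothed_predict(toy_predict, features, edges, num_classes=2)
```

Because the randomization acts only on edges, a smoothed classifier of this kind targets structural perturbations; it does not alter node features.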

GNNGuard. GNNGuard defends against adversarial structural perturbations by leveraging node similarity to reweight and prune edges. By dynamically adjusting edge importance during message passing, it suppresses the influence of adversarial connections and enables more robust propagation.
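A simplified view of similarity-based edge reweighting is sketched below. It is not GNNGuard's actual procedure: it assumes cosine similarity on raw input features rather than on the representations at each message-passing layer, and the pruning threshold and the name `similarity_edge_weights` are hypothetical.

```python
import numpy as np

def similarity_edge_weights(features, edges, prune_threshold=0.1):
    """Weight each directed edge (u, v) by endpoint similarity.

    Edges below prune_threshold get weight 0; the remaining weights are
    normalized over the outgoing edges of each source node u.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sims = np.einsum("ij,ij->i", unit[edges[:, 0]], unit[edges[:, 1]])
    sims = np.where(sims < prune_threshold, 0.0, sims)  # prune weak edges
    # Sum of surviving similarities per source node.
    totals = np.zeros(features.shape[0])
    np.add.at(totals, edges[:, 0], sims)
    denom = totals[edges[:, 0]]
    # Nodes whose edges were all pruned keep zero weights.
    return np.divide(sims, denom, out=np.zeros_like(sims), where=denom > 0)

features = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
edges = np.array([[0, 1], [0, 2], [1, 0], [2, 0]])
weights = similarity_edge_weights(features, edges)
```

Here the edges touching the dissimilar node 2 receive zero weight, so it would contribute nothing to its neighbors during aggregation.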

RobustGCN. RobustGCN improves the robustness of GCNs against adversarial attacks by modeling node representations as Gaussian distributions rather than deterministic vectors. Adversarial perturbations are absorbed into the variances of these distributions, thereby reducing their impact on the learned representations. In addition, RobustGCN introduces...
