Certifiable Robustness and Robust Training for Graph Convolutional Networks

Daniel Z\"ugner; Stephan G\"unnemann

arxiv: 1906.12269 · v1 · pith:75JZ7XEMnew · submitted 2019-06-28 · 💻 cs.LG · cs.CR· stat.ML

Certifiable Robustness and Robust Training for Graph Convolutional Networks

Daniel Z\"ugner , Stephan G\"unnemann This is my paper

Pith reviewed 2026-05-25 13:20 UTC · model grok-4.3

classification 💻 cs.LG cs.CRstat.ML

keywords graph convolutional networkscertifiable robustnessadversarial robustnessnode attribute perturbationsL0 bounded attacksrobust trainingsemi-supervised learning

0 comments

The pith

A certification method determines whether graph convolutional network predictions stay fixed under any L0-bounded flips of binary node attributes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents the first procedure that certifies both robustness and non-robustness of GCN outputs against limited changes to binary node features. Certification means a node's predicted label is guaranteed to remain the same no matter which allowed set of attribute flips occurs. The same procedure identifies nodes that can be forced to change label by some allowed perturbation. A joint training method for labeled and unlabeled nodes is introduced that raises the fraction of certified-robust nodes while keeping ordinary accuracy nearly unchanged.

Core claim

We propose the first method for certifiable (non-)robustness of graph convolutional networks with respect to perturbations of the node attributes. We consider the case of binary node attributes (e.g. bag-of-words) and perturbations that are L0-bounded. If a node has been certified with our method, it is guaranteed to be robust under any possible perturbation given the attack model. Likewise, we can certify non-robustness. Finally, we propose a robust semi-supervised training procedure that treats the labeled and unlabeled nodes jointly.

What carries the argument

Certification procedure that computes tight bounds on the possible outputs of GCN message-passing layers over every admissible L0-bounded set of binary attribute flips.

If this is right

Any node certified robust keeps its predicted label against every allowed perturbation.
Nodes certified non-robust are known to have at least one successful attack inside the model.
The robust training step raises the share of certified nodes on both labeled and unlabeled data.
Certification and training apply directly to the semi-supervised setting without separate handling of unlabeled nodes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same bounding technique could be adapted to certify robustness against edge additions or deletions if suitable interval arithmetic for adjacency changes is derived.
Certified-robust GCNs could be used as a building block inside larger pipelines that require guaranteed behavior under feature noise.
The certification might reveal systematic vulnerabilities in particular layers or aggregation functions that are not visible from ordinary accuracy numbers.

Load-bearing premise

The bound computation never underestimates the range of outputs reachable by any allowed attribute flips and never includes outputs that no actual flip sequence can produce.

What would settle it

Discovery of an L0-bounded attribute flip sequence that changes the label of a node the procedure had certified as robust.

Figures

Figures reproduced from arXiv: 1906.12269 by Daniel Z\"ugner, Stephan G\"unnemann.

**Figure 1.** Figure 1: Certificates for a GNN trained with standard training on Cora-ML. 0.0 0.2 0.4 0.6 0.8 1.0 Nb. purity 0 25 50 75 100 Avg. Max Q robust Mean 95% CI [PITH_FULL_IMAGE:figures/full_fig_p007_1.png] view at source ↗

**Figure 4.** Figure 4: Robust training (Cora-ML). Dashed lines are w/o robust training. 0 12 50 100 Certificate w.r.t Q 0 50 100% Nodes Certifiably robust Certifiably nonrobust [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

**Figure 7.** Figure 7: Difference of Primal and Dual Bound. Tightness of lower bounds: Next, we aim to analyze how tight our dual lower bounds are, which we needed to obtain efficient certification. For this, we analyze (i) the value of дq,Q (·) we obtain from our dual solution (either when optimizing over Ω are using the default value), compared to (ii) the value of the primal solution we obtain using our construction from Se… view at source ↗

**Figure 8.** Figure 8: Training dynamics. 0 12 24 48 0 50 100 % Nodes cert. robust CE RH-U Q = 12 RH-U Q = 24 RH-U Q = 48 [PITH_FULL_IMAGE:figures/full_fig_p009_8.png] view at source ↗

**Figure 9.** Figure 9: Robust training, diff. Q the classification accuracy. E.g., training accuracy drops from 99% to 87% when going from Q = 12 to 48, while test accuracy stays almost unchanged (68% vs. 66%) on Citeseer. We attribute this to the fact that the GNN still uses the normal CE loss in addition to our robust hinge loss during training [PITH_FULL_IMAGE:figures/full_fig_p009_9.png] view at source ↗

read the original abstract

Recent works show that Graph Neural Networks (GNNs) are highly non-robust with respect to adversarial attacks on both the graph structure and the node attributes, making their outcomes unreliable. We propose the first method for certifiable (non-)robustness of graph convolutional networks with respect to perturbations of the node attributes. We consider the case of binary node attributes (e.g. bag-of-words) and perturbations that are L_0-bounded. If a node has been certified with our method, it is guaranteed to be robust under any possible perturbation given the attack model. Likewise, we can certify non-robustness. Finally, we propose a robust semi-supervised training procedure that treats the labeled and unlabeled nodes jointly. As shown in our experimental evaluation, our method significantly improves the robustness of the GNN with only minimal effect on the predictive accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper introduces the first certification procedure for GCN robustness to L0-bounded binary node attribute flips, plus a joint robust training step for labeled and unlabeled nodes.

read the letter

The main point is a certification method that can prove a GCN node is robust (or not) to any allowed perturbation of its binary features. The paper also adds a training procedure that improves robustness while keeping accuracy nearly the same, treating labeled and unlabeled nodes together in the semi-supervised setting. This moves past the many empirical attack papers on GNNs to something with explicit guarantees under a concrete attack model. The claim of being first for this setting on GCNs looks accurate from the abstract and prior literature referenced. The joint training idea fits the data regime well and avoids treating robustness as a post-hoc fix. Nothing in the description creates an obvious internal contradiction or circularity. The central argument stands if the bound propagation through the layers is sound, and the stress-test note finds no visible flaw in the stated procedure. One limitation is that the abstract supplies no derivation or example bound computation, so it is impossible to judge how tight the certificates are or how much over-approximation occurs during aggregation. Experiments would need to show that a useful fraction of nodes receive certificates and that the method runs at reasonable scale. This work is aimed at researchers focused on verifiable robustness in graph models rather than pure empirical attacks. A reader in that niche would find the certification technique and the training extension directly usable. It deserves a serious referee because it adds a new capability with a clear problem statement and a concrete procedure, even if the details require verification.

Referee Report

0 major / 2 minor

Summary. The manuscript proposes the first method for certifying (non-)robustness of graph convolutional networks (GCNs) to L0-bounded perturbations of binary node attributes. It claims that certified nodes are guaranteed robust (or non-robust) under the attack model, introduces a robust semi-supervised training procedure that jointly optimizes over labeled and unlabeled nodes, and reports experimental results showing substantially improved robustness with only minimal loss in predictive accuracy.

Significance. If the certification procedure is sound and the bounds are correctly computed through the GCN layers, the work would provide the first formal robustness guarantees for GCNs under attribute perturbations, moving beyond purely empirical defenses. The joint labeled/unlabeled treatment in training and the ability to certify non-robustness are practically useful. The experimental evaluation, if reproducible, would support the claim of improved robustness.

minor comments (2)

The abstract states the central claims at a high level without referencing the specific certification procedure or bound-propagation steps; a brief pointer to the relevant section or algorithm would improve readability for readers scanning the front matter.
Notation for the attack model (L0 bound on binary flips) and the message-passing aggregation should be introduced consistently in the first section that defines the GCN forward pass to avoid later ambiguity.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of our work, recognition of its novelty as the first certification approach for GCN robustness under L0-bounded attribute perturbations, and recommendation for minor revision. We are pleased that the joint labeled/unlabeled training procedure and ability to certify non-robustness were viewed as practically useful.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper presents a new certification procedure for L0-bounded perturbations on binary node attributes in GCNs, along with a robust training method. The core claim is that the procedure yields sound guarantees by correctly propagating bounds through message-passing and aggregation steps. No equations, fitted parameters, or self-citations are shown to reduce the certification result to a quantity defined by the authors' prior work or by construction from the inputs. The derivation is therefore self-contained against external verification of the bound-propagation logic.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides insufficient technical detail to enumerate free parameters, axioms, or invented entities; the work introduces a new certification procedure whose internal assumptions cannot be audited from the given text.

pith-pipeline@v0.9.0 · 5675 in / 1154 out tokens · 40976 ms · 2026-05-25T13:20:23.395223+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

[1]

Aleksandar Bojchevski and Stephan Günnemann. 2019. Adversarial Attacks on Node Embeddings via Graph Poisoning. In ICML

work page 2019
[2]

2006.Semi-Supervised Learning

Olivier Chapelle, Bernhard Schölkopf, and Alexander Zien. 2006.Semi-Supervised Learning. Adaptive Computation and Machine Learning series . The MIT Press

work page 2006
[3]

Francesco Croce, Maksym Andriushchenko, and Matthias Hein. 2018. Provable robustness of relu networks via maximization of linear regions. In AISTATS

work page 2018
[4]

Hanjun Dai, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, and Le Song. 2018. Adversarial Attack on Graph Structured Data. In ICML

work page 2018
[5]

Michaël Defferrard et al. 2016. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. In NIPS

work page 2016
[6]

Dhivya Eswaran, Stephan Günnemann, Christos Faloutsos, Disha Makhija, and Mohit Kumar. 2017. ZooBP: Belief Propagation for Heterogeneous Networks. PVLDB 10, 5 (2017), 625–636

work page 2017
[7]

Schoenholz, Patrick F

Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, and George E. Dahl. 2017. Neural Message Passing for Quantum Chemistry. InICML. 1263–1272

work page 2017
[8]

Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and harnessing adversarial examples. In ICLR

work page 2015
[9]

William L Hamilton, Rex Ying, and Jure Leskovec. 2017. Inductive Representation Learning on Large Graphs. In NIPS

work page 2017
[10]

Matthias Hein and Maksym Andriushchenko. 2017. Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation. In NIPS. 2263–2273

work page 2017
[11]

Bryan Hooi, Neil Shah, Alex Beutel, Stephan Günnemann, Leman Akoglu, Mohit Kumar, Disha Makhija, and Christos Faloutsos. 2016. BIRDNEST: Bayesian Inference for Ratings-Fraud Detection. In SDM. 495–503

work page 2016
[12]

Thomas N Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In ICLR

work page 2017
[13]

Johannes Klicpera, Aleksandar Bojchevski, and Stephan Günnemann. 2019. Pre- dict then Propagate: Graph Neural Networks meet Personalized PageRank. In ICLR

work page 2019
[14]

Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. 2017. Simple and scalable predictive uncertainty estimation using deep ensembles. In NIPS

work page 2017
[15]

Ben London and Lise Getoor. 2014. Collective Classification of Network Data. Data Classification: Algorithms and Applications 399 (2014)

work page 2014
[16]

Andrew Kachites McCallum, Kamal Nigam, Jason Rennie, and Kristie Seymore

work page
[17]

Information Retrieval 3, 2 (2000), 127–163

Automating the construction of internet portals with machine learning. Information Retrieval 3, 2 (2000), 127–163

work page 2000
[18]

Nicolas Papernot et al. 2016. Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks. In IEEE Symposium on Security and Privacy

work page 2016
[19]

Aditi Raghunathan, Jacob Steinhardt, and Percy S Liang. 2018. Semidefinite relaxations for certifying robustness to adversarial examples. In NIPS

work page 2018
[20]

Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi-Rad. 2008. Collective classification in network data. AI magazine 29, 3 (2008), 93

work page 2008
[21]

Eric Wong and Zico Kolter. 2018. Provable defenses against adversarial examples via the convex outer adversarial polytope. In ICML. 5283–5292

work page 2018
[22]

Daniel Zügner, Amir Akbarnejad, and Stephan Günnemann. 2018. Adversarial attacks on neural networks for graph data. In SIGKDD. 2847–2856

work page 2018
[23]

Daniel Zügner and Stephan Günnemann. 2019. Adversarial Attacks on Graph Neural Networks via Meta Learning. In ICLR. ACKNOWLEDGEMENTS This research was supported by the German Research Foundation, grant GU 1409/2-1. 8 APPENDIX Implementation Details: We perform the robust training using stochastic gradient descent with mini-batches and Adam Optimizer. For ...

work page 2019

[1] [1]

Aleksandar Bojchevski and Stephan Günnemann. 2019. Adversarial Attacks on Node Embeddings via Graph Poisoning. In ICML

work page 2019

[2] [2]

2006.Semi-Supervised Learning

Olivier Chapelle, Bernhard Schölkopf, and Alexander Zien. 2006.Semi-Supervised Learning. Adaptive Computation and Machine Learning series . The MIT Press

work page 2006

[3] [3]

Francesco Croce, Maksym Andriushchenko, and Matthias Hein. 2018. Provable robustness of relu networks via maximization of linear regions. In AISTATS

work page 2018

[4] [4]

Hanjun Dai, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, and Le Song. 2018. Adversarial Attack on Graph Structured Data. In ICML

work page 2018

[5] [5]

Michaël Defferrard et al. 2016. Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. In NIPS

work page 2016

[6] [6]

Dhivya Eswaran, Stephan Günnemann, Christos Faloutsos, Disha Makhija, and Mohit Kumar. 2017. ZooBP: Belief Propagation for Heterogeneous Networks. PVLDB 10, 5 (2017), 625–636

work page 2017

[7] [7]

Schoenholz, Patrick F

Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, and George E. Dahl. 2017. Neural Message Passing for Quantum Chemistry. InICML. 1263–1272

work page 2017

[8] [8]

Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and harnessing adversarial examples. In ICLR

work page 2015

[9] [9]

William L Hamilton, Rex Ying, and Jure Leskovec. 2017. Inductive Representation Learning on Large Graphs. In NIPS

work page 2017

[10] [10]

Matthias Hein and Maksym Andriushchenko. 2017. Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation. In NIPS. 2263–2273

work page 2017

[11] [11]

Bryan Hooi, Neil Shah, Alex Beutel, Stephan Günnemann, Leman Akoglu, Mohit Kumar, Disha Makhija, and Christos Faloutsos. 2016. BIRDNEST: Bayesian Inference for Ratings-Fraud Detection. In SDM. 495–503

work page 2016

[12] [12]

Thomas N Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In ICLR

work page 2017

[13] [13]

Johannes Klicpera, Aleksandar Bojchevski, and Stephan Günnemann. 2019. Pre- dict then Propagate: Graph Neural Networks meet Personalized PageRank. In ICLR

work page 2019

[14] [14]

Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. 2017. Simple and scalable predictive uncertainty estimation using deep ensembles. In NIPS

work page 2017

[15] [15]

Ben London and Lise Getoor. 2014. Collective Classification of Network Data. Data Classification: Algorithms and Applications 399 (2014)

work page 2014

[16] [16]

Andrew Kachites McCallum, Kamal Nigam, Jason Rennie, and Kristie Seymore

work page

[17] [17]

Information Retrieval 3, 2 (2000), 127–163

Automating the construction of internet portals with machine learning. Information Retrieval 3, 2 (2000), 127–163

work page 2000

[18] [18]

Nicolas Papernot et al. 2016. Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks. In IEEE Symposium on Security and Privacy

work page 2016

[19] [19]

Aditi Raghunathan, Jacob Steinhardt, and Percy S Liang. 2018. Semidefinite relaxations for certifying robustness to adversarial examples. In NIPS

work page 2018

[20] [20]

Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi-Rad. 2008. Collective classification in network data. AI magazine 29, 3 (2008), 93

work page 2008

[21] [21]

Eric Wong and Zico Kolter. 2018. Provable defenses against adversarial examples via the convex outer adversarial polytope. In ICML. 5283–5292

work page 2018

[22] [22]

Daniel Zügner, Amir Akbarnejad, and Stephan Günnemann. 2018. Adversarial attacks on neural networks for graph data. In SIGKDD. 2847–2856

work page 2018

[23] [23]

Daniel Zügner and Stephan Günnemann. 2019. Adversarial Attacks on Graph Neural Networks via Meta Learning. In ICLR. ACKNOWLEDGEMENTS This research was supported by the German Research Foundation, grant GU 1409/2-1. 8 APPENDIX Implementation Details: We perform the robust training using stochastic gradient descent with mini-batches and Adam Optimizer. For ...

work page 2019