Graph Reconstruction from Differentially Private GNN Explanations
Pith reviewed 2026-05-07 17:18 UTC · model grok-4.3
The pith
Differential privacy on GNN explanations fails to protect hidden graph structure from reconstruction
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
An adversary observing only DP-perturbed GNN explanations can reconstruct hidden graph structure with high accuracy. The attack PRIVX exploits the fact that the Gaussian DP mechanism is a single DDPM forward step at known noise level sigma(epsilon), recasting reconstruction as reverse diffusion conditioned on the corrupted signal. A stratified adversary model parameterised by (M, epsilon-hat, delta-hat, S, rho) yields endpoint-matched bounds on reconstruction AUC. Experiments across seven benchmarks, three DP mechanisms and three GNN backbones show AUC above 0.7 at epsilon=5 on five datasets, with neighbourhood-aggregating explainers leaking more on homophilic graphs and the ordering reverse
What carries the argument
PRIVX attack, which treats the DP perturbation as a known-variance DDPM forward step and performs Bayesian denoising via conditioned reverse diffusion to recover adjacency information
If this is right
- Neighbourhood-aggregating explainers leak more structure than per-node gradient explainers on homophilic graphs under the same DP budget
- On strongly heterophilic graphs the leakage ordering of explainers reverses
- The auxiliary diagnostic PRIVF decomposes total leakage into explainer-induced and intrinsic graph-distribution components
- Reconstruction AUC exceeds 0.7 at epsilon=5 on five of the seven evaluated benchmarks
Where Pith is reading between the lines
- The same diffusion equivalence could be used to attack DP releases of other continuous signals beyond explanations
- Stronger DP parameters or non-Gaussian mechanisms may be required before explanations can be released without structural leakage
- The stratified adversary model allows systematic study of how much auxiliary knowledge an attacker needs to succeed
Load-bearing premise
The Gaussian DP mechanism adds noise whose variance is known exactly from epsilon and matches exactly one forward diffusion step
What would settle it
Running the reconstruction while deliberately using a noise variance in the reverse process that differs from the variance implied by the released epsilon, and observing whether AUC collapses to random guessing
Figures
read the original abstract
Regulatory frameworks such as GDPR increasingly require that ML predictions be accompanied by post-hoc explanations, even when raw data and trained models cannot be released. Differential privacy (DP) is the standard mitigation for the residual privacy risk of releasing these explanations. We show that DP is not sufficient: an adversary observing only DP-perturbed GNN explanations can reconstruct hidden graph structure with high accuracy. Our attack, PRIVX, exploits the fact that the Gaussian DP mechanism is a single DDPM forward step at known noise level {\sigma}({\epsilon}), recasting reconstruction as reverse diffusion conditioned on the corrupted signal, a principled Bayesian denoiser under known DP corruption. We formalise a stratified adversary model parameterised by (M, \hat{\epsilon}, \hat{\delta}, S, \rho) that interpolates between oblivious and oracle attackers, and derive endpoint-matched two-sided bounds on reconstruction AUC. For practitioners, we provide regime-stratified guidance on explainer choice: on homophilic graphs, neighbourhood-aggregating explainers (GraphLIME, GNNExplainer) leak more structure than per-node gradient explainers under the same DP budget; on strongly heterophilic graphs the ordering reverses. We introduce PRIVF as an auxiliary diagnostic sharing the same diffusion backbone to decompose leakage into explainer-induced and intrinsic graph-distribution components. Experiments across seven benchmarks, three DP mechanisms, and three GNN backbones show PRIVX achieves AUC above 0.7 at {\epsilon} = 5 on five of seven datasets, with the attack succeeding well within typically deployed privacy budgets.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that differentially private GNN explanations leak sufficient information for an adversary to reconstruct the underlying graph structure with high accuracy. The proposed PRIVX attack recasts the Gaussian DP mechanism as a single DDPM forward step at known noise level σ(ε), enabling reconstruction via conditioned reverse diffusion as a Bayesian denoiser. It formalizes a stratified adversary model parameterized by (M, ε̂, δ̂, S, ρ), derives endpoint-matched two-sided AUC bounds, provides regime-stratified guidance on explainer choice (neighbourhood-aggregating vs. gradient-based) depending on homophily, and introduces PRIVF as a diagnostic to decompose leakage. Experiments across seven benchmarks, three DP mechanisms, and three GNN backbones report AUC above 0.7 at ε=5 on five datasets.
Significance. If the attack and its supporting derivation hold, the result is significant because it shows that releasing DP-perturbed post-hoc explanations does not adequately protect graph structure, with direct implications for regulatory requirements such as GDPR. The stratified adversary model and explainer-specific guidance are practical contributions, and the multi-dataset, multi-mechanism experiments provide a broad empirical basis. The work does not mention machine-checked proofs or fully reproducible code artifacts, but the introduction of PRIVF for leakage decomposition is a useful methodological addition.
major comments (2)
- [§3.2] §3.2 (Gaussian-to-DDPM equivalence): The central modeling step asserts that the Gaussian DP mechanism (noisy explanation = clean explanation + N(0, σ(ε)² I)) is exactly a DDPM forward step at known noise level σ(ε). This is incorrect because the standard DDPM forward process is x_t = sqrt(1-β_t) x_0 + sqrt(β_t) ε and therefore attenuates the clean signal by sqrt(1-β_t) < 1 when σ(ε) is non-negligible (typical privacy budgets ε=1–5). The paper's Bayesian denoiser, reverse-process mean predictor, and derived two-sided AUC bounds implicitly assume the unscaled additive case; the mismatch renders the posterior inexact and the principled justification for the reported AUC values unsupported.
- [§4] §4 (Adversary model and bounds): The endpoint-matched two-sided AUC bounds are derived under the stated (M, ε̂, δ̂, S, ρ) model, but the derivation does not address how the scaling mismatch in the forward process propagates into the posterior used for reconstruction. Because this is load-bearing for the claim that the attack is 'principled' rather than purely empirical, the bounds require re-derivation or an explicit error analysis.
minor comments (2)
- [Abstract] The abstract states 'seven benchmarks' while the title and core claims focus on graphs; a brief clarification in the introduction or §5 would prevent reader confusion about whether all datasets are graphs.
- [§2] Notation for the adversary parameters uses hats on ε̂ and δ̂; ensure these are defined at first use in §2 and used consistently in all equations and tables.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review. The comments identify a key modeling distinction that we will address through targeted revisions to improve clarity and rigor.
read point-by-point responses
-
Referee: [§3.2] §3.2 (Gaussian-to-DDPM equivalence): The central modeling step asserts that the Gaussian DP mechanism (noisy explanation = clean explanation + N(0, σ(ε)² I)) is exactly a DDPM forward step at known noise level σ(ε). This is incorrect because the standard DDPM forward process is x_t = sqrt(1-β_t) x_0 + sqrt(β_t) ε and therefore attenuates the clean signal by sqrt(1-β_t) < 1 when σ(ε) is non-negligible (typical privacy budgets ε=1–5). The paper's Bayesian denoiser, reverse-process mean predictor, and derived two-sided AUC bounds implicitly assume the unscaled additive case; the mismatch renders the posterior inexact and the principled justification for the reported AUC values unsupported.
Authors: We acknowledge that the standard DDPM forward process includes the sqrt(1-β_t) scaling. Our formulation in §3.2 models the DP mechanism as direct additive Gaussian noise without attenuation, which is a deliberate simplification equivalent to a diffusion step under a variance schedule where the scaling factor is taken as unity. This yields a tractable Bayesian denoiser while preserving the known noise level σ(ε). We will revise §3.2 to state this modeling choice explicitly, contrast it with the full scaled DDPM process, and note the conditions under which the approximation holds. revision: yes
-
Referee: [§4] §4 (Adversary model and bounds): The endpoint-matched two-sided AUC bounds are derived under the stated (M, ε̂, δ̂, S, ρ) model, but the derivation does not address how the scaling mismatch in the forward process propagates into the posterior used for reconstruction. Because this is load-bearing for the claim that the attack is 'principled' rather than purely empirical, the bounds require re-derivation or an explicit error analysis.
Authors: We agree that the AUC bounds in §4 rest on the unscaled additive model. We will augment §4 with an explicit error analysis that bounds the discrepancy introduced by any scaling factor and, where feasible, sketch the re-derivation under the scaled forward process. This will clarify the scope of the principled claim without altering the empirical attack results. revision: yes
Circularity Check
No significant circularity; derivation self-contained under explicit modeling assumption
full rationale
The paper states the Gaussian DP mechanism equivalence to a DDPM forward step as an explicit premise, then uses it to recast reconstruction as conditioned reverse diffusion and derive the stratified adversary model (M, ε̂, δ̂, S, ρ) plus endpoint-matched AUC bounds. This chain does not reduce any central result to a fitted parameter renamed as prediction, nor does it rely on load-bearing self-citation or self-definitional loops. The bounds and PRIVX/PRIVF diagnostics follow from the stated assumptions and are validated empirically across datasets; any scaling mismatch with standard DDPM is a correctness issue with the modeling choice, not a circular reduction by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Gaussian DP mechanism equals a single DDPM forward step at known sigma(epsilon)
Reference graph
Works this paper leans on
-
[1]
Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps.Workshop at the International Conference on Learning Representations (ICLR), 2014
work page 2014
-
[2]
Axiomatic attribution for deep networks
Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. InProceedings of the 34th International Conference on Machine Learning (ICML), pages 3319–3328, 2017
work page 2017
-
[3]
GNNExplainer: Generating explanations for graph neural networks
Rex Ying, Dylan Bourgeois, Jiaxuan You, Marinka Zitnik, and Jure Leskovec. GNNExplainer: Generating explanations for graph neural networks. InAdvances in Neural Information Pro- cessing Systems (NeurIPS), volume 32, 2019
work page 2019
-
[4]
Qiang Huang, Makoto Yamada, Yuan Tian, Danica Singh, Dawei Yin, and Yi Chang. GraphLIME: Local interpretable model explanations for graph neural networks.IEEE Transac- tions on Knowledge and Data Engineering, 2022
work page 2022
-
[5]
Olatunji, Wolfgang Nejdl, and Meike Khosla
Iyiola E. Olatunji, Wolfgang Nejdl, and Meike Khosla. Private graph extraction via feature explanations. InProceedings on Privacy Enhancing Technologies (PoPETS), 2023
work page 2023
-
[6]
Calibrating noise to sensitivity in private data analysis
Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. InTheory of Cryptography Conference (TCC), pages 265–284. Springer, 2006
work page 2006
-
[7]
Ilya Mironov. Rényi differential privacy. InIEEE 30th Computer Security Foundations Symposium (CSF), pages 263–275. IEEE, 2017
work page 2017
-
[8]
Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models.Advances in Neural Information Processing Systems (NeurIPS), 33:6840–6851, 2020
work page 2020
-
[9]
Denoising diffusion restoration models
Bahjat Kawar, Michael Elad, Stefano Ermon, and Jiaming Song. Denoising diffusion restoration models. InAdvances in Neural Information Processing Systems (NeurIPS), volume 35, 2022
work page 2022
-
[10]
Jinshuo Dong, Aaron Roth, and Weijie J. Su. Gaussian differential privacy.Journal of the Royal Statistical Society: Series B, 84(1):3–37, 2022
work page 2022
-
[11]
Stealing links from graph neural networks
Xinlei He, Jinyuan Jia, Michael Backes, Neil Zhenqiang Gong, and Yang Zhang. Stealing links from graph neural networks. InProceedings of the 30th USENIX Security Symposium, 2021
work page 2021
-
[12]
GraphMI: Extracting private graph data from graph neural networks
Zaixi Zhang, Qi Liu, Zhenya Huang, Hao Wang, Chengqiang Lu, Chuanren Liu, and Enhong Chen. GraphMI: Extracting private graph data from graph neural networks. InProceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021
work page 2021
-
[13]
Quantifying privacy leakage in graph embedding
Vasisht Duddu, Antoine Boutet, and Virat Shejwalkar. Quantifying privacy leakage in graph embedding. InPrivacy Enhancing Technologies Symposium (PETS), 2020
work page 2020
-
[14]
Graph embedding reconstruction attack
Yang Zhang et al. Graph embedding reconstruction attack. InAdvances in Neural Information Processing Systems (NeurIPS), 2024. NeurIPS 2024
work page 2024
-
[15]
Johnson covering numbers, approximate differential privacy, and applications
Jeremiah Blocki, Avrim Blum, Anupam Datta, and Or Sheffet. Johnson covering numbers, approximate differential privacy, and applications. InProceedings of the 4th Conference on Innovations in Theoretical Computer Science (ITCS), 2012
work page 2012
-
[16]
Private analysis of graph structure
Vishesh Karwa, Sofya Raskhodnikova, Adam Smith, and Grigory Yaroslavtsev. Private analysis of graph structure. InACM Transactions on Database Systems, volume 39, 2014
work page 2014
-
[17]
Node-level differentially private graph neural networks
Ameya Daigavane, Gagan Madan, Aditya Aggarwal, Anirban Agarwal, Prateek Gupta, Prateek Jain, and Mausam. Node-level differentially private graph neural networks. InWorkshop on Privacy in Machine Learning (PriML), NeurIPS, 2021
work page 2021
-
[18]
GAP: Differentially private graph neural networks with aggregation perturbation
Sina Sajadmanesh, Stan Matwin, Majid Nategh, and Ali Mohammadi. GAP: Differentially private graph neural networks with aggregation perturbation. InProceedings of the 32nd USENIX Security Symposium, 2023
work page 2023
-
[19]
Wenxuan Zhu et al. Private Gomory–Hu trees. InAdvances in Neural Information Processing Systems (NeurIPS), 2025. NeurIPS 2025
work page 2025
-
[20]
Catarina Meng, Mia Rinberg, and Michael Kearns. Interpretability and privacy in machine learn- ing: Studying the trade-off through attribution methods. InWorkshop on Privacy-Preserving Machine Learning, NeurIPS, 2022. 10
work page 2022
-
[21]
Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole
Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations (ICLR), 2021
work page 2021
-
[22]
Johnson, Jonathan Ho, Daniel Tarlow, and Rianne van den Berg
Jacob Austin, Daniel D. Johnson, Jonathan Ho, Daniel Tarlow, and Rianne van den Berg. Struc- tured denoising diffusion models in discrete state-spaces. InAdvances in Neural Information Processing Systems (NeurIPS), volume 34, 2021
work page 2021
-
[23]
Score-based generative modeling of graphs via the system of stochastic differential equations
Jaehyeong Jo, Seul Lee, and Sung Ju Hwang. Score-based generative modeling of graphs via the system of stochastic differential equations. InInternational Conference on Machine Learning (ICML), 2022
work page 2022
-
[24]
Graphrcg: Self-conditioned graph generation
Gang Liu, Eric Hu, Jiaxin Zeng, and Meng Jiang. Graphrcg: Self-conditioned graph generation. InInternational Conference on Machine Learning (ICML), 2024
work page 2024
-
[25]
Fan Chung and Linyuan Lu. The average distances in random graphs with given expected degrees.Proceedings of the National Academy of Sciences, 99(25):15879–15882, 2002
work page 2002
-
[26]
Automating the construction of internet portals with machine learning
Andrew Kachites McCallum, Kamal Nigam, Jason Rennie, and Kristie Seymore. Automating the construction of internet portals with machine learning. volume 3, pages 127–163, 2000
work page 2000
-
[27]
Collective classification in network data
Prithviraj Sen, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi- Rad. Collective classification in network data. volume 29, pages 93–106, 2008
work page 2008
-
[28]
Query-driven active surveying for collective classification
Galileo Namata, Ben London, Lise Getoor, Bert Huang, and UMD. Query-driven active surveying for collective classification. Technical report, 10th International Workshop on Mining and Learning with Graphs, 2012
work page 2012
-
[29]
Open graph benchmark: Datasets for machine learning on graphs
Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, and Jure Leskovec. Open graph benchmark: Datasets for machine learning on graphs. Advances in Neural Information Processing Systems (NeurIPS), 33:22118–22133, 2020
work page 2020
-
[30]
Multi-scale attributed node embedding
Benedek Rozemberczki, Carl Allen, and Rik Sarkar. Multi-scale attributed node embedding. Journal of Complex Networks, 9(2), 2021
work page 2021
-
[31]
A critical look at the state of self-supervised learning for graphs
Oleg Platonov, Denis Kuznedelev, Artem Babenko, and Liudmila Prokhorenkova. A critical look at the state of self-supervised learning for graphs. InInternational Conference on Learning Representations (ICLR), 2023
work page 2023
-
[32]
SLAPS: Self-supervision improves structure learning for graph neural networks
Bahare Fatemi, Layla El Asri, and Seyed Mehran Kazemi. SLAPS: Self-supervision improves structure learning for graph neural networks. InAdvances in Neural Information Processing Systems (NeurIPS), volume 34, 2021. A Broader Impact PRIVX is designed as aprivacy-auditingtool to quantify the residual structural leakage of deployed GNN explanation systems und...
work page 2021
-
[33]
Total: Var(W|A ij = 1) =C x + 2σ2S2 x +σ 4d
=C x; conditional on xi, x⊤ i ηj ∼ N(0, σ 2∥xi∥2 2), so by the law of total variance and ex- changeability, Var(x⊤ i ηj |A ij = 1) =σ 2S2 x; similarly Var(η⊤ i xj |A ij = 1) =σ 2S2 x; for η⊤ i ηj =P k η(k) i η(k) j , each summand is a product of independent zero-mean Gaussians with variance σ4, givingVar(η ⊤ i ηj) =dσ 4. Total: Var(W|A ij = 1) =C x + 2σ2S...
-
[34]
+h· E[ϕ ⊤ i ϕj |same class] +O(σ 2).(12) In the regime h≪1 (strong heterophily), the anti-correlation term dominates: E[ ˜ϕ⊤ i ˜ϕj |A ij = 1]≈ −(1−h)|ρ − XA | ∥ϕ∥2 2 <0. Negative-threshold attacker.The attacker uses ˆAij =1[ ˜ϕ⊤ i ˜ϕj ≤ −τ] with τ= (1− h)|ρ− XA | ∥ϕ∥2 2 /2>0. By an argument symmetric to the homophilic proof, Pr[ ˆAij = 1|A ij = 1]≥(1−h)|ρ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.