Node-Private Community Detection in Stochastic Block Models

Ilias Zadik; Olga Klopp

arXiv: 2604.09078 · v1 · submitted 2026-04-10 · math.ST · stat.TH


Pith reviewed 2026-05-10 17:42 UTC · model grok-4.3


keywords: stochastic block model, community detection, node differential privacy, exact recovery, minimax risk, exponential mechanism, sparse graphs, privacy-utility tradeoff


The pith

A logarithmic privacy budget is both sufficient and necessary to achieve exact community recovery under node differential privacy in sparse stochastic block models with fixed communities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs a node-private estimator by applying the exponential mechanism to a suitable objective and invoking an extension lemma to handle the linear impact of a single node on the observed edges. It proves that this estimator attains exact recovery when the privacy parameter is logarithmic in the number of nodes, in the standard regime where average degree is logarithmic and the number of communities is fixed. The matching lower bound shows that any pure node-private procedure fails to drive exact-recovery error or expected mismatch below a polynomial threshold unless the privacy budget reaches this logarithmic order. When the privacy budget is super-logarithmic, the minimax risk decomposes into two explicit terms, one controlled by the non-private statistical signal and the other by the privacy constraint, and the upper and lower bounds agree up to universal constants in the exponents.
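As a rough illustration of the mechanism named above, the following is a minimal sketch of the generic exponential mechanism over a finite candidate set. It is not the paper's estimator: the function name `exponential_mechanism`, the toy candidates, and the toy scores are all assumptions made for this example, and the extension lemma is not modeled.

```python
import math
import random


def exponential_mechanism(candidates, score, epsilon, sensitivity, rng=random):
    """Sample one candidate with probability proportional to
    exp(epsilon * score(c) / (2 * sensitivity)).

    `score` and `sensitivity` are supplied by the caller; this sketch does
    not verify that `sensitivity` really bounds the change in `score`.
    """
    scores = [score(c) for c in candidates]
    top = max(scores)  # shift by the max so the largest weight is exactly 1
    weights = [
        math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores
    ]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical toy scores for three candidate outputs.
    toy_scores = {"a": 0.0, "b": 5.0, "c": 1.0}
    pick = exponential_mechanism(list(toy_scores), toy_scores.get, 1.0, 1.0)
    print(pick)  # random draw, biased toward the high-score candidate "b"
```

A larger `epsilon` concentrates the draw on the highest-scoring candidate, while a larger `sensitivity` flattens the distribution; that trade-off is what the privacy budget in the summary above controls.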

Core claim

Under pure node-level differential privacy, exact recovery in the sparse stochastic block model with fixed number of communities remains possible once the privacy budget reaches logarithmic order; moreover, this scaling is information-theoretically necessary, since sub-logarithmic budgets force the exact-recovery error or expected mismatch to remain polynomially large, while for larger budgets the minimax risk admits a two-term characterization governed separately by the non-private signal strength and by the privacy expenditure.
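For reference, pure node-level differential privacy can be written as below. This is the standard textbook form, given here as an assumption about the paper's setup rather than a quotation from it; the symbols \mathcal{M}, G, G', S and \varepsilon are notation introduced for this sketch.

```latex
% G \sim G' : node-neighboring graphs, i.e. G' is obtained from G by
% arbitrarily changing the edges incident to a single node.
\[
  \Pr[\mathcal{M}(G) \in S] \;\le\; e^{\varepsilon}\, \Pr[\mathcal{M}(G') \in S]
  \qquad \text{for all measurable } S \text{ and all } G \sim G'.
\]
```

In this notation, the claim above concerns privacy budgets \varepsilon of order \log n, where n is the number of nodes.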

What carries the argument

A node-private estimator obtained by feeding a suitably chosen objective into the exponential mechanism together with an extension lemma that controls the sensitivity induced by node deletion.

If this is right

Exact recovery is achievable with only a logarithmic privacy budget in the logarithmic-degree regime.
Sub-logarithmic privacy budgets force polynomially large exact-recovery error or expected mismatch for every pure node-private method.
For super-logarithmic budgets the minimax risk separates into a statistical term and a privacy term that match up to constants in the exponents.
The logarithmic privacy cost is absent under the weaker edge-differential-privacy model.
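The contrast between node and edge privacy in the list above comes from the neighboring relation: an edge-neighbor changes one adjacency entry, while a node-neighbor can change every entry in one node's row. A minimal sketch, with all function names chosen for this example:

```python
def edge_neighbor(adj, u, v):
    """Return a copy of adj with the single edge {u, v} flipped."""
    new = [row[:] for row in adj]
    new[u][v] = new[v][u] = 1 - adj[u][v]
    return new


def node_neighbor(adj, u, new_neighbors):
    """Return a copy of adj where node u is rewired to `new_neighbors`."""
    new = [row[:] for row in adj]
    for v in range(len(adj)):
        if v != u:
            bit = 1 if v in new_neighbors else 0
            new[u][v] = new[v][u] = bit
    return new


def edge_difference(a, b):
    """Count unordered pairs {i, j} on which the two graphs disagree."""
    n = len(a)
    return sum(a[i][j] != b[i][j] for i in range(n) for j in range(i + 1, n))


if __name__ == "__main__":
    n = 6
    empty = [[0] * n for _ in range(n)]
    print(edge_difference(empty, edge_neighbor(empty, 0, 1)))  # 1
    print(edge_difference(empty, node_neighbor(empty, 0, {1, 2, 3, 4, 5})))  # 5
```

On n nodes a single node change can alter up to n - 1 pairs, which is the "linearly many observations" effect the abstract refers to.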

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same logarithmic overhead is likely to appear in other node-private graph problems such as planted clique or community detection with growing number of communities.
Relaxations such as approximate node privacy or local differential privacy may reduce the cost while retaining meaningful protection for node participation.
Practical network analyses that treat nodes as the sensitive unit will need to budget privacy at least logarithmically to preserve recovery quality.

Load-bearing premise

The observed graph is drawn from a stochastic block model having a fixed number of communities and logarithmic average degree, and the extension lemma for the exponential mechanism remains valid under node privacy.
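To make the generative premise concrete, here is a minimal sampler for a symmetric stochastic block model with logarithmic average degree. The parameterization p = a log(n)/n within communities and q = b log(n)/n across communities is a common convention in the exact-recovery literature and is an assumption here; the paper's own parameterization may differ, and `sample_sbm`, `a` and `b` are names chosen for this sketch.

```python
import math
import random


def sample_sbm(n, k, a, b, seed=0):
    """Sample an undirected SBM graph on n nodes with k communities.

    Assumed parameterization (not taken from the paper):
    within-community edge probability  p = a * log(n) / n,
    between-community edge probability q = b * log(n) / n.
    """
    rng = random.Random(seed)
    labels = [i % k for i in range(n)]  # near-balanced fixed communities
    p = min(1.0, a * math.log(n) / n)
    q = min(1.0, b * math.log(n) / n)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                adj[i][j] = adj[j][i] = 1
    return adj, labels


if __name__ == "__main__":
    adj, labels = sample_sbm(n=200, k=2, a=6.0, b=1.0, seed=1)
    avg_degree = sum(map(sum, adj)) / len(adj)
    print(f"average degree: {avg_degree:.2f}, log n: {math.log(200):.2f}")
```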

What would settle it

Existence of any pure node-private algorithm that drives exact-recovery error below a polynomial threshold using a privacy budget o(log n) on sparse stochastic block model instances with fixed communities would falsify the necessity claim.

Original abstract

We study community detection in stochastic block models under pure node-level differential privacy, a stringent notion that protects the participation of an individual together with all of their incident edges. This setting is substantially more challenging than edge-private community detection, since modifying a single node can affect linearly many observations. On the algorithmic side, we analyze a node-private estimator based on the exponential mechanism combined with an extension lemma, and show that exact recovery remains achievable. In the standard sparse regime with logarithmic average degree and a fixed number of communities, our results imply that a logarithmic privacy budget suffices to obtain nontrivial recovery guarantees. On the lower bound side, we show that this logarithmic scaling is in fact unavoidable: any pure node-private method must fail to achieve polynomially small exact-recovery error, or polynomially small expected mismatch, unless the privacy budget is at least of this order. Moreover, in the regime of super-logarithmic privacy budgets, our upper and lower bounds yield a matching two-term characterization of the minimax risk, with one term governed by the non-private statistical signal and the other by the privacy budget; these match up to universal constants in the exponents. Taken together, our results identify an inherent logarithmic privacy cost in node-private community detection, absent under edge differential privacy, and provide a precise rate-level characterization of the tradeoff between node privacy and SBM recovery.
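The abstract measures performance by exact-recovery error and by expected mismatch. The sketch below shows one common way to compute the mismatch between two labelings, minimizing over relabelings of the communities since labels are only identifiable up to permutation. The paper's exact definition may differ; the function names are chosen for this example, and the brute-force search over permutations is only sensible for a small, fixed number of communities k.

```python
from itertools import permutations


def mismatch(true_labels, est_labels, k):
    """Minimum number of disagreeing nodes over all relabelings of the
    k communities (labels are assumed to lie in 0..k-1)."""
    best = len(true_labels)
    for perm in permutations(range(k)):
        errors = sum(perm[e] != t for t, e in zip(true_labels, est_labels))
        best = min(best, errors)
    return best


def exactly_recovered(true_labels, est_labels, k):
    """Exact recovery holds when the mismatch is zero."""
    return mismatch(true_labels, est_labels, k) == 0


if __name__ == "__main__":
    print(mismatch([0, 0, 1, 1], [1, 1, 0, 0], k=2))  # 0
    print(mismatch([0, 0, 1, 1], [0, 1, 1, 1], k=2))  # 1
```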

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper pins down a logarithmic privacy cost for node-private exact recovery in sparse SBMs and gives matching upper and lower bounds that separate it cleanly from the edge-private case.


The key point is that node privacy forces a log-scale privacy budget to reach nontrivial exact recovery in the standard sparse SBM regime, while edge privacy does not. The lower bound shows any pure node-private method fails to get polynomially small error unless epsilon is at least logarithmic. The upper bound reaches the same threshold by running the exponential mechanism after an extension step that accounts for node additions and removals. For larger epsilon the two bounds match up to constants and split the risk into a statistical term and a privacy term. That two-term characterization and the explicit log cost are the actual new pieces; prior edge-private work does not force this scaling. The analysis stays within standard SBM assumptions with fixed communities and log average degree, and the lower bound argument looks independent of the upper-bound construction. The extension lemma is the part that needs the closest check, since node changes affect a linear number of edges and any looseness there could affect how tightly the upper bound meets the non-private threshold. The rest of the argument follows established sensitivity and utility calculations for the exponential mechanism. This work is for people who care about minimax rates in private network analysis or who need to know the precise cost of node versus edge privacy in community detection. A reader who already knows the non-private SBM thresholds will see the privacy adjustment clearly. It deserves a serious referee because the claims are concrete, the lower bound stands alone, and the problem sits at the intersection of two active areas. I would send it out for review.

Referee Report

1 major / 1 minor

Summary. The paper studies community detection in stochastic block models under pure node differential privacy. It proposes an estimator based on the exponential mechanism combined with an extension lemma and proves that exact recovery remains possible with a logarithmic privacy budget in the standard sparse regime (logarithmic average degree, fixed number of communities). It also establishes a matching lower bound showing that any pure node-private method requires at least logarithmic privacy to achieve polynomially small exact-recovery error or expected mismatch, and derives a two-term minimax risk characterization (non-private statistical term plus privacy term) that matches up to universal constants when the privacy budget is super-logarithmic.

Significance. If the central claims hold, the work identifies a fundamental logarithmic privacy cost specific to node privacy in SBM recovery that is absent under edge privacy, together with tight rate characterizations. This provides a precise understanding of the privacy-utility tradeoff for a core graph inference task and strengthens the theoretical foundation for private network analysis.

major comments (1)

[Upper-bound analysis (extension lemma for exponential mechanism)] The upper bound relies on an extension lemma that adapts the exponential mechanism to node privacy (mentioned in the abstract and used to obtain the logarithmic epsilon sufficiency). The lemma must correctly bound the sensitivity of the community-detection score function when a single node and its incident edges are added or removed; in the logarithmic-degree regime this sensitivity is linear in the degree. If the lemma fails to preserve the exact-recovery utility guarantee after the extension, the claimed sufficiency of a logarithmic privacy budget does not follow from the non-private SBM threshold. Please state the lemma explicitly (including its assumptions on the score function) and provide its full proof.

minor comments (1)

[Minimax risk characterization] The abstract states that the two-term risk characterization matches 'up to universal constants in the exponents'; it would be useful to make the dependence of these constants on the SBM parameters (p, q, k) explicit in the theorem statement.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript and for the constructive feedback. We address the major comment below and describe the revisions we will make.


Referee: [Upper-bound analysis (extension lemma for exponential mechanism)] The upper bound relies on an extension lemma that adapts the exponential mechanism to node privacy (mentioned in the abstract and used to obtain the logarithmic epsilon sufficiency). The lemma must correctly bound the sensitivity of the community-detection score function when a single node and its incident edges are added or removed; in the logarithmic-degree regime this sensitivity is linear in the degree. If the lemma fails to preserve the exact-recovery utility guarantee after the extension, the claimed sufficiency of a logarithmic privacy budget does not follow from the non-private SBM threshold. Please state the lemma explicitly (including its assumptions on the score function) and provide its full proof.

Authors: We agree that the extension lemma is central to establishing the upper bound and that a fully explicit statement together with its complete proof is required for rigor. In the revised manuscript we will add an explicit statement of the lemma in Section 3 (immediately preceding the application to the exponential mechanism), listing all assumptions on the score function: it must be a real-valued function of the adjacency matrix whose value changes by at most the degree of the added or removed node, and whose range is bounded by a quantity that is O(log n) with high probability under the sparse SBM parameters. The proof will verify that this sensitivity bound is indeed linear in the (logarithmic) degree, that the exponential mechanism therefore satisfies pure node differential privacy with the stated epsilon, and that the utility guarantee (probability of exact recovery) carries over from the non-private threshold up to a multiplicative constant factor that is absorbed into the logarithmic privacy budget. This addition will be placed in the main text rather than the appendix so that the logical dependence on the non-private SBM threshold is transparent.

Revision: yes
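The sensitivity and utility calculations discussed in this exchange rest on the standard guarantees of the exponential mechanism, recalled below for a finite output set. This is the textbook statement (as in Dwork and Roth's monograph), not the paper's extension lemma; the symbols s, \Delta, \mathcal{R} and t are notation assumed for this sketch.

```latex
% Generic exponential mechanism over a finite output set \mathcal{R},
% with score s(G, r) and sensitivity \Delta under the chosen neighboring relation.
\[
  \Delta = \max_{r \in \mathcal{R}} \max_{G \sim G'} \lvert s(G, r) - s(G', r) \rvert,
  \qquad
  \Pr[\mathcal{M}(G) = r] \propto \exp\!\left( \frac{\varepsilon\, s(G, r)}{2\Delta} \right).
\]
% Privacy: \mathcal{M} is \varepsilon-differentially private.
% Utility: for every t > 0,
\[
  \Pr\!\left[ s(G, \mathcal{M}(G)) \le \max_{r \in \mathcal{R}} s(G, r)
      - \frac{2\Delta}{\varepsilon} \bigl( \ln \lvert \mathcal{R} \rvert + t \bigr) \right]
  \le e^{-t}.
\]
```

The utility loss scales as \Delta / \varepsilon, so a sensitivity that grows with node degree has to be offset by a larger \varepsilon or controlled by an extension step. That dependence is what the referee asks the authors to make explicit.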

Circularity Check

0 steps flagged

No circularity: upper and lower bounds derived independently from SBM parameters, node-DP definitions, and exponential mechanism analysis.


The derivation chain begins from the standard SBM with fixed communities and logarithmic degree, applies the exponential mechanism to a community-detection score function, and invokes an extension lemma (analyzed within the paper) to handle node-level sensitivity. The resulting upper bound on exact recovery with logarithmic epsilon follows directly from utility guarantees of the mechanism under the stated sensitivity. The matching lower bound is obtained via separate information-theoretic arguments showing that sub-logarithmic epsilon forces failure on exact recovery or mismatch, without reference to the upper-bound construction. No step reduces a claimed prediction to a fitted parameter by construction, renames a known result, or relies on a self-citation whose content is itself the target claim. The extension lemma is internal to the node-privacy analysis rather than an unverified external assumption that tautologically forces the logarithmic scaling.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the standard SBM generative assumptions and the properties of the exponential mechanism; no free parameters or invented entities are introduced in the abstract.

axioms (1)

Domain assumption: The input graph is generated from a stochastic block model with a fixed number of communities and logarithmic average degree.
This is the standard sparse regime referenced in the abstract for which the upper and lower bounds are stated.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Node-private community estimation in stochastic block models: Tractable algorithms and lower bounds
math.ST · 2026-05 · unverdicted · novelty 7.0

Develops tractable node-differentially private algorithms for community estimation in fixed-community stochastic block models together with lower bounds on the privacy parameter ε needed for consistency.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages · cited by 1 Pith paper

[1] Emmanuel Abbe. Community detection and stochastic block models: Recent developments. Journal of Machine Learning Research, 18(177):1–86, 2018.

[2] Emmanuel Abbe, Afonso S. Bandeira, and Georgina Hall. Exact recovery in the stochastic block model. IEEE Transactions on Information Theory, 62(1):471–487, 2016.

[3] Emmanuel Abbe and Colin Sandon. Recovering communities in the general stochastic block model without knowing the parameters. In Advances in Neural Information Processing Systems (NeurIPS), 2015. arXiv:1506.03729.

[4] Charles Bordenave, Marc Lelarge, and Laurent Massoulié. Non-backtracking spectrum of random graphs: Community detection and non-regular Ramanujan graphs. The Annals of Probability, 46, 2018.

[5] Christian Borgs, Jennifer Chayes, Adam Smith, and Ilias Zadik. Private algorithms can always be extended, 2018.

[6] Christian Borgs, Jennifer Chayes, Adam Smith, and Ilias Zadik. Revealing network structure, confidentially: Improved rates for node-private graphon estimation. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 533–543, 2018.

[7] Hongjie Chen, Vincent Cohen-Addad, Tommaso d’Orsi, Alessandro Epasto, Jacob Imola, David Steurer, and Stefan Tiegel. Private estimation algorithms for stochastic block models and mixture models. Advances in Neural Information Processing Systems, 36:68134–68183, 2023.

[8] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3–4):211–407, 2014.

[9] Chao Gao, Zongming Ma, Anderson Y. Zhang, and Harrison H. Zhou. Achieving optimal misclassification proportion in stochastic block models. Journal of Machine Learning Research, 18(60):1–45, 2017.

[10] Paul W. Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt. Stochastic blockmodels: First steps. Social Networks, 5(2):109–137, 1983.

[11] Shiva Prasad Kasiviswanathan, Kobbi Nissim, Sofya Raskhodnikova, and Adam D. Smith. Analyzing graphs with node differential privacy. In Amit Sahai, editor, Theory of Cryptography - 10th Theory of Cryptography Conference, TCC 2013, Tokyo, Japan, March 3-6, 2013. Proceedings, volume 7785 of Lecture Notes in Computer Science, pages 457–476. Springer, 2013.

[12] Laurent Massoulié. Community detection thresholds and the weak Ramanujan property. Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing, 2014.

[13] Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 94–103, 2007.

[14] Mohamed S Mohamed, Dung Nguyen, Anil Vullikanti, and Ravi Tandon. Differentially private community detection for stochastic block models. In Kamalika Chaudhuri and Aarti Singh, editors, Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research, pages 15858–15894. PMLR, 17–23 Jul 2022.

[15] Elchanan Mossel, Joe Neeman, and Allan Sly. A proof of the block model threshold conjecture. Combinatorica, 38, 2018.

[16] Dung Nguyen and Anil Kumar S. Vullikanti. Differentially private exact recovery for stochastic block models. In Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. OpenReview.net, 2024.

[17] Sofya Raskhodnikova and Adam D. Smith. Lipschitz extensions for node-private graph statistics and the generalized exponential mechanism. In Irit Dinur, editor, IEEE 57th Annual Symposium on Foundations of Computer Science, FOCS 2016, Hyatt Regency, New Brunswick, New Jersey, USA, October 9-11, 2016, pages 495–504. IEEE Computer Society, 2016.

[18] Anderson Y. Zhang and Harrison H. Zhou. Minimax rates of community detection in stochastic block models. Annals of Statistics, 44, 2016.
