pith. sign in

arxiv: 2605.23285 · v1 · pith:7N3WW66Tnew · submitted 2026-05-22 · 💻 cs.LG · cond-mat.stat-mech· cs.AI

Reinforcement Learning for Microcanonical Graph Ensemble with Assortativity Constraints

Pith reviewed 2026-05-25 04:38 UTC · model grok-4.3

classification 💻 cs.LG cond-mat.stat-mechcs.AI
keywords reinforcement learninggraph ensemblesassortativitymicrocanonical ensemblesdegree-preserving rewiringsnull modelsjoint-degree matrix
0
0 comments X

The pith

A reinforcement learning framework generates graphs with exact prescribed assortativity through degree-preserving rewirings without parameter tuning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a method to create microcanonical graph ensembles that satisfy hard constraints on assortativity exactly, rather than only on average. It does so by training a policy to guide rewirings that change the joint-degree matrix as much as possible while keeping the degree sequence fixed. This approach removes the need for tuning parameters that appear in traditional sampling and produces diverse graphs faster than Metropolis-Hastings dynamics. Because the same policy works across different sizes and densities, it supplies reliable null models that separate the effect of assortativity from other network features.

Core claim

DMGG employs a policy-guided search that maximally alters the joint-degree matrix. This eliminates exhaustive parameter tuning and accelerates generation by at least an order of magnitude while preserving configurational diversity.

What carries the argument

The Deep Microcanonical Graph Generator (DMGG), a reinforcement learning policy trained to direct degree-preserving rewirings that maximize alteration of the joint-degree matrix toward a target assortativity value.

If this is right

  • Exact null models become available for any chosen assortativity value.
  • Secondary observables such as the clustering coefficient can be isolated quantitatively without ensemble artifacts.
  • Generation of constrained graphs speeds up by at least an order of magnitude compared with entropically dominated sampling.
  • The same trained policy works across different graph sizes, sparsities, and topologies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The policy-guided search could be adapted to enforce other hard structural constraints such as fixed clustering or modularity.
  • If the policy generalizes across graph classes, repeated retraining for each new instance may become unnecessary.
  • Faster exact sampling opens the possibility of studying structure-function relations on networks too large for conventional microcanonical methods.

Load-bearing premise

The reinforcement learning policy can be trained to reach exact target assortativity values through degree-preserving rewirings for arbitrary graph sizes, sparsities, and topologies without becoming trapped in configurations that prevent exact satisfaction of the constraint.

What would settle it

Run DMGG on a large sparse graph with a target assortativity that lies outside the range reachable by any sequence of degree-preserving rewirings; if the policy never reaches the target within a fixed number of steps, the claim fails.

read the original abstract

How network structure determines function is a fundamental question, and it can be investigated by graph ensembles with precisely controlled structural properties. Canonical approaches, formulated as exponential random graph models (ERGMs), enforce constraints only in expectation, allowing individual realizations to fluctuate around the target. Conversely, microcanonical ensembles impose hard constraints exactly, but practical sampling methods beyond fixing the degree sequence have remained out of reach. Here we introduce the Deep Microcanonical Graph Generator (DMGG), a reinforcement learning (RL) framework that transforms any given graph through degree-preserving rewirings to exactly reach a prescribed assortativity, which characterizes the degree--degree correlation of adjacent nodes. Instead of relying on the entropically dominated Metropolis--Hastings dynamics of the ERGM, DMGG employs a policy-guided search that maximally alters the joint-degree matrix. This eliminates exhaustive parameter tuning and accelerates generation by at least an order of magnitude while preserving configurational diversity. As DMGG generalizes across various graph sizes, sparsities, and topologies, it provides exact null models that allow for the quantitative isolation of secondary observables, such as the clustering coefficient. These results establish RL as a practical and powerful paradigm for generating hard-constrained graphs, opening avenues to investigate structure-function relationships free from ensemble artifacts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 0 minor

Summary. The manuscript introduces the Deep Microcanonical Graph Generator (DMGG), a reinforcement learning framework that applies policy-guided, degree-preserving rewirings to transform an input graph into one that exactly satisfies a prescribed assortativity target. It claims this eliminates exhaustive parameter tuning required by ERGM-style methods, delivers at least an order-of-magnitude speedup while preserving configurational diversity, and generalizes across graph sizes, sparsities, and topologies, thereby enabling exact null models for isolating secondary observables such as the clustering coefficient.

Significance. If the central empirical claims hold, DMGG would supply a practical route to sampling microcanonical ensembles with hard assortativity constraints, a capability that has been missing beyond degree-sequence fixing. This would strengthen quantitative structure-function studies by removing ensemble-fluctuation artifacts. The RL formulation is a novel direction for constrained graph generation and could extend to other hard constraints if the policy reliability generalizes.

major comments (3)
  1. [Abstract] Abstract: the assertion that DMGG 'accelerates generation by at least an order of magnitude' is load-bearing for the practical-utility claim yet is unsupported by any timing benchmarks, baseline comparisons (e.g., Metropolis-Hastings), or scaling plots; without these data the speedup cannot be evaluated.
  2. [Abstract] Abstract: the claim that the trained policy 'can be trained to reach exact target assortativity values' for 'arbitrary graph sizes, sparsities, and topologies' without trapping is central to the microcanonical guarantee, but the text provides no success-rate statistics, failure-mode analysis, or out-of-distribution tests that would verify connectivity of the rewiring space or policy robustness when the target lies far from the initial assortativity.
  3. [Abstract] Abstract: the statement that DMGG 'preserves configurational diversity' is asserted without any quantitative diversity metric (e.g., entropy of the joint-degree matrix distribution or comparison to ERGM samples) or ablation showing that the policy-guided search does not collapse the ensemble.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback on the abstract claims. We address each major comment point by point below and will revise the manuscript accordingly to provide the requested supporting evidence.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the assertion that DMGG 'accelerates generation by at least an order of magnitude' is load-bearing for the practical-utility claim yet is unsupported by any timing benchmarks, baseline comparisons (e.g., Metropolis-Hastings), or scaling plots; without these data the speedup cannot be evaluated.

    Authors: We agree that the abstract claim requires explicit empirical backing to be evaluated. The experimental sections of the manuscript contain runtime measurements, but these were not highlighted with dedicated benchmarks or scaling plots against Metropolis-Hastings. In the revision we will add a timing table, baseline comparisons, and scaling plots, and we will update the abstract to reference these results directly. revision: yes

  2. Referee: [Abstract] Abstract: the claim that the trained policy 'can be trained to reach exact target assortativity values' for 'arbitrary graph sizes, sparsities, and topologies' without trapping is central to the microcanonical guarantee, but the text provides no success-rate statistics, failure-mode analysis, or out-of-distribution tests that would verify connectivity of the rewiring space or policy robustness when the target lies far from the initial assortativity.

    Authors: We acknowledge that success-rate statistics, failure-mode analysis, and out-of-distribution tests are necessary to substantiate the generalization claim. The current manuscript reports results on a range of graphs but does not include aggregated success rates or explicit OOD tests for distant targets. We will add these quantitative statistics and analysis in the revised version to demonstrate policy robustness and rewiring-space connectivity. revision: yes

  3. Referee: [Abstract] Abstract: the statement that DMGG 'preserves configurational diversity' is asserted without any quantitative diversity metric (e.g., entropy of the joint-degree matrix distribution or comparison to ERGM samples) or ablation showing that the policy-guided search does not collapse the ensemble.

    Authors: We agree that the diversity claim should be supported by explicit metrics rather than assertion alone. The manuscript compares sample properties but lacks the requested quantitative measures such as joint-degree entropy or direct ERGM comparisons. We will incorporate these metrics and an ablation study in the revision to show that the policy-guided search maintains ensemble diversity. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper presents DMGG as a novel RL framework for exact microcanonical sampling via degree-preserving rewirings to target assortativity. No quoted steps reduce a claimed prediction or uniqueness result to a fitted parameter, self-citation, or definitional tautology. The method description contrasts with ERGM but does not rely on load-bearing self-citations or rename known results as new derivations. Central claims concern empirical performance of the trained policy, which remain falsifiable and independent of the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities beyond the high-level description of the RL policy and rewiring process.

pith-pipeline@v0.9.0 · 5760 in / 1100 out tokens · 31107 ms · 2026-05-25T04:38:32.675468+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

51 extracted references · 51 canonical work pages · 1 internal anchor

  1. [1]

    Jaynes, E.T.: Information theory and sta- tistical mechanics. Phys. Rev.106(4), 620 (1957)

  2. [2]

    Frank, O., Strauss, D.: Markov graphs. J. Am. Stat. Assoc.81(395), 832–842 (1986)

  3. [3]

    Park, J., Newman, M.E.: Statistical mechan- ics of networks. Phys. Rev. E70(6), 066117 (2004)

  4. [4]

    Bianconi, G.: Entropy of network ensembles. Phys. Rev. E79(3), 036114 (2009)

  5. [5]

    SIAM Rev.60(2), 315–355 (2018)

    Fosdick, B.K., Larremore, D.B., Nishimura, J., Ugander, J.: Configuring random graph models with fixed degree sequences. SIAM Rev.60(2), 315–355 (2018)

  6. [6]

    Newman, M.E., Strogatz, S.H., Watts, D.J.: Random graphs with arbitrary degree distri- butions and their applications. Phys. Rev. E 64(2), 026118 (2001)

  7. [7]

    Squartini, T., Mol, J., Hollander, F., Gar- laschelli, D.: Breaking of ensemble equiva- lence in networks. Phys. Rev. Lett.115(26), 268701 (2015)

  8. [8]

    Squartini, T., Mastrandrea, R., Garlaschelli, D.: Unbiased sampling of network ensembles. New J. Phys.17(2), 023052 (2015) 8

  9. [9]

    Radin, C., Yin, M.: Phase transitions in expo- nential random graphs. Ann. Appl. Probab., 2458–2471 (2013)

  10. [10]

    Chatterjee, S., Diaconis, P.: Estimating and understanding exponential random graph models. Ann. Stat.41(5), 2428–2461 (2013)

  11. [11]

    SIAM Rev.45(2), 167– 256 (2003)

    Newman, M.E.: The structure and function of complex networks. SIAM Rev.45(2), 167– 256 (2003)

  12. [12]

    Noh, J.D.: Percolation transition in networks with degree-degree correlation. Phys. Rev. E 76(2), 026116 (2007)

  13. [13]

    Goltsev, A.V., Dorogovtsev, S.N., Mendes, J.F.: Percolation on correlated networks. Phys. Rev. E78(5), 051105 (2008)

  14. [14]

    Van Mieghem, P., Wang, H., Ge, X., Tang, S., Kuipers, F.A.: Influence of assortativity and degree-preserving rewiring on the spectra of networks. Eur. Phys. J. B76(4), 643–652 (2010)

  15. [15]

    Avalos-Gayt´ an, V., Almendral, J.A., Papo, D., Schaeffer, S.E., Boccaletti, S.: Assortative and modular networks are shaped by adap- tive synchronization processes. Phys. Rev. E 86(1), 015101 (2012)

  16. [16]

    Bogun´ a, M., Pastor-Satorras, R.: Epidemic spreading in correlated complex networks. Phys. Rev. E66(4), 047104 (2002)

  17. [17]

    Zhou, D., Stanley, H.E., D’Agostino, G., Scala, A.: Assortativity decreases the robust- ness of interdependent networks. Phys. Rev. E86(6), 066103 (2012)

  18. [18]

    Bassler, K.E., Del Genio, C.I., Erd˝ os, P.L., Mikl´ os, I., Toroczkai, Z.: Exact sampling of graphs with prescribed degree correlations. New J. Phys.17, 083052 (2015)

  19. [19]

    Xulvi-Brunet, R., Sokolov, I.M.: Reshuffling scale-free networks: From random to assorta- tive. Phys. Rev. E70(6), 066102 (2004)

  20. [20]

    In: Proceedings of the 35th International Conference on Machine Learning (ICML), vol

    You, J., Ying, R., Ren, X., Hamilton, W., Leskovec, J.: GraphRNN: Generating realis- tic graphs with deep auto-regressive models. In: Proceedings of the 35th International Conference on Machine Learning (ICML), vol. 80 (2018)

  21. [21]

    In: Proceedings of the 33rd Conference on Neural Informa- tion Processing Systems (NeurIPS), vol

    Liao, R., Li, Y., Song, Y., Wang, S., Hamil- ton, W.L., Duvenaud, D., Urtasun, R., Zemel, R.: Efficient graph generation with graph recurrent attention networks. In: Proceedings of the 33rd Conference on Neural Informa- tion Processing Systems (NeurIPS), vol. 32 (2019)

  22. [22]

    In: Proceedings of the 27th International Conference on Arti- ficial Neural Networks (ICANN), Rhodes, Greece (2018)

    Simonovsky, M., Komodakis, N.: Graphvae: Towards generation of small graphs using variational autoencoders. In: Proceedings of the 27th International Conference on Arti- ficial Neural Networks (ICANN), Rhodes, Greece (2018)

  23. [23]

    In: Proceedings of the 35th International Conference on Machine Learn- ing (ICML), vol

    Jin, W., Barzilay, R., Jaakkola, T.: Junction tree variational autoencoder for molecular graph generation. In: Proceedings of the 35th International Conference on Machine Learn- ing (ICML), vol. 80 (2018)

  24. [24]

    In: Workshop on Theoretical Foundations and Applications of Deep Generative Mod- els at the 35th International Conference on Machine Learning (ICML) (2018)

    De Cao, N., Kipf, T.: MolGAN: An implicit generative model for small molecular graphs. In: Workshop on Theoretical Foundations and Applications of Deep Generative Mod- els at the 35th International Conference on Machine Learning (ICML) (2018)

  25. [25]

    In: Pro- ceedings of the 39th International Confer- ence on Machine Learning (ICML), vol

    Martinkus, K., Loukas, A., Perraudin, N., Wattenhofer, R.: SPECTRE: Spectral con- ditioning helps to overcome the expressivity limits of one-shot graph generators. In: Pro- ceedings of the 39th International Confer- ence on Machine Learning (ICML), vol. 162 (2022)

  26. [26]

    In: Proceedings of the Eleventh Interna- tional Conference on Learning Representa- tions (ICLR) (2023)

    Vignac, C., Krawczuk, I., Siraudin, A., Wang, B., Cevher, V., Frossard, P.: DiGress: Dis- crete denoising diffusion for graph generation. In: Proceedings of the Eleventh Interna- tional Conference on Learning Representa- tions (ICLR) (2023)

  27. [27]

    9 In: Proceedings of the 22nd IEEE Interna- tional Conference on Data Mining (ICDM) (2022)

    Huang, H., Sun, L., Du, B., Fu, Y., Lv, W.: GraphGDP: Generative diffusion processes for permutation invariant graph generation. 9 In: Proceedings of the 22nd IEEE Interna- tional Conference on Data Mining (ICDM) (2022)

  28. [28]

    ACS Cent

    G´ omez-Bombarelli, R., Wei, J.N., Duve- naud, D., Hern´ andez-Lobato, J.M., S´ anchez-Lengeling, B., Sheberla, D., Aguilera-Iparraguirre, J., Hirzel, T.D., Adams, R.P., Aspuru-Guzik, A.: Automatic chemical design using a data-driven continu- ous representation of molecules. ACS Cent. Sci.4(2), 268–276 (2018)

  29. [29]

    Proceedings of the Royal Society A: Mathematical, Physical and Engi- neering Sciences477(2254) (2021)

    Darvariu, V.-A., Hailes, S., Musolesi, M.: Goal-directed graph construction using rein- forcement learning. Proceedings of the Royal Society A: Mathematical, Physical and Engi- neering Sciences477(2254) (2021)

  30. [30]

    Transac- tions on Machine Learning Research (2023)

    Yang, S., KAILI, M., Wang, B., Yu, T., Zha, H.: Learning to boost resilience of complex networks via neural edge rewiring. Transac- tions on Machine Learning Research (2023)

  31. [31]

    Newman, M.E.: Assortative mixing in net- works. Phys. Rev. Lett.89(20), 208701 (2002)

  32. [32]

    Newman, M.E.: Mixing patterns in networks. Phys. Rev. E67(2), 026126 (2003)

  33. [33]

    Metropolis, N., Rosenbluth, A.W., Rosen- bluth, M.N., Teller, A.H., Teller, E.: Equation of state calculations by fast computing machines. J. Chem. Phys.21(6), 1087–1092 (1953)

  34. [34]

    Biometrika57(1), 97–109 (1970)

    Hastings, W.K.: Monte carlo sampling meth- ods using markov chains and their applica- tions. Biometrika57(1), 97–109 (1970)

  35. [35]

    Bellman, R.: A markovian decision process. J. Math. Mech., 679–684 (1957)

  36. [36]

    IEEE Trans

    Sutton, R.S., Barto, A.G.: Reinforcement learning: An introduction. IEEE Trans. Neu- ral Netw.9(5), 1054–1054 (1998)

  37. [37]

    Nature 393(6684), 440–442 (1998)

    Watts, D.J., Strogatz, S.H.: Collective dynamics of ’small-world’networks. Nature 393(6684), 440–442 (1998)

  38. [38]

    Erd˝ os, P., R´ enyi, A.: On the evolution of ran- dom graphs. Publ. Math. Inst. Hungar. Acad. Sci5(1), 17–61 (1960)

  39. [39]

    Science 286(5439), 509–512 (1999)

    Barab´ asi, A.-L., Albert, R.: Emergence of scaling in random networks. Science 286(5439), 509–512 (1999)

  40. [40]

    Proximal Policy Optimization Algorithms

    Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal pol- icy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)

  41. [41]

    Touchette, H.: Equivalence and nonequiv- alence of ensembles: Thermodynamic, macrostate, and measure levels. J. Stat. Phys.159, 987–1016 (2015)

  42. [42]

    Social Networks5(2), 109–137 (1983)

    Holland, P.W., Laskey, K.B., Leinhardt, S.: Stochastic blockmodels: First steps. Social Networks5(2), 109–137 (1983)

  43. [43]

    Dall, J., Christensen, M.: Random geometric graphs. Phys. Rev. E66(1), 016121 (2002)

  44. [44]

    Chung, F., Lu, L.: The average distances in random graphs with given expected degrees. Proc. Natl. Acad. Sci. USA99(25), 15879– 15882 (2002)

  45. [45]

    Holme, P., Kim, B.J.: Growing scale-free net- works with tunable clustering. Phys. Rev. E 65(2), 026107 (2002)

  46. [46]

    In: Proceedings of the 32nd AAAI Conference on Artificial Intelli- gence (AAAI), vol

    Tavakoli, A., Pardo, F., Kormushev, P.: Action branching architectures for deep rein- forcement learning. In: Proceedings of the 32nd AAAI Conference on Artificial Intelli- gence (AAAI), vol. 32 (2018)

  47. [47]

    In: Proceedings of the 35th International Florida Artificial Intelli- gence Research Society Conference (FLAIRS) (2022)

    Huang, S., Onta˜ n´ on, S.: A closer look at invalid action masking in policy gra- dient algorithms. In: Proceedings of the 35th International Florida Artificial Intelli- gence Research Society Conference (FLAIRS) (2022)

  48. [48]

    Xu, K., Hu, W., Leskovec, J., Jegelka, S.: How powerful are graph neural networks? In: Pro- ceedings of the 7th International Conference on Learning Representations (ICLR) (2019) 10

  49. [49]

    In: Pro- ceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI), vol

    Perez, E., Strub, F., De Vries, H., Dumoulin, V., Courville, A.: FiLM: Visual reasoning with a general conditioning layer. In: Pro- ceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI), vol. 32 (2018)

  50. [50]

    In: Proceedings of the 34th International Conference on Machine Learning (ICML), vol

    Pathak, D., Agrawal, P., Efros, A.A., Dar- rell, T.: Curiosity-driven exploration by self- supervised prediction. In: Proceedings of the 34th International Conference on Machine Learning (ICML), vol. 70 (2017)

  51. [51]

    Ng, A.Y., Harada, D., Russell, S.: Pol- icy invariance under reward transformations: Theory and application to reward shaping. In: Proceedings of the 16th International Conference on Machine Learning (ICML) (1999) Methods Canonical ensemble via ERGM ERGM defines a graph distributionP(G) on a graph spaceGby maximizing the Shannon entropy subject to soft co...