On Halting vs Converging in Recurrent Graph Neural Networks
Pith reviewed 2026-05-07 16:42 UTC · model grok-4.3
The pith
Converging recurrent graph neural networks match the expressiveness of graded-bisimulation-invariant halting ones over undirected graphs, capturing exactly the graded modal mu-calculus.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Over undirected graphs, converging RGNNs are equally expressive as graded-bisimulation-invariant halting RGNNs, while output-converging RGNNs are at least as expressive. Combined with prior results on halting RGNNs, this shows that, relative to the classifiers expressible in monadic second-order logic, converging RGNNs express exactly the graded modal mu-calculus, and output-converging RGNNs express at least that. The results hold even when restricting to ReLU networks with sum aggregation. The main technical challenge of simulating halting RGNNs by converging ones is solved by a traffic-light protocol that enables vertices to coordinate despite asynchrony.
What carries the argument
The traffic-light protocol, which uses coordinated signals to synchronize local halting decisions across vertices in converging RGNNs and prevent desynchronization.
If this is right
- Converging RGNNs can realize every classifier in the graded modal mu-calculus.
- Output-converging RGNNs can realize at least every classifier in the graded modal mu-calculus.
- Full expressiveness is retained when using only ReLU activations and sum aggregation.
- The equivalence classes remain valid relative to all classifiers expressible in monadic second-order logic.
Where Pith is reading between the lines
- The coordination method may extend to directed graphs if similar synchronization signals can be defined.
- Empirical checks on small graphs with known mu-calculus properties could verify whether the theoretical equivalence appears in trained models.
- The unification suggests that convergence guarantees need not sacrifice logical power in recurrent graph models.
Load-bearing premise
The traffic-light protocol coordinates halting decisions without desynchronization issues, and the results apply to undirected graphs with graded-bisimulation-invariant halting classifiers.
What would settle it
A concrete undirected graph and graded-bisimulation-invariant halting RGNN classifier that no converging RGNN can match, or an input where the traffic-light protocol allows persistent desynchronization in halting times.
read the original abstract
Recurrent Graph Neural Networks (RGNNs) extend standard GNNs by iterating message-passing until some stopping condition is met. Various RGNN models have been proposed in the literature. In this paper, we study three such models: converging RGNNs, where all vertex representations must stabilise; output-converging RGNNs, where only the output classifications must stabilise; and halting RGNNs, where a per-vertex halting classifier determines when to stop. We establish expressiveness relationships between these models: over undirected graphs, converging RGNNs are equally expressive as graded-bisimulation-invariant halting RGNNs, while output-converging RGNNs are at least as expressive. Combined with prior results on halting RGNNs, this shows that, relative to the classifiers expressible in monadic second-order logic (MSO), converging RGNNs express exactly the graded modal $\mu$-calculus ($\mu$GML), and output-converging RGNNs express at least $\mu$GML. These results hold even when restricting to ReLU networks with sum aggregation. The main technical challenge is simulating halting RGNNs by converging ones: without a global halting classifier, vertices may locally decide to halt at different times, causing desynchronisation. We develop a "traffic-light" protocol that enables vertices to coordinate despite this asynchrony. Our results answer an open question from Bollen et al. (2025) and show that the RGNN model of Pflueger et al. (2024) retains full $\mu$GML expressiveness even when convergence is guaranteed.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies expressiveness relationships among three RGNN variants—converging RGNNs (all vertex representations stabilize), output-converging RGNNs (only outputs stabilize), and halting RGNNs (per-vertex halting classifier)—over undirected graphs. It claims that converging RGNNs are equally expressive as graded-bisimulation-invariant halting RGNNs, while output-converging RGNNs are at least as expressive. Combined with prior results on halting RGNNs, this yields that converging RGNNs express exactly the graded modal μ-calculus (μGML) relative to MSO classifiers, and output-converging RGNNs express at least μGML; the results hold even for ReLU networks with sum aggregation. The central technical device is a “traffic-light” protocol that lets vertices coordinate halting decisions asynchronously to simulate halting behavior inside a converging model.
Significance. If the simulation argument holds, the paper resolves an open question from Bollen et al. (2025) and shows that the Pflueger et al. (2024) RGNN model retains its full μGML expressiveness even when convergence is enforced. The explicit connection to graded-bisimulation-invariant classifiers and the restriction to the ReLU + sum-aggregation fragment are valuable for both theory and practice; the work supplies a clean hierarchy among RGNN termination regimes and strengthens the logical characterization of recurrent message-passing.
major comments (3)
- [§4] §4 (Traffic-light protocol): The claim that the protocol forces eventual global stabilization on every undirected graph while producing exactly the same per-vertex outputs as the original halting classifier is load-bearing for the “equally expressive” direction. The provided sketch does not contain an explicit lemma ruling out persistent oscillation on graphs with odd-length cycles or heterogeneous degrees; a concrete invariant or potential-function argument showing termination in O(diam(G)) steps is required.
- [§4.3] §4.3 (Simulation correctness): The reduction must preserve graded-bisimulation invariance of the simulated halting classifier. It is not shown that the extra state introduced by the traffic-light bits (red/green) cannot increase the distinguishing power beyond graded bisimulation; an explicit argument that the augmented state remains within the graded-bisimulation quotient is needed.
- [Theorem 5.1] Theorem 5.1 (Equivalence to μGML): The “exactly μGML” direction for converging RGNNs relies on both the new simulation and the prior halting-RGNN characterization. Because the simulation is the only new ingredient, any gap in the protocol immediately weakens the exact characterization; the proof should isolate the simulation lemma so that it can be checked independently of the prior results.
minor comments (2)
- Notation for the traffic-light states (red/green) is introduced without a compact tabular summary; a small table listing the four possible (own-light, neighbor-light) combinations and the resulting update rule would improve readability.
- The abstract states the results hold “even when restricting to ReLU networks with sum aggregation,” but the main text does not explicitly restate the activation and aggregation functions used in the simulation construction; a single sentence confirming that the protocol uses only ReLU and sum would eliminate ambiguity.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review. The comments identify places where the sketch in the current manuscript can be strengthened with explicit arguments, and we will revise accordingly to address them fully.
read point-by-point responses
-
Referee: [§4] §4 (Traffic-light protocol): The claim that the protocol forces eventual global stabilization on every undirected graph while producing exactly the same per-vertex outputs as the original halting classifier is load-bearing for the “equally expressive” direction. The provided sketch does not contain an explicit lemma ruling out persistent oscillation on graphs with odd-length cycles or heterogeneous degrees; a concrete invariant or potential-function argument showing termination in O(diam(G)) steps is required.
Authors: We agree that the termination argument in the current sketch of §4 is insufficiently detailed. In the revised manuscript we will add an explicit Lemma 4.1 that defines a potential function Φ(t) equal to the sum, over all vertices, of a state value (2 for red, 1 for green, 0 for halted) plus a diameter-scaled term that accounts for propagation distance. We prove that Φ decreases by at least 1 every two rounds and that, on any undirected graph, the protocol reaches a globally stable configuration (with identical outputs to the original halting classifier) in at most 2·diam(G) steps. The argument handles odd-length cycles by alternating red/green phases that respect local parity and heterogeneous degrees because it relies only on sum aggregation, not on uniform degree assumptions. This lemma will be stated and proved independently of the μGML results. revision: yes
-
Referee: [§4.3] §4.3 (Simulation correctness): The reduction must preserve graded-bisimulation invariance of the simulated halting classifier. It is not shown that the extra state introduced by the traffic-light bits (red/green) cannot increase the distinguishing power beyond graded bisimulation; an explicit argument that the augmented state remains within the graded-bisimulation quotient is needed.
Authors: We accept that an explicit invariance argument is required. The revised §4.3 will contain a new Proposition 4.3 proving that the traffic-light bits are graded-bisimulation invariant. Because the bits are computed from neighborhood sums and the protocol uses only the same sum-aggregation and ReLU operations as the original model, any two vertices that are graded bisimilar receive identical traffic-light states at corresponding time steps. Consequently the augmented state vector (original representation plus red/green bit) lies in the same graded-bisimulation quotient as the original halting state, so the simulation does not increase distinguishing power. revision: yes
-
Referee: [Theorem 5.1] Theorem 5.1 (Equivalence to μGML): The “exactly μGML” direction for converging RGNNs relies on both the new simulation and the prior halting-RGNN characterization. Because the simulation is the only new ingredient, any gap in the protocol immediately weakens the exact characterization; the proof should isolate the simulation lemma so that it can be checked independently of the prior results.
Authors: We agree that isolating the simulation result improves clarity and verifiability. In the revision we will extract the traffic-light simulation as a self-contained Theorem 4.1 (“Converging RGNNs are equally expressive as graded-bisimulation-invariant halting RGNNs over undirected graphs”). The proof of Theorem 5.1 will then be reorganized to cite Theorem 4.1 first and only afterwards combine it with the existing halting-RGNN-to-μGML characterization (relative to MSO classifiers). This separation makes the new technical contribution independent and easier to check. revision: yes
Circularity Check
Minor self-citation to prior halting RGNN results; new independent simulation
full rationale
The paper develops a novel traffic-light protocol to establish equivalence between converging RGNNs and graded-bisimulation-invariant halting RGNNs over undirected graphs, providing an independent construction that does not reduce to prior inputs. It combines this with results from Bollen et al. (2025) to conclude exact μGML expressiveness, but the self-citation is not load-bearing for the new equivalence or the protocol's correctness. No self-definitional reductions, fitted inputs renamed as predictions, ansatz smuggling, or renaming of known results occur. The derivation remains self-contained via the explicit protocol and external MSO/μGML characterizations.
Axiom & Free-Parameter Ledger
axioms (3)
- standard math Properties of monadic second-order logic (MSO) for expressing classifiers
- domain assumption Undirected graphs as input structure
- domain assumption Graded bisimulation invariance for halting RGNNs
invented entities (1)
-
traffic-light protocol
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Ahvonen, V.; Heiman, D.; Kuusisto, A.; and Lutz, C. 2024. Logical characterizations of recurrent graph neural networks with reals and floats. In Globerson, A.; Mackey, L.; Belgrave, D.; Fan, A.; Paquet, U.; Tomczak, J.; and Zhang, C., eds., Advances in Neural Information Processing Systems , volume 37, 104205--104249. Curran Associates, Inc
work page 2024
-
[2]
Ahvonen, V.; Heiman, D.; and Kuusisto, A. 2025. Graph neural networks and mso
work page 2025
-
[3]
V.; Monet, M.; Pérez, J.; Reutter, J.; and Silva, J
Barceló, P.; Kostylev, E. V.; Monet, M.; Pérez, J.; Reutter, J.; and Silva, J. P. 2020. The logical expressiveness of graph neural networks. In International Conference on Learning Representations
work page 2020
-
[4]
Bollen, J.; Van den Bussche, J.; Vansummeren, S.; and Virtema, J. 2025. Halting Recurrent GNNs and the Graded mu-Calculus . In Proceedings of the 22nd International Conference on Principles of Knowledge Representation and Reasoning , 175--184
work page 2025
-
[5]
Colcombet, T.; Doumane, A.; and Kuperberg, D. 2025. Tree Algebras and Bisimulation-Invariant MSO on Finite Graphs . In Censor-Hillel, K.; Grandoni, F.; Ouaknine, J.; and Puppis, G., eds., 52nd International Colloquium on Automata, Languages, and Programming (ICALP 2025) , volume 334 of Leibniz International Proceedings in Informatics (LIPIcs) , 152:1--152...
work page 2025
-
[6]
Corso, G.; Cavalleri, L.; Beaini, D.; Li\` o , P.; and Veli c kovi\' c , P. 2020. Principal neighbourhood aggregation for graph nets. In Larochelle, H.; Ranzato, M.; Hadsell, R.; Balcan, M.; and Lin, H., eds., Advances in Neural Information Processing Systems , volume 33, 13260--13271. Curran Associates, Inc
work page 2020
-
[7]
Gilmer, J.; Schoenholz, S. S.; Riley, P. F.; Vinyals, O.; and Dahl, G. E. 2017. Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning - Volume 70 , ICML'17, 1263–1272. JMLR.org
work page 2017
-
[8]
Grohe, M. 2021. The logic of graph neural networks. In 2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) , 1--17
work page 2021
-
[9]
Hamilton, W.; Ying, Z.; and Leskovec, J. 2017. Inductive representation learning on large graphs. In Guyon, I.; Luxburg, U. V.; Bengio, S.; Wallach, H.; Fergus, R.; Vishwanathan, S.; and Garnett, R., eds., Advances in Neural Information Processing Systems , volume 30. Curran Associates, Inc
work page 2017
-
[10]
Hamilton, W. L. 2020. Graph representation learning. Synthesis Lectures on Artificial Intelligence and Machine Learning 14(3):1--159
work page 2020
-
[11]
Janin, D., and Lenzi, G. 2001. Relating levels of the mu-calculus hierarchy and levels of the monadic hierarchy. In 16th Annual IEEE Symposium on Logic in Computer Science, Boston, Massachusetts, USA, June 16-19, 2001, Proceedings , 347--356. IEEE Computer Society
work page 2001
-
[12]
Kipf, T. N., and Welling, M. 2017. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations
work page 2017
-
[13]
Kuusisto, A. 2013. Modal Logic and Distributed Message Passing Automata . In Ronchi Della Rocca, S., ed., Computer Science Logic 2013 (CSL 2013) , volume 23 of Leibniz International Proceedings in Informatics (LIPIcs) , 452--468. Dagstuhl, Germany: Schloss Dagstuhl -- Leibniz-Zentrum f \"u r Informatik
work page 2013
-
[14]
Morris, C.; Ritzert, M.; Fey, M.; Hamilton, W. L.; Lenssen, J. E.; Rattan, G.; and Grohe, M. 2019. Weisfeiler and leman go neural: Higher-order graph neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 33(01):4602--4609
work page 2019
-
[15]
Pflueger, M.; Tena Cucala, D.; and Kostylev, E. V. 2024. Recurrent graph neural networks and their connections to bisimulation and logic. Proceedings of the AAAI Conference on Artificial Intelligence 38(13):14608--14616
work page 2024
-
[16]
Rosenbluth, E., and Grohe, M. 2025. Repetition makes perfect: Recurrent graph neural networks match message-passing limit
work page 2025
-
[17]
C.; Hagenbuchner, M.; and Monfardini, G
Scarselli, F.; Gori, M.; Tsoi, A. C.; Hagenbuchner, M.; and Monfardini, G. 2009. The graph neural network model. IEEE Transactions on Neural Networks 20(1):61--80
work page 2009
-
[18]
Walukiewicz, I. 2002. Monadic second-order logic on tree-like structures. Theor. Comput. Sci. 275(1–2):311–346
work page 2002
-
[19]
Xu, K.; Hu, W.; Leskovec, J.; and Jegelka, S. 2019. How powerful are graph neural networks? In International Conference on Learning Representations
work page 2019
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.