On the Convergence and Size Transferability of Continuous-depth Graph Neural Networks

Charles Kulick; Mingsong Yan; Sui Tang

arxiv: 2510.03923 · v2 · submitted 2025-10-04 · 💻 cs.LG · cs.AI

Pith reviewed 2026-05-18 09:55 UTC · model grok-4.3

keywords: graph neural differential equations · graphon neural differential equations · convergence analysis · size transferability · infinite node limit · graphons · neural ODEs

The pith

Graph Neural Differential Equations converge to Graphon-NDE solutions in the infinite-node limit.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Continuous-depth graph neural networks, known as Graph Neural Differential Equations (GNDEs), blend graph structure with neural ODEs to model dynamics on graphs. The paper proves that GNDE solutions converge trajectory-wise to those of Graphon-NDEs, which are defined on graphons as the infinite-node limit of GNDEs. Explicit convergence rates are derived for two cases: weighted graphs sampled from smooth graphons and unweighted graphs sampled from {0,1}-valued graphons. Size transferability bounds are also established, showing that models trained on moderate-sized graphs can be applied to larger, structurally similar graphs without retraining. This supplies a theoretical foundation for why such models scale across graph sizes in practice.

Core claim

We prove the trajectory-wise convergence of GNDE solutions to Graphon-NDE solutions. Moreover, we derive explicit convergence rates under two deterministic graph sampling regimes and establish size transferability bounds providing theoretical justification for transferring GNDE models trained on moderate-sized graphs to larger graphs without retraining.

What carries the argument

Graphon-NDEs, defined as the infinite-node limit of GNDEs, which enable the use of graphon theory and dynamical systems tools to prove convergence and rates.
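A schematic way to write the two objects, assuming a single message-passing layer with time-varying weights $\Theta(t)$, activation $\sigma$, and $1/n$ normalization. The paper's exact architecture is not reproduced in this summary, so every symbol below is an illustrative assumption rather than the authors' definition.

```latex
% GNDE on an n-node graph with adjacency A^{(n)} and node features x_i(t) \in \mathbb{R}^d
\frac{d}{dt} x_i(t)
  = \sigma\!\Big( \frac{1}{n} \sum_{j=1}^{n} A^{(n)}_{ij}\, x_j(t)\, \Theta(t) \Big),
  \qquad i = 1, \dots, n.

% Graphon-NDE: node indices become points u \in [0,1],
% and the normalized sum becomes an integral against the graphon W
\partial_t X(u,t)
  = \sigma\!\Big( \int_0^1 W(u,v)\, X(v,t)\, \Theta(t)\, dv \Big),
  \qquad u \in [0,1].
```

In this form the same $\Theta(t)$ appears in both systems, which is what allows a convergence statement to be read as a statement about transferring parameters across graph sizes.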

If this is right

GNDE solutions converge trajectory-wise to Graphon-NDE solutions in the infinite-node limit.
Explicit convergence rates hold for weighted graphs sampled from smooth graphons.
Explicit convergence rates hold for unweighted graphs sampled from 0-1 valued graphons.
Size transferability bounds allow GNDE models trained on moderate graphs to apply to larger structurally similar graphs.
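The transferability consequence can be illustrated with a small numerical sketch. Everything here is an assumption made for illustration and is not taken from the paper: the graphon `smooth_graphon`, the weights `theta`, the initial features, the forward-Euler integrator, and the step-function comparison between graph sizes. The same parameters are applied, without retraining, to a small and a large graph sampled from the same graphon, and the two outputs are compared.

```python
import numpy as np

def smooth_graphon(u, v):
    # Hypothetical smooth graphon, chosen only for illustration.
    return np.exp(-(u - v) ** 2)

def theta(t):
    # Hypothetical time-varying 2x2 weights, shared by every graph size.
    return np.array([[np.cos(t), -0.5], [0.5, np.sin(t)]])

def solve_gnde(n, T=1.0, steps=200):
    """Forward-Euler solve of dX/dt = tanh((A/n) X theta(t)) on n nodes."""
    u = (np.arange(n) + 0.5) / n                 # latent node positions in [0, 1]
    A = smooth_graphon(u[:, None], u[None, :])   # weighted adjacency matrix
    X = np.stack([np.sin(2 * np.pi * u), np.cos(2 * np.pi * u)], axis=1)
    dt = T / steps
    for k in range(steps):
        X = X + dt * np.tanh((A @ X) @ theta(k * dt) / n)
    return X

def transfer_error(n_small, n_large):
    """L2 distance between step-function versions of the two final states."""
    assert n_large % n_small == 0
    # Each small-graph node is copied onto the large-graph nodes in its cell.
    X_small = np.repeat(solve_gnde(n_small), n_large // n_small, axis=0)
    X_large = solve_gnde(n_large)
    return np.sqrt(np.mean(np.sum((X_small - X_large) ** 2, axis=1)))

if __name__ == "__main__":
    for n in (20, 40, 80):
        print(n, transfer_error(n, 320))
```

Under these assumptions the printed discrepancy should shrink as the smaller graph grows, which is the qualitative behavior a transferability bound describes. The sketch says nothing about the constants or rates proved in the paper.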

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The convergence framework may extend to other continuous-depth graph models beyond ODEs.
Practitioners could use the bounds to select training graph sizes that guarantee acceptable error on target larger graphs.
The results connect discrete graph dynamics to mean-field type limits, potentially informing stability analysis for very large networks.
Similar transferability arguments might apply when graphs evolve over time rather than remaining static.

Load-bearing premise

Graphon-NDEs are well-posed and graphon theory with dynamical systems tools applies to establish the infinite-node limit and convergence for graphs sampled from smooth or 0-1 valued graphons.

What would settle it

Numerical observation that GNDE solution trajectories do not approach Graphon-NDE trajectories, or fail to match the stated explicit rates, as the number of nodes increases under either the smooth graphon or 0-1 graphon sampling regime.
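A check of this kind could look like the following sketch for the unweighted regime. The {0,1}-valued graphon `binary_graphon`, the scalar dynamics, the large reference graph standing in for the Graphon-NDE solution, and the log-log slope estimate are all assumptions made for illustration. None of them is the paper's experimental setup.

```python
import numpy as np

def binary_graphon(u, v):
    # Hypothetical {0,1}-valued graphon: connect nodes with close latent positions.
    return (np.abs(u - v) < 0.3).astype(float)

def gnde_final_state(n, T=1.0, steps=100):
    """Forward-Euler solve of dx/dt = tanh(theta(t) * (A x) / n), scalar features."""
    u = (np.arange(n) + 0.5) / n
    A = binary_graphon(u[:, None], u[None, :])   # unweighted adjacency matrix
    x = np.cos(2 * np.pi * u)
    dt = T / steps
    for k in range(steps):
        t = k * dt
        # Hypothetical scalar time-varying weight theta(t) = 1 + 0.5 sin(t).
        x = x + dt * np.tanh((1.0 + 0.5 * np.sin(t)) * (A @ x) / n)
    return x

def empirical_rate(sizes=(16, 32, 64, 128), n_ref=1024):
    """Errors against a large reference graph, plus the fitted log-log slope."""
    x_ref = gnde_final_state(n_ref)
    errors = []
    for n in sizes:
        # Lift the n-node state to the reference grid as a step function.
        x_n = np.repeat(gnde_final_state(n), n_ref // n)
        errors.append(np.sqrt(np.mean((x_n - x_ref) ** 2)))
    slope = np.polyfit(np.log(sizes), np.log(errors), 1)[0]
    return errors, slope

if __name__ == "__main__":
    errors, slope = empirical_rate()
    print(errors, slope)
```

A fitted slope that stays near zero, or errors that fail to shrink as the node count grows, would be the kind of observation described above. A clearly negative slope is consistent with convergence but does not by itself confirm the specific rates stated in the paper.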

Original abstract

Continuous-depth graph neural networks, also known as Graph Neural Differential Equations (GNDEs), combine the structural inductive bias of Graph Neural Networks (GNNs) with the continuous-depth architecture of Neural ODEs, offering a scalable and principled framework for modeling dynamics on graphs. In this paper, we present a rigorous convergence analysis of GNDEs with time-varying parameters in the infinite-node limit, providing theoretical insights into their size transferability. To this end, we introduce Graphon Neural Differential Equations (Graphon-NDEs) as the infinite-node limit of GNDEs and establish their well-posedness. Leveraging tools from graphon theory and dynamical systems, we prove the trajectory-wise convergence of GNDE solutions to Graphon-NDE solutions. Moreover, we derive explicit convergence rates under two deterministic graph sampling regimes: (1) weighted graphs sampled from smooth graphons, and (2) unweighted graphs sampled from $\{0,1\}$-valued (discontinuous) graphons. We further establish size transferability bounds, providing theoretical justification for the practical strategy of transferring GNDE models trained on moderate-sized graphs to larger, structurally similar graphs without retraining. Numerical experiments using synthetic and real data support our theoretical findings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper establishes explicit convergence rates for GNDEs to Graphon-NDEs under both smooth and {0,1} graphons plus transferability bounds, but the well-posedness step for discontinuous cases is the part that needs verification.

Hi, the one or two things to know about this paper are that it introduces Graphon-NDEs as the limit object for GNDEs with time-varying parameters and then derives explicit convergence rates under two graph sampling regimes, plus size transferability bounds.

What the paper does well is to extend the graphon framework to continuous-depth models. The trajectory-wise convergence result and the rates for both smooth and discontinuous graphons are the new pieces. Having bounds that justify transferring a model trained on a moderate-sized graph to a larger one without retraining is practically relevant, and the experiments on synthetic and real data give some reassurance that the theory is not just formal.

The soft spots are mainly around the well-posedness claim for the Graphon-NDE when the graphon is {0,1}-valued. The concern is whether the time-dependent vector field remains sufficiently regular for existence and uniqueness in the infinite-dimensional setting. For smooth graphons this is standard, but with jumps the integral operator may only be bounded, and if the Lipschitz constant blows up or the Osgood condition fails, the subsequent estimates using Gronwall would not go through. The abstract says they establish well-posedness, so presumably they have conditions that make it work, but that part needs careful reading to see if the assumptions are realistic or overly restrictive.

This paper is for researchers in machine learning theory who work on graph dynamical systems or limits of neural networks on graphs. Someone looking for theoretical support for size transfer in GNNs would find it relevant.

I would recommend sending it for peer review. The work is grounded enough in existing tools that a referee can evaluate the proofs and the applicability of the assumptions.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces Graphon Neural Differential Equations (Graphon-NDEs) as the infinite-node limit of Graph Neural Differential Equations (GNDEs) with time-varying parameters. It establishes their well-posedness and proves trajectory-wise convergence of GNDE solutions to Graphon-NDE solutions, deriving explicit convergence rates under weighted sampling from smooth graphons and unweighted sampling from {0,1}-valued graphons. Size transferability bounds are provided to justify transferring models from moderate to larger graphs, supported by numerical experiments.

Significance. If the derivations hold, this work is significant for providing theoretical justification for size transferability in continuous-depth GNNs via graphon limits. The explicit convergence rates under both smooth and discontinuous graphon regimes, combined with the use of cut-norm estimates and dynamical systems tools, strengthen the foundation for scalable graph modeling and offer falsifiable predictions that can be checked numerically.

major comments (1)

[§4 (Well-posedness of Graphon-NDEs)] The trajectory-wise convergence and explicit rates rest on well-posedness of Graphon-NDEs as ODEs on L^2[0,1]. For {0,1}-valued graphons the induced integral operator is bounded but the time-varying vector field must satisfy uniform Lipschitz (or Osgood) conditions in the chosen function space; the manuscript should expand the verification that Carathéodory conditions hold when the GNN layers are composed with a discontinuous graphon, as failure here would invalidate the subsequent Gronwall estimates used for convergence.

minor comments (2)

[Abstract] The abstract could state the precise function space and norm in which convergence is measured to allow readers to assess the result quickly.
[Numerical Experiments] Figure captions in the experimental section should report the exact graph sizes and number of Monte-Carlo repetitions used to generate the plotted transferability errors.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading of our manuscript and for the constructive comment on the well-posedness analysis. We address the point below and will incorporate the requested expansion in the revised version.

Referee: §4 (Well-posedness of Graphon-NDEs): The trajectory-wise convergence and explicit rates rest on well-posedness of Graphon-NDEs as ODEs on L^2[0,1]. For {0,1}-valued graphons the induced integral operator is bounded but the time-varying vector field must satisfy uniform Lipschitz (or Osgood) conditions in the chosen function space; the manuscript should expand the verification that Carathéodory conditions hold when the GNN layers are composed with a discontinuous graphon, as failure here would invalidate the subsequent Gronwall estimates used for convergence.

Authors: We appreciate this observation and agree that a more explicit verification strengthens the presentation. In the revised manuscript we will expand Section 4 with a dedicated paragraph (and, if space permits, a short appendix) that directly checks the Carathéodory conditions for the time-varying vector field on L^2[0,1] when the underlying graphon is {0,1}-valued. Under the standing assumptions that the GNN layers are Lipschitz continuous and the activation is bounded, we verify that the resulting map (t,u) ↦ f(t,u) is measurable in t for every fixed u, continuous in u for almost every t, and satisfies an integrable bound independent of the discontinuity of the graphon. The boundedness of the integral operator follows from the L^∞ norm of the graphon, which is finite by definition. These properties ensure local existence and uniqueness via the standard Carathéodory theorem on Banach spaces, thereby justifying the subsequent application of Gronwall's inequality in the convergence proofs. The main theorems and rates remain unchanged; only the exposition is augmented.

revision: yes
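The chain of reasoning discussed in this exchange (bounded operator, Lipschitz vector field, Gronwall estimate) can be sketched generically as follows. This is a textbook-style outline under assumed notation, not the paper's proof: $T_W$, $L_\sigma$, $\Theta(t)$, $F$, $e_n$ and $\delta_n$ are illustrative symbols introduced here.

```latex
% Integral operator induced by a graphon W with \|W\|_{L^\infty} < \infty
(T_W X)(u) = \int_0^1 W(u,v)\, X(v)\, dv,
\qquad
\|T_W X\|_{L^2[0,1]} \le \|W\|_{L^\infty}\, \|X\|_{L^2[0,1]} .

% Lipschitz bound for F(t,X) = \sigma\big( (T_W X)\, \Theta(t) \big),
% with \sigma being L_\sigma-Lipschitz
\|F(t,X) - F(t,Y)\|_{L^2}
  \le L(t)\, \|X - Y\|_{L^2},
\qquad
L(t) := L_\sigma\, \|\Theta(t)\|\, \|W\|_{L^\infty} .

% Gronwall step: if the error e_n(t) between GNDE and Graphon-NDE trajectories
% satisfies, for some consistency term \delta_n \ge 0,
e_n(t) \le e_n(0) + \int_0^t \big( L(s)\, e_n(s) + \delta_n(s) \big)\, ds ,
% then
e_n(t) \le \Big( e_n(0) + \int_0^t \delta_n(s)\, ds \Big)
           \exp\!\Big( \int_0^t L(s)\, ds \Big) .
```

The first bound uses only the boundedness of $W$, not its continuity, which is why a {0,1}-valued graphon does not by itself break the Lipschitz step. Whether the paper's assumptions match this outline is exactly what the referee asks the authors to make explicit.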

Circularity Check

0 steps flagged

No circularity: derivation relies on external graphon theory and dynamical systems

The paper defines Graphon-NDEs as the infinite-node limit of GNDEs, establishes well-posedness, and proves trajectory-wise convergence plus explicit rates using standard tools from graphon theory and dynamical systems under two sampling regimes. No self-definitional reductions appear, no fitted parameters are relabeled as predictions, and no load-bearing self-citations or uniqueness theorems imported from the authors' prior work are invoked to force the result. The central claims remain independent of the paper's own inputs and rest on externally verifiable mathematical machinery.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The claims rest on the introduction of Graphon-NDEs and background assumptions from graphon theory; no free parameters are mentioned.

axioms (2)

domain assumption: Well-posedness of Graphon-NDEs. Invoked to define the infinite-node limit and support convergence proofs.
standard math: Convergence results from graphon theory and dynamical systems. Leveraged to prove trajectory-wise convergence and derive rates.

invented entities (1)

Graphon-NDE (no independent evidence)
purpose: Serves as the infinite-node limit of GNDEs. Newly defined object that the convergence analysis targets.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction · reality_from_one_distinction · relation to the paper passage: unclear
Paper passage: "We prove the trajectory-wise convergence of GNDE solutions to Graphon-NDE solutions... derive explicit convergence rates under two deterministic graph sampling regimes... size transferability bounds"

IndisputableMonolith/Cost/FunctionalEquation · washburn_uniqueness_aczel · relation to the paper passage: unclear
Paper passage: "Theorem 3 (Well-posedness... AS0 and AS1... unique solution X in C^1([0,T];L^∞)"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.