Any-Dimensional Invariant Universality

Eitan Levin; Mateo Díaz; Shengtai Yao

arXiv: 2605.23156 · v1 · pith:PSUA4NIJnew · submitted 2026-05-22 · cs.LG · math.FA · math.RT · stat.ML

Pith reviewed 2026-05-25 04:51 UTC · model grok-4.3

classification cs.LG · math.FA · math.RT · stat.ML

keywords any-dimensional models · universality · limit space · invariant functions · variable size inputs · machine learning · graph neural networks · point clouds

The pith

Any-dimensional models are universal when viewed as functions on an infinite-dimensional limit space with a symmetry-induced topology.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework for proving universality of machine learning models that accept inputs of arbitrary size, such as graphs or point clouds. It identifies each any-dimensional function with a single function defined on a limit space that contains all finite-sized inputs together with their limits. By equipping this space with a topology derived from the input symmetries and the relations between inputs of different sizes, the authors establish universality on families of compact sets. The same framework shows that several existing architectures fail to be universal and suggests simple modifications that restore universality.

Core claim

We develop a systematic approach to establish any-dimensional universality, by identifying any-dimensional functions with a unique function taking inputs in a suitable infinite-dimensional limit space containing inputs of all finite sizes as well as their limits. Using the symmetries of these inputs and relations between inputs of different sizes, we show that this limit space admits a natural topology with rich families of compact sets on which any-dimensional universality can be established. We illustrate our approach by showing that several existing architectures fail to be universal, and we propose simple modifications that restore universality.
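As a minimal sketch of what "relations between inputs of different sizes" can mean, the snippet below assumes one particular relation that the text above does not specify: a point cloud of size n is identified with the size-kn cloud obtained by repeating every point k times. Under that assumption, mean pooling returns the same value on both inputs while sum pooling does not, so only the former is consistent across sizes in this sense. The names `duplicate`, `feature`, `mean_pool_model`, and `sum_pool_model` are hypothetical and are not taken from the paper.

```python
import math

def duplicate(points, k):
    """Assumed cross-size relation: repeat every point k times (size n -> k*n)."""
    return [p for p in points for _ in range(k)]

def feature(p):
    # Hypothetical per-point feature map; any fixed function of a point works here.
    return math.tanh(p)

def mean_pool_model(points):
    """Size-agnostic model: average the per-point features."""
    return sum(feature(p) for p in points) / len(points)

def sum_pool_model(points):
    """Size-agnostic model: add up the per-point features."""
    return sum(feature(p) for p in points)

cloud = [0.5, -1.0, 2.0]
bigger = duplicate(cloud, 4)  # 12 points, same empirical distribution

mean_gap = abs(mean_pool_model(cloud) - mean_pool_model(bigger))
sum_gap = abs(sum_pool_model(cloud) - sum_pool_model(bigger))
print(f"mean pooling gap: {mean_gap:.2e}")
print(f"sum pooling gap:  {sum_gap:.2e}")
```

The mean-pooling gap is at the level of floating-point rounding, whereas the sum-pooling gap grows with the duplication factor. Whether the paper uses this identification, or a different one, is not stated in the summary above.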

What carries the argument

The infinite-dimensional limit space that embeds all finite-sized inputs and their limits, equipped with a natural topology induced by symmetries and inter-size relations.

If this is right

Existing architectures for any-dimensional inputs can fail to be universal.
Simple modifications to those architectures can restore universality.
Universality holds on rich families of compact sets within the limit space.
Any-dimensional models can be analyzed uniformly through their corresponding functions on the limit space.
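One hedged way to write down the kind of statement these points describe is given below. The notation is assumed here for illustration and is not taken from the paper: $V_\infty$ denotes the limit space, $\mathcal{K}$ a family of compact subsets of $V_\infty$, and $\mathcal{F}$ the model class viewed as functions on $V_\infty$.

```latex
% Assumed notation: V_\infty is the limit space, \mathcal{K} a family of
% compact subsets of V_\infty, \mathcal{F} the model class on V_\infty.
\[
\forall\, K \in \mathcal{K},\quad
\forall\, f \colon K \to \mathbb{R} \text{ continuous},\quad
\forall\, \varepsilon > 0,\quad
\exists\, g \in \mathcal{F} \ \text{such that}\quad
\sup_{x \in K} \lvert f(x) - g(x) \rvert < \varepsilon .
\]
```

Because a single compact set $K$ may contain inputs of many different finite sizes, a bound of this form holds simultaneously for all of them, which is what distinguishes it from a fixed-size approximation theorem.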

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

This approach might extend to other domains with variable input sizes, such as sequences or trees.
Designing new any-dimensional architectures could prioritize compatibility with the limit space topology.
Practical implementations may need to approximate the limit space behavior for finite but large inputs.
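To make the last point concrete, here is a small sketch under assumptions that do not come from the paper: a size-n input is a set of n evenly spaced points in [0, 1], the model is a mean-pooled feature, and the "limit object" is the uniform distribution on [0, 1]. The names `midpoint_cloud`, `mean_pool_square`, and `LIMIT_VALUE` are hypothetical.

```python
def midpoint_cloud(n):
    """Hypothetical size-n input: n evenly spaced midpoints in [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]

def mean_pool_square(points):
    """Mean-pooled feature x -> x**2, defined for every input size."""
    return sum(p * p for p in points) / len(points)

# Integral of x**2 over [0, 1]: the value on the assumed limit object.
LIMIT_VALUE = 1.0 / 3.0

errors = {n: abs(mean_pool_square(midpoint_cloud(n)) - LIMIT_VALUE)
          for n in (10, 100, 1000)}
for n, err in errors.items():
    print(f"n={n:5d}  |f_n - limit| = {err:.3e}")
```

In this toy setting the gap between the finite-size output and the limit value shrinks as n grows, which is the kind of behavior a finite implementation would rely on.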

Load-bearing premise

The assumption that any-dimensional functions correspond uniquely to functions on an infinite-dimensional limit space whose topology is naturally induced by symmetries and size relations.

What would settle it

An explicit counterexample of an any-dimensional function that cannot be represented as a continuous function on the proposed limit space, or a compact set where universality does not hold for a modified architecture.

Original abstract

Several machine learning models are defined for inputs of any size, such as graphs with different numbers of nodes and point clouds containing varying numbers of points. The universality properties of such any-dimensional models remain poorly understood, as universality is traditionally studied for models accepting inputs of a fixed size, defined on a compact subset of their domain. In sharp contrast, any-dimensional models can be viewed as sequences of functions defined on growing-sized inputs, and it is not clear in which sense they can be universal. We develop a systematic approach to establish any-dimensional universality, by identifying any-dimensional functions with a unique function taking inputs in a suitable infinite-dimensional limit space containing inputs of all finite sizes as well as their limits. Using the symmetries of these inputs and relations between inputs of different sizes, we show that this limit space admits a natural topology with rich families of compact sets on which any-dimensional universality can be established. We illustrate our approach by showing that several existing architectures fail to be universal, and we propose simple modifications that restore universality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper maps any-dimensional models to functions on an infinite-dimensional limit space with a symmetry-induced topology to prove universality on compact sets.

The core contribution is a framework that treats any-dimensional functions (for graphs, point clouds, sets) as single functions on a limit space containing all finite input sizes plus their limits. Symmetries and cross-size relations then induce a topology, allowing universality statements on families of compact subsets. This is new relative to the usual fixed-dimension approximation theorems. The paper applies the method to exhibit non-universality in several existing architectures and gives simple modifications that restore it. That concrete illustration is useful and shows the framework is not purely formal. The central construction appears coherent: the identification step and the topology are presented as the main technical work rather than assumed. No obvious circularity or parameter-fitting is visible in the stated approach. The main soft spot is that the abstract supplies no derivations, so one cannot yet check whether the topology is sufficiently rich or whether the compact sets are large enough to be practically relevant; that will determine how far the results travel. Minor additional point: the paper should clarify how the limit-space functions relate back to implementable finite models without hidden constants. Overall this is aimed at theorists working on expressive power of permutation-equivariant or size-agnostic architectures. It is worth a serious referee because it directly addresses a recognized gap and supplies both a general method and counter-examples.

Referee Report

0 major / 2 minor

Summary. The paper develops a systematic framework for any-dimensional universality in machine learning models handling variable-sized inputs (e.g., graphs, point clouds). It identifies any-dimensional functions with a unique function on an infinite-dimensional limit space containing all finite-sized inputs and their limits, then uses input symmetries and cross-size relations to induce a natural topology on this space. Universality is established on rich families of compact subsets of the limit space. The approach is illustrated by demonstrating that several existing architectures are not universal and by proposing simple modifications that restore universality.

Significance. If the central construction holds, the framework supplies a unified, topology-based method for proving universality results that apply simultaneously across all input dimensions, addressing a clear gap relative to the fixed-dimension case. The explicit use of symmetry-induced topologies and compact sets in the limit space is a concrete technical contribution that could be reused for other architectures.

minor comments (2)

[Abstract] The abstract states that the limit space 'admits a natural topology' but does not name the specific topology or the compactness criterion used; a one-sentence pointer in the abstract would help readers locate the definition in §3 or §4.
When the paper states that 'several existing architectures fail to be universal,' the precise notion of failure (e.g., which compact sets are not approximated) should be cross-referenced to the corresponding theorem or corollary.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript, accurate summary of the central construction, and recommendation for minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected

The paper's core construction identifies any-dimensional functions with a single function on an infinite-dimensional limit space whose topology is induced by input symmetries and inter-size relations; this step is presented as a direct application of functional analysis to the problem setup rather than a reduction to fitted parameters, self-definitional equations, or load-bearing prior results by the same authors. No equations or claims in the abstract or described framework equate a derived universality statement to its own inputs by construction, and the approach is self-contained against external benchmarks in standard topology and approximation theory.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on the existence of the identification with a unique function on the limit space and the induced topology; these are domain assumptions introduced to enable the universality statements.

axioms (2)

domain assumption: Any-dimensional functions can be identified with a unique function taking inputs in a suitable infinite-dimensional limit space containing inputs of all finite sizes as well as their limits.
This identification is the foundational step stated in the abstract for viewing any-dimensional models as single functions.
domain assumption: The symmetries of these inputs and relations between inputs of different sizes induce a natural topology on the limit space with rich families of compact sets.
Invoked to establish the setting where universality can be proved.

invented entity (1)

Infinite-dimensional limit space (no independent evidence)
purpose: To serve as the domain containing all finite-sized inputs and their limits so that any-dimensional functions correspond to single functions on it.
Postulated as the key object enabling the topology and universality results.

pith-pipeline@v0.9.0 · 5708 in / 1580 out tokens · 31457 ms · 2026-05-25T04:51:51.285022+00:00 · methodology

A Missing details from Section 2

In this section, we present missing details from Section 2. We start with a more detailed definition of a consistent sequence.

Definition A.1 (Detailed version of Def. 2.1). A consistent sequence of group representations over direct...

(Groups) A sequence of groups $(G_n)$ indexed by $\mathbb{N}$ that embed into each other. Specifically, whenever $n \preceq N$ there is an injective group homomorphism $\theta_{N,n} \colon G_n \to G_N$, with $\theta_{i,i} = \mathrm{id}_{G_i}$ for all $i \in \mathbb{N}$ and $\theta_{k,j} \circ \theta_{j,i} = \theta_{k,i}$ whenever $i \preceq j \preceq k$ in $\mathbb{N}$.

(Vector spaces) A sequence of finite-dimensional real vector spaces $(V_n)$ indexed by $\mathbb{N}$ such that each $V_n$ is a $G_n$-representation.

(Embeddings) A collection of embeddings $(\varphi_{N,n} \colon V_n \hookrightarrow V_N)_{n \preceq N}$ such that $\varphi_{N,n}$ is $G_n$-equivariant, i.e., $\varphi_{N,n}(g \cdot v) = \theta_{N,n}(g) \cdot \varphi_{N,n}(v)$ for all $g \in G_n$, $v \in V_n$, and such that $\varphi_{i,i} = \mathrm{id}_{V_i}$ for all $i \in \mathbb{N}$ and $\varphi_{k,j} \circ \varphi_{j,i} = \varphi_{k,i}$ whenever $i \preceq j \preceq k$ in $\mathbb{N}$.

We can take direct sums of consistent sequences to obtain richer consistent sequences, as done in several of the examples of Section 4. Definit...
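As a concrete instance of the three conditions above, the following sketch checks them numerically for one standard example, chosen here for illustration and not taken from the excerpt: G_n is the symmetric group S_n acting on V_n = R^n by permuting coordinates, the order ⪯ is the usual ≤, θ_{N,n} extends a permutation by fixing the extra indices, and φ_{N,n} pads a vector with zeros. The function names `theta`, `phi`, and `act` are hypothetical.

```python
import itertools

def theta(perm, N):
    """theta_{N,n}: extend a permutation of range(n) to range(N) by fixing n..N-1."""
    return tuple(perm) + tuple(range(len(perm), N))

def phi(v, N):
    """phi_{N,n}: embed R^n into R^N by zero-padding."""
    return tuple(v) + (0.0,) * (N - len(v))

def act(perm, v):
    """Permutation action, defined by (g . v)[perm[i]] = v[i]."""
    out = [0.0] * len(v)
    for i, x in enumerate(v):
        out[perm[i]] = x
    return tuple(out)

n, N, M = 3, 5, 6
v = (1.0, -2.0, 0.5)

# Equivariance: phi(g . v) == theta(g) . phi(v) for every g in S_n.
equivariant = all(
    phi(act(g, v), N) == act(theta(g, N), phi(v, N))
    for g in itertools.permutations(range(n))
)

# Identity and composition rules for both families of maps.
g0 = (2, 0, 1)
identity_ok = theta(g0, n) == g0 and phi(v, n) == v
composition_ok = (theta(theta(g0, N), M) == theta(g0, M)
                  and phi(phi(v, N), M) == phi(v, M))

print(equivariant, identity_ok, composition_ok)
```

The check covers a single vector and three sizes, so it is a sanity test of the definitions rather than a proof. Other embeddings, such as duplicating coordinates, would need their own compatible choice of θ and index order.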

[1] [1]

Deep sets.Advances in neural information processing systems, 30, 2017

Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Russ R Salakhutdinov, and Alexander J Smola. Deep sets.Advances in neural information processing systems, 30, 2017

work page 2017

[2] [2]

A new model for learning in graph domains

Marco Gori, Gabriele Monfardini, and Franco Scarselli. A new model for learning in graph domains. InProceedings. 2005 IEEE international joint conference on neural networks, 2005., volume 2, pages 729–734. IEEE, 2005

work page 2005

[3] [3]

Pointnet: Deep learning on point sets for 3d classification and segmentation

Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 652–660, 2017. 13

work page 2017

[4] [4]

On the universality of invariant networks

Haggai Maron, Ethan Fetaya, Nimrod Segol, and Yaron Lipman. On the universality of invariant networks. InInternational conference on machine learning, pages 4363–4371. PMLR, 2019

work page 2019

[5] [5]

Universal invariant and equivariant graph neural networks

Nicolas Keriven and Gabriel Peyré. Universal invariant and equivariant graph neural networks. Advances in neural information processing systems, 32, 2019

work page 2019

[6] [6]

Universal approximations of invariant maps by neural networks.Constructive Approximation, 55(1):407–474, 2022

Dmitry Yarotsky. Universal approximations of invariant maps by neural networks.Constructive Approximation, 55(1):407–474, 2022

work page 2022

[7] [7]

On transferring transferability: Towards a theory for size generalization.arXiv preprint arXiv:2505.23599, 2025

Eitan Levin, Yuxin Ma, Mateo Díaz, and Soledad Villar. On transferring transferability: Towards a theory for size generalization.arXiv preprint arXiv:2505.23599, 2025

work page arXiv 2025

[8] [8]

Gradient-based learning applied to document recognition.Proceedings of the IEEE, 86(11):2278–2324, 2002

Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition.Proceedings of the IEEE, 86(11):2278–2324, 2002

work page 2002

[9] [9]

Enlarging smaller images before inputting into convolutional neural network: zero-padding vs

Mahdi Hashemi. Enlarging smaller images before inputting into convolutional neural network: zero-padding vs. interpolation.Journal of Big Data, 6(1):1–13, 2019

work page 2019

[10] [10]

Long short-term memory.Neural computation, 9(8):1735–1780, 1997

Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory.Neural computation, 9(8):1735–1780, 1997

work page 1997

[11] [11]

Parsing natural scenes and natural language with recursive neural networks

Richard Socher, Cliff C Lin, Chris Manning, and Andrew Y Ng. Parsing natural scenes and natural language with recursive neural networks. InProceedings of the 28th international conference on machine learning (ICML-11), pages 129–136, 2011

work page 2011

[12] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.

[13] Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature machine intelligence, 3(3):218–229, 2021.

[14] Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895, 2020.

[15] Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. The graph neural network model. IEEE transactions on neural networks, 20(1):61–80, 2008.

[16] Soledad Villar, David W Hogg, Kate Storey-Fisher, Weichi Yao, and Ben Blum-Smith. Scalars are universal: Equivariant machine learning, structured like classical physics. Advances in neural information processing systems, 34:28848–28863, 2021.

[17] Ben Blum-Smith, Ningyuan Huang, Marco Cuturi, and Soledad Villar. Functions on symmetric matrices and point clouds via lightweight invariant features from Galois theory. SIAM Journal on Applied Algebra and Geometry, 9(4):902–938, 2025.

[18] Eitan Levin and Venkat Chandrasekaran. Free descriptions of convex sets. arXiv preprint arXiv:2307.04230, 2023.

[19] Eitan Levin and Mateo Díaz. Any-dimensional equivariant neural networks. In International Conference on Artificial Intelligence and Statistics, pages 2773–2781. PMLR, 2024.

[20] Mateo Díaz, Dmitriy Drusvyatskiy, Jack Kendrick, and Rekha R Thomas. Invariant kernels: Rank stabilization and generalization across dimensions. arXiv preprint arXiv:2502.01886, 2025.

[21] Thomas Church and Benson Farb. Representation theory and homological stability. Advances in Mathematics, 245:250–314, 2013.

[22] Thomas Church, Jordan S Ellenberg, and Benson Farb. FI-modules and stability for representations of symmetric groups. Duke Mathematical Journal, 164(9), 2015.

[23] Luana Ruiz, Luiz Chamon, and Alejandro Ribeiro. Graphon neural networks and the transferability of graph neural networks. Advances in Neural Information Processing Systems, 33:1702–1712, 2020.

[24] Ron Levie, Wei Huang, Lorenzo Bucci, Michael Bronstein, and Gitta Kutyniok. Transferability of spectral graph convolutional neural networks. Journal of Machine Learning Research, 22(272):1–59, 2021.

[25] Nicolas Keriven, Alberto Bietti, and Samuel Vaiter. Convergence and stability of graph convolutional networks on large random graphs. Advances in Neural Information Processing Systems, 33:21512–21523, 2020.

[26] Luana Ruiz, Luiz FO Chamon, and Alejandro Ribeiro. Transferability properties of graph neural networks. IEEE Transactions on Signal Processing, 71:3474–3489, 2023.

[27] Sohir Maskey, Ron Levie, and Gitta Kutyniok. Transferability of graph neural networks: an extended graphon approach. Applied and Computational Harmonic Analysis, 63:48–83, 2023.

[28] Matthieu Cordonnier, Nicolas Keriven, Nicolas Tremblay, and Samuel Vaiter. Convergence of message passing graph neural networks with generic aggregation on random graphs. In GSP 2023-6th Graph Signal Processing workshop, pages 1–3, 2023.

[29] Chen Cai and Yusu Wang. Convergence of invariant graph networks. In International Conference on Machine Learning, pages 2457–2484. PMLR, 2022.

[30] George Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of control, signals and systems, 2(4):303–314, 1989.

[31] Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural networks, 2(5):359–366, 1989.

[32] Moshe Leshno, Vladimir Ya Lin, Allan Pinkus, and Shimon Schocken. Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural networks, 6(6):861–867, 1993.

[33] Kurt Hornik. Approximation capabilities of multilayer feedforward networks. Neural networks, 4(2):251–257, 1991.

[34] Teun DH van Nuland. Noncompact uniform universal approximation. Neural Networks, 173:106181, 2024.

[35] Ariel Neufeld and Philipp Schmocker. Universal approximation results for neural networks with non-polynomial activation function over non-compact domains. arXiv preprint arXiv:2410.14759, 2024.

[36] Ahmed Abdeljawad and Thomas Dittrich. Weighted Sobolev approximation rates for neural networks on unbounded domains. arXiv preprint arXiv:2411.04108, 2024.

[37] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826, 2018.

[38] Andrei Leman and Boris Weisfeiler. A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsiya, 2(9):12–16, 1968.

[39] Haggai Maron, Heli Ben-Hamu, Hadar Serviansky, and Yaron Lipman. Provably powerful graph networks. Advances in neural information processing systems, 32, 2019.

[40] Christopher Morris, Martin Ritzert, Matthias Fey, William L Hamilton, Jan Eric Lenssen, Gaurav Rattan, and Martin Grohe. Weisfeiler and Leman go neural: Higher-order graph neural networks. In Proceedings of the AAAI conference on artificial intelligence, volume 33, pages 4602–4609, 2019.

[41] Takanori Maehara and Hoang NT. A simple proof of the universality of invariant/equivariant graph neural networks. arXiv preprint arXiv:1910.03802, 2019.

[42] Hoang Nguyen and Takanori Maehara. Graph homomorphism convolution. In International Conference on Machine Learning, pages 7306–7316. PMLR, 2020.

[43] Akiyoshi Sannai, Yuuki Takai, and Matthieu Cordonnier. Universal approximations of permutation invariant/equivariant functions by deep neural networks. arXiv preprint arXiv:1903.01939, 2019.

[44] Christian Bueno and Alan Hylton. On the representation power of set pooling networks. Advances in Neural Information Processing Systems, 34:17170–17182, 2021.

[45] Aaron Zweig and Joan Bruna. A functional perspective on learning symmetric functions with neural networks. In International Conference on Machine Learning, pages 13023–13032. PMLR, 2021.

[46] Daniel Herbst and Stefanie Jegelka. Higher-order graphon neural networks: Approximation and cut distance. arXiv preprint arXiv:2503.14338, 2025.

[47] Walter Rudin. Functional Analysis. International Series in Pure and Applied Mathematics. McGraw-Hill, New York, 2nd edition, 1991.

[48] Joseph Diestel. Sequences and series in Banach spaces, volume 92. Springer Science & Business Media, 2012.

[49] Luigi Ambrosio, Nicola Gigli, and Giuseppe Savaré. Gradient flows: in metric spaces and in the space of probability measures. Springer, 2005.

[50] László Lovász. Large networks and graph limits. American Mathematical Soc., 2012.

[51] Marc Finzi, Max Welling, and Andrew Gordon Wilson. A practical method for constructing equivariant multilayer perceptrons for arbitrary matrix groups. In International conference on machine learning, pages 3318–3328. PMLR, 2021.

[52] Christian Borgs, Jennifer Chayes, Henry Cohn, and Yufei Zhao. An $L^p$ theory of sparse graph convergence I: Limits, sparse random graph models, and power law distributions. Transactions of the American Mathematical Society, 372(5):3019–3062, 2019.

[53] Christian Borgs, Jennifer T. Chayes, Henry Cohn, and Nina Holden. Sparse exchangeable graphs and their limits via graphon processes. Journal of Machine Learning Research, 18(210):1–71, 2018.

[54] Ágnes Backhausz and Balázs Szegedy. Action convergence of operators and graphs. Canadian Journal of Mathematics, 74(1):72–121, 2022.

[55] Stephen Willard. General topology. Courier Corporation, 2012.

[56] Walter Rudin. Principles of mathematical analysis. 3rd ed., 1976.

[57] Cédric Villani. Optimal transport: old and new, volume 338. Springer, 2008.

[58] Vladimir I Bogachev. Measure theory. Springer, 2007.

[59] C. Borgs, J.T. Chayes, L. Lovász, V.T. Sós, and K. Vesztergombi. Convergent sequences of dense graphs I: Subgraph frequencies, metric properties and testing. Advances in Mathematics, 219(6):1801–1851, 2008.

[60] Peter Diao, Dominique Guillot, Apoorva Khare, and Bala Rajaratnam. Differential calculus on graphon space. Journal of Combinatorial Theory, Series A, 133:183–227, 2015.

[61] Sashi Mohan Srivastava. A course on Borel sets. Springer, 1998.

[62] Stephen H Friedberg, Arnold J Insel, and Lawrence E Spence. Linear algebra. Prentice Hall, 1997.

A Missing details from Section 2

In this section, we present missing details from Section 2. We start with a more detailed definition of a consistent sequence.

Definition A.1 (Detailed version of Def. 2.1). A consistent sequence of group representations over direct...

(Groups) A sequence of groups $(G_n)$ indexed by $\mathbb{N}$ that embed into each other. Specifically, whenever $n \preceq N$ there is an injective group homomorphism $\theta_{N,n}\colon G_n \to G_N$, with $\theta_{i,i} = \mathrm{id}_{G_i}$ for all $i \in \mathbb{N}$ and $\theta_{k,j} \circ \theta_{j,i} = \theta_{k,i}$ whenever $i \preceq j \preceq k$ in $\mathbb{N}$.

(Vector spaces) A sequence of finite-dimensional real vector spaces $(V_n)$ indexed by $\mathbb{N}$ such that each $V_n$ is a $G_n$-representation.

(Embeddings) A collection of embeddings $(\varphi_{N,n}\colon V_n \hookrightarrow V_N)_{n \preceq N}$ such that $\varphi_{N,n}$ is $G_n$-equivariant, i.e., $\varphi_{N,n}(g \cdot v) = \theta_{N,n}(g) \cdot \varphi_{N,n}(v)$ for all $g \in G_n$, $v \in V_n$, and such that $\varphi_{i,i} = \mathrm{id}_{V_i}$ for all $i \in \mathbb{N}$ and $\varphi_{k,j} \circ \varphi_{j,i} = \varphi_{k,i}$ whenever $i \preceq j \preceq k$ in $\mathbb{N}$.

We can take direct sums of consistent sequences to obtain richer consistent sequences, as done in several of the examples of Section 4. Definit...