Any-Dimensional Invariant Universality
Pith reviewed 2026-05-25 04:51 UTC · model grok-4.3
The pith
Any-dimensional models are universal when viewed as functions on an infinite-dimensional limit space with a symmetry-induced topology.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We develop a systematic approach to establish any-dimensional universality, by identifying any-dimensional functions with a unique function taking inputs in a suitable infinite-dimensional limit space containing inputs of all finite sizes as well as their limits. Using the symmetries of these inputs and relations between inputs of different sizes, we show that this limit space admits a natural topology with rich families of compact sets on which any-dimensional universality can be established. We illustrate our approach by showing that several existing architectures fail to be universal, and we propose simple modifications that restore universality.
What carries the argument
The infinite-dimensional limit space that embeds all finite-sized inputs and their limits, equipped with a natural topology induced by symmetries and inter-size relations.
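As a minimal sketch of how relations between input sizes constrain an any-dimensional function, the example below assumes a duplication embedding, in which each entry of a length-n vector is repeated k times to give a length-nk vector. This embedding and the names `duplicate`, `mean_pool`, and `sum_pool` are illustrative assumptions, not taken from the paper. Under this assumption, a mean-pooled invariant takes the same value on an input and on its embedded copy, so it can plausibly be read as one function on a shared limit space, whereas a sum-pooled invariant cannot.

```python
import math

def duplicate(x, k):
    """Hypothetical inter-size embedding: repeat every entry k times (length n -> n*k)."""
    return [v for v in x for _ in range(k)]

def mean_pool(x, feature=math.tanh):
    """Permutation-invariant readout with mean aggregation."""
    return sum(feature(v) for v in x) / len(x)

def sum_pool(x, feature=math.tanh):
    """Permutation-invariant readout with sum aggregation."""
    return sum(feature(v) for v in x)

x = [0.5, -1.0, 2.0]
x_big = duplicate(x, 2)  # [0.5, 0.5, -1.0, -1.0, 2.0, 2.0]

# Gap between the value on x and on its embedded copy, for each readout.
mean_gap = abs(mean_pool(x) - mean_pool(x_big))
sum_gap = abs(sum_pool(x) - sum_pool(x_big))

# Both readouts are invariant to reordering the entries (the symmetry).
perm_gap = abs(mean_pool(x) - mean_pool(list(reversed(x))))

print(f"mean gap: {mean_gap:.2e}, sum gap: {sum_gap:.3f}, permutation gap: {perm_gap:.2e}")
```

The mean gap vanishes up to floating-point rounding while the sum gap does not, which illustrates why the choice of inter-size relation matters when asking whether a sequence of finite-size functions defines a single function on the limit space.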
If this is right
- Existing architectures for any-dimensional inputs can fail to be universal.
- Simple modifications to those architectures can restore universality.
- Universality holds on rich families of compact sets within the limit space.
- Any-dimensional models can be analyzed uniformly through their corresponding functions on the limit space.
Where Pith is reading between the lines
- This approach might extend to other domains with variable input sizes, such as sequences or trees.
- Designing new any-dimensional architectures could prioritize compatibility with the limit space topology.
- Practical implementations may need to approximate the limit space behavior for finite but large inputs.
Load-bearing premise
The assumption that any-dimensional functions correspond uniquely to functions on an infinite-dimensional limit space whose topology is naturally induced by symmetries and size relations.
What would settle it
An explicit counterexample of an any-dimensional function that cannot be represented as a continuous function on the proposed limit space, or a compact set where universality does not hold for a modified architecture.
Original abstract
Several machine learning models are defined for inputs of any size, such as graphs with different numbers of nodes and point clouds containing varying numbers of points. The universality properties of such any-dimensional models remain poorly understood, as universality is traditionally studied for models accepting inputs of a fixed size, defined on a compact subset of their domain. In sharp contrast, any-dimensional models can be viewed as sequences of functions defined on growing-sized inputs, and it is not clear in which sense they can be universal. We develop a systematic approach to establish any-dimensional universality, by identifying any-dimensional functions with a unique function taking inputs in a suitable infinite-dimensional limit space containing inputs of all finite sizes as well as their limits. Using the symmetries of these inputs and relations between inputs of different sizes, we show that this limit space admits a natural topology with rich families of compact sets on which any-dimensional universality can be established. We illustrate our approach by showing that several existing architectures fail to be universal, and we propose simple modifications that restore universality.
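For contrast, the fixed-size notion of universality that the abstract refers to is usually stated as below. This is the standard textbook formulation, and the symbols $\mathcal{F}$, $K$, $g$, and $f_n$ are notation introduced here for illustration rather than the paper's own.

```latex
% Fixed-size universality: a model class \mathcal{F} of functions
% \mathbb{R}^n \to \mathbb{R} is universal if
\forall K \subset \mathbb{R}^n \ \text{compact},\quad
\forall f \in C(K),\quad
\forall \varepsilon > 0,\quad
\exists g \in \mathcal{F}: \qquad
\sup_{x \in K} \lvert f(x) - g(x) \rvert < \varepsilon .

% Any-dimensional model: one function per input size,
(f_n)_{n}, \qquad f_n \colon V_n \to \mathbb{R} .
```

Because an any-dimensional model is a sequence $(f_n)$ rather than a single function on a single space, there is no obvious compact set $K$ over which to take the supremum; the limit-space construction described in the abstract is what supplies one.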
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper develops a systematic framework for any-dimensional universality in machine learning models handling variable-sized inputs (e.g., graphs, point clouds). It identifies any-dimensional functions with a unique function on an infinite-dimensional limit space containing all finite-sized inputs and their limits, then uses input symmetries and cross-size relations to induce a natural topology on this space. Universality is established on rich families of compact subsets of the limit space. The approach is illustrated by demonstrating that several existing architectures are not universal and by proposing simple modifications that restore universality.
Significance. If the central construction holds, the framework supplies a unified, topology-based method for proving universality results that apply simultaneously across all input dimensions, addressing a clear gap relative to the fixed-dimension case. The explicit use of symmetry-induced topologies and compact sets in the limit space is a concrete technical contribution that could be reused for other architectures.
minor comments (2)
- [Abstract] The abstract states that the limit space 'admits a natural topology' but does not name the specific topology or the compactness criterion used; a one-sentence pointer in the abstract would help readers locate the definition in §3 or §4.
- When the paper states that 'several existing architectures fail to be universal,' the precise notion of failure (e.g., which compact sets are not approximated) should be cross-referenced to the corresponding theorem or corollary.
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the manuscript, accurate summary of the central construction, and recommendation for minor revision. No specific major comments were raised in the report.
Circularity Check
No significant circularity detected
The paper's core construction identifies any-dimensional functions with a single function on an infinite-dimensional limit space whose topology is induced by input symmetries and inter-size relations; this step is presented as a direct application of functional analysis to the problem setup rather than a reduction to fitted parameters, self-definitional equations, or load-bearing prior results by the same authors. No equations or claims in the abstract or described framework equate a derived universality statement to its own inputs by construction, and the approach is self-contained against external benchmarks in standard topology and approximation theory.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Any-dimensional functions can be identified with a unique function taking inputs in a suitable infinite-dimensional limit space containing inputs of all finite sizes as well as their limits.
- domain assumption: The symmetries of these inputs and relations between inputs of different sizes induce a natural topology on the limit space with rich families of compact sets.
invented entities (1)
- Infinite-dimensional limit space: no independent evidence
A. Missing details from Section 2
In this section, we present missing details from Section 2. We start with a more detailed definition of a consistent sequence.
Definition A.1 (Detailed version of Def. 2.1). A consistent sequence of group representations over direct...
- (Groups) A sequence of groups $(G_n)$ indexed by $\mathbb{N}$ that embed into each other. Specifically, whenever $n \preceq N$ there is an injective group homomorphism $\theta_{N,n}\colon G_n \to G_N$, with $\theta_{i,i} = \mathrm{id}_{G_i}$ for all $i \in \mathbb{N}$ and $\theta_{k,j} \circ \theta_{j,i} = \theta_{k,i}$ whenever $i \preceq j \preceq k$ in $\mathbb{N}$.
- (Vector spaces) A sequence of finite-dimensional, real vector spaces $(V_n)$ indexed by $\mathbb{N}$ such that each $V_n$ is a $G_n$-representation.
- (Embeddings) A collection of embeddings $(\varphi_{N,n}\colon V_n \hookrightarrow V_N)_{n \preceq N}$ such that $\varphi_{N,n}$ is $G_n$-equivariant, i.e., $\varphi_{N,n}(g \cdot v) = \theta_{N,n}(g) \cdot \varphi_{N,n}(v)$ for all $g \in G_n$, $v \in V_n$, and such that $\varphi_{i,i} = \mathrm{id}_{V_i}$ for all $i \in \mathbb{N}$ and $\varphi_{k,j} \circ \varphi_{j,i} = \varphi_{k,i}$ whenever $i \preceq j \preceq k$ in $\mathbb{N}$.
We can take direct sums of consistent sequences to obtain richer consistent sequences, as done in several of the examples of Section 4. Definit...
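The three conditions in the definition above can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not the paper's construction: it takes the symmetric group on n letters acting on length-n real vectors by permuting coordinates, orders sizes by the usual ≤, embeds groups by fixing the new indices, and embeds vectors by zero-padding. The helper names `act`, `theta`, `phi`, and `check_consistency` are hypothetical.

```python
from itertools import permutations

def act(perm, v):
    """Permutation action on R^n: entry i of v moves to position perm[i]."""
    out = [0.0] * len(v)
    for i, p in enumerate(perm):
        out[p] = v[i]
    return out

def theta(perm, N):
    """Assumed group embedding S_n -> S_N: extend perm by fixing indices n..N-1."""
    return tuple(perm) + tuple(range(len(perm), N))

def phi(v, N):
    """Assumed vector embedding R^n -> R^N: zero-padding."""
    return list(v) + [0.0] * (N - len(v))

def check_consistency(n, N, K, v):
    """Check the identity, composition, and equivariance conditions for n <= N <= K."""
    group = list(permutations(range(n)))
    # Equivariance: phi(g . v) == theta(g) . phi(v) for every g in S_n.
    equivariant = all(
        phi(act(g, v), N) == act(theta(g, N), phi(v, N)) for g in group
    )
    # Identity: embedding a size into itself changes nothing.
    identity = phi(v, n) == list(v) and all(theta(g, n) == g for g in group)
    # Composition: going n -> N -> K agrees with going n -> K directly.
    compose_phi = phi(phi(v, N), K) == phi(v, K)
    compose_theta = all(theta(theta(g, N), K) == theta(g, K) for g in group)
    return equivariant and identity and compose_phi and compose_theta

ok = check_consistency(3, 5, 7, [1.0, -2.0, 3.5])
print("consistent sequence conditions hold:", ok)
```

Swapping `phi` for a different embedding (for example the duplication map, with a matching choice of `theta` and a divisibility order on sizes) yields a different consistent sequence, so the same checks would need to be rerun for that choice.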