Density of Neural Network Classes on Compact Subsets of Topological Vector Spaces
Pith reviewed 2026-05-22 01:46 UTC · model grok-4.3
The pith
Neural network classes built from dual functionals are dense in the continuous functions on compact subsets of any topological vector space whose continuous dual separates points.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The class Σ_X(Ψ) of all finite sums ∑ ω_j Ψ(f_j(x) + b_j), where f_j belongs to the continuous dual X*, is dense in C(K) with respect to the uniform norm whenever K is compact in X and X* separates points on X. As a direct consequence the same class is dense in L^p(K, μ) for every Radon probability measure μ and every 1 ≤ p < ∞.
What carries the argument
The class Σ_X(Ψ), consisting of finite sums of the squashing function Ψ applied to affine functionals built from elements of the continuous dual X*.
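To make the shape of an element of Σ_X(Ψ) concrete, here is a minimal sketch under two assumptions that are not in the paper: X is the finite-dimensional space ℝ^d, so each functional f_j is a dot product with a vector a_j, and Ψ is the logistic sigmoid. The names `sigma_element`, `A`, `b`, and `omega` are illustrative only.

```python
import numpy as np

def logistic(t):
    """Logistic sigmoid: one common continuous squashing function (an assumption here)."""
    return 1.0 / (1.0 + np.exp(-t))

def sigma_element(x, A, b, omega, psi=logistic):
    """Evaluate sum_j omega[j] * psi(<A[j], x> + b[j]).

    x     : point(s) in R^d, shape (d,) or (m, d)
    A     : shape (N, d); row j represents the linear functional f_j
    b     : shape (N,), the biases b_j
    omega : shape (N,), the outer weights omega_j
    """
    x = np.atleast_2d(x)
    pre_activation = x @ A.T + b        # shape (m, N): f_j(x) + b_j
    return psi(pre_activation) @ omega  # shape (m,)

# Two units on R^2, evaluated at the origin.
A = np.array([[1.0, 0.0], [0.0, -1.0]])
b = np.array([0.0, 0.0])
omega = np.array([2.0, -1.0])
value = sigma_element(np.zeros(2), A, b, omega)
```

In an infinite-dimensional X the rows of `A` would be replaced by arbitrary elements of X*; the finite sum, biases, and outer weights stay the same.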
If this is right
- Uniform approximation holds for every compact K in any topological vector space whose dual separates points.
- The approximation property transfers to L^p spaces with respect to every Radon probability measure on K.
- The result applies whenever the squashing function Ψ is continuous.
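The transfer from C(K) to L^p(K, μ) can be sketched with one norm comparison and one triangle inequality. This is the standard route, assuming the usual fact that C(K) is dense in L^p(K, μ) for a Radon measure on a compact Hausdorff space; the paper's own proof may be organized differently.

```latex
% Assumes \mu(K) = 1 (Radon probability measure) and g, h \in C(K).
\[
\|g-h\|_{L^p(K,\mu)}
  = \left(\int_K |g-h|^p \, d\mu\right)^{1/p}
  \le \|g-h\|_\infty \, \mu(K)^{1/p}
  = \|g-h\|_\infty .
\]
% Given F \in L^p(K,\mu) and \varepsilon > 0: pick g \in C(K) with
% \|F-g\|_{L^p(K,\mu)} < \varepsilon/2, then h \in \Sigma_X(\Psi) with
% \|g-h\|_\infty < \varepsilon/2.
\[
\|F-h\|_{L^p(K,\mu)}
  \le \|F-g\|_{L^p(K,\mu)} + \|g-h\|_{L^p(K,\mu)}
  < \varepsilon .
\]
```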
Where Pith is reading between the lines
- The construction may be specialized to common infinite-dimensional spaces such as Banach or Fréchet spaces used in functional analysis.
- Finite-dimensional truncations of the dual functionals could be tested numerically to observe convergence rates toward the infinite-dimensional case.
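A numerical probe of the kind suggested above might look like the following sketch. Everything in it is a hypothetical setup rather than anything from the paper: K is a d-coordinate truncation of the Hilbert cube {x : |x_k| ≤ 1/k} in ℓ², the target is g(x) = cos(∑ x_k / k), the inner functionals and biases are drawn at random, and only the outer weights are fitted by least squares. The held-out maximum error is an empirical proxy for the uniform norm, not a bound on it.

```python
import numpy as np

def logistic(t):
    """Logistic sigmoid, used here as the squashing function (an assumption)."""
    return 1.0 / (1.0 + np.exp(-t))

def sample_K(m, d, rng):
    """Draw m points from the truncated Hilbert cube {x : |x_k| <= 1/k, k = 1..d}."""
    scale = 1.0 / np.arange(1, d + 1)
    return rng.uniform(-1.0, 1.0, size=(m, d)) * scale

def target(x):
    """Hypothetical continuous test function g(x) = cos(sum_k x_k / k)."""
    k = np.arange(1, x.shape[1] + 1)
    return np.cos(x @ (1.0 / k))

def run_experiment(d=8, widths=(1, 4, 16, 64), m_train=400, m_test=400, seed=0):
    """Fit outer weights for nested random-feature networks of increasing width."""
    rng = np.random.default_rng(seed)
    x_tr, x_te = sample_K(m_train, d, rng), sample_K(m_test, d, rng)
    y_tr, y_te = target(x_tr), target(x_te)
    n_max = max(widths)
    A = rng.normal(scale=2.0, size=(n_max, d))  # random functionals f_j
    b = rng.uniform(-2.0, 2.0, size=n_max)      # random biases b_j
    phi_tr = logistic(x_tr @ A.T + b)
    phi_te = logistic(x_te @ A.T + b)
    results = []
    for n in widths:
        # Width-n network uses the first n features, so the models are nested.
        omega, *_ = np.linalg.lstsq(phi_tr[:, :n], y_tr, rcond=None)
        train_rms = np.sqrt(np.mean((phi_tr[:, :n] @ omega - y_tr) ** 2))
        test_max = np.max(np.abs(phi_te[:, :n] @ omega - y_te))
        results.append((n, train_rms, test_max))
    return results

results = run_experiment()
for n, train_rms, test_max in results:
    print(f"N={n:3d}  train RMS={train_rms:.3e}  held-out max error={test_max:.3e}")
```

Because the feature sets are nested, the training residual cannot increase with the width; how fast the held-out error falls, and how it depends on the truncation dimension `d`, is exactly what such an experiment would have to measure.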
Load-bearing premise
The continuous dual of the topological vector space separates points.
What would settle it
A concrete continuous function on some compact K, in a space whose dual separates points, that no element of Σ_X(Ψ) approximates uniformly to within some fixed ε > 0.
Original abstract
We prove density results for neural-network classes on compact sets \(K\subset X\), where \(X\) is a topological vector space whose continuous dual \(X^*\) separates points. Let \(\Psi:\mathbb R\to\mathbb R\) be a continuous squashing function. We show that the class \[ \Sigma_X(\Psi) = \left\{ \sum_{j=1}^{N}\omega_j\Psi(f_j(x)+b_j): N\in\mathbb N,\ \omega_j,b_j\in\mathbb R,\ f_j\in X^* \right\} \] is dense in \(C(K)\) with respect to the uniform norm. As a consequence, if \(\mu\) is a Radon probability measure supported on \(K\), then \(\Sigma_X(\Psi)\) is dense in \(L^p(K,\mu)\) for every \(1\le p<\infty\).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proves density results for neural-network classes on compact sets K ⊂ X, where X is a topological vector space whose continuous dual X* separates points. Let Ψ: ℝ → ℝ be a continuous squashing function. The manuscript shows that the class Σ_X(Ψ) = { ∑_{j=1}^N ω_j Ψ(f_j(x) + b_j) : N ∈ ℕ, ω_j, b_j ∈ ℝ, f_j ∈ X* } is dense in C(K) with respect to the uniform norm. As a consequence, if μ is a Radon probability measure supported on K, then Σ_X(Ψ) is dense in L^p(K, μ) for every 1 ≤ p < ∞.
Significance. If the central derivation holds, the result generalizes classical neural-network approximation theorems from finite-dimensional Euclidean spaces to general topological vector spaces under the separation assumption on X*. This is of interest in functional analysis and infinite-dimensional approximation theory. The direct existence proof with no fitted parameters or self-referential definitions is a strength, as is the clean reduction to the L^p case via standard measure-theoretic arguments.
major comments (1)
- §3, proof of Theorem 3.2: the argument that the algebra generated by {f|_K : f ∈ X*} is dense in C(K) via Stone-Weierstrass requires an explicit verification that this algebra is closed under pointwise multiplication and separates points on K. Separation of points follows from the hypothesis on X*, but the multiplication-closure step is only sketched and should be written out to confirm that it does not rely on additional structure of X.
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for the constructive comment, which helps improve the clarity of the proof. We address the point below and have incorporated the suggested expansion into the revised version.
Point-by-point responses
- Referee: §3, proof of Theorem 3.2: the argument that the algebra generated by {f|_K : f ∈ X*} is dense in C(K) via Stone-Weierstrass requires an explicit verification that this algebra is closed under pointwise multiplication and separates points on K. Separation of points follows from the hypothesis on X*, but the multiplication-closure step is only sketched and should be written out to confirm that it does not rely on additional structure of X.
  Authors: We agree with the referee that an explicit verification strengthens the argument. In the revised manuscript, we have expanded the relevant paragraph in the proof of Theorem 3.2 to state explicitly that the set A of all finite sums of finite products of elements from {f|_K : f ∈ X*} forms a subalgebra of C(K): it is closed under pointwise addition and scalar multiplication by definition, and closed under pointwise multiplication because the product of two such polynomials is again a finite sum of products of the generating functionals. The algebra contains the constant functions once the constant function 1 is included among the generators (the zero functional by itself yields only the zero function), and it separates points on K because X* separates points on X and hence on K. This verification uses only the algebraic operations on the restrictions and the definition of the algebra generated by a set; it does not invoke any further topological or linear structure on X beyond the given hypotheses.
  Revision: yes
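The multiplication-closure step discussed in this exchange can be written out in one line. The indexing below is illustrative notation, not the manuscript's: p and q are two elements of the algebra, each a finite sum of finite products of restricted functionals.

```latex
% p, q: finite sums of finite products of restrictions f|_K with f \in X^*.
\[
p = \sum_{\alpha} c_\alpha \prod_{i=1}^{m_\alpha} f_{\alpha,i}\big|_K ,
\qquad
q = \sum_{\beta} d_\beta \prod_{l=1}^{n_\beta} g_{\beta,l}\big|_K ,
\qquad f_{\alpha,i},\, g_{\beta,l} \in X^* .
\]
% Distributing the product gives again a finite sum of finite products.
\[
p\,q = \sum_{\alpha,\beta} c_\alpha d_\beta
  \left(\prod_{i=1}^{m_\alpha} f_{\alpha,i}\big|_K\right)
  \left(\prod_{l=1}^{n_\beta} g_{\beta,l}\big|_K\right).
\]
```

Only pointwise multiplication in C(K) and distributivity are used, which is why no further structure on X is needed for this step.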
Circularity Check
No circularity in direct density proof
full rationale
The paper establishes density of Σ_X(Ψ) in C(K) via a direct existence argument under the explicit hypothesis that X* separates points on X. This separation ensures the relevant topologies coincide on compact K, enabling standard approximation via the algebra generated by the functionals or explicit constructions with the continuous squashing map Ψ. No step reduces by construction to a fitted parameter, self-definition, or load-bearing self-citation; the logical chain is self-contained against external benchmarks such as Stone-Weierstrass and basic topological vector space facts.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The continuous dual X* separates points on X
- domain assumption Ψ is a continuous squashing function
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
  unclear: relation between the paper passage and the cited Recognition theorem.
  We prove density results for neural-network classes on compact sets K ⊂ X, where X is a topological vector space whose continuous dual X* separates points. ... Σ_X(Ψ) is dense in C(K) with respect to the uniform norm.
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  unclear: relation between the paper passage and the cited Recognition theorem.
  the algebra generated by the restrictions of elements of X* to K ... By Theorem 2.4, A(X*) is dense in C(K)
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.