arxiv: 2605.22482 · v1 · pith:ZYTYDO65new · submitted 2026-05-21 · 🧮 math.FA

Density of Neural Network Classes on Compact Subsets of Topological Vector Spaces

Pith reviewed 2026-05-22 01:46 UTC · model grok-4.3

classification 🧮 math.FA
keywords neural networks · density · universal approximation · topological vector spaces · continuous functions · compact sets · squashing functions · Radon measures

The pith

Neural network classes using dual functionals are dense in continuous functions on compact sets of topological vector spaces.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that neural network classes formed from a continuous squashing function applied to affine combinations drawn from the continuous dual are dense in the continuous functions on any compact subset. This extends classical approximation results to general topological vector spaces rather than restricting to Euclidean domains. The separation property of the dual ensures that the linear parts can distinguish points sufficiently to achieve uniform approximation. If correct, the same class becomes dense in the corresponding L^p spaces for any Radon probability measure supported on the compact set.

Core claim

The class Σ_X(Ψ) of all finite sums ∑ ω_j Ψ(f_j(x) + b_j), where f_j belongs to the continuous dual X*, is dense in C(K) with respect to the uniform norm whenever K is compact in X and X* separates points on X. As a direct consequence the same class is dense in L^p(K, μ) for every Radon probability measure μ and every 1 ≤ p < ∞.
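The step from uniform density to L^p density can be written out as follows. This is a sketch of the standard route, assuming μ(K) = 1 and the usual fact that C(K) is dense in L^p(K, μ) for a Radon measure on a compact Hausdorff space; the paper's own argument may be organized differently. The symbols g, h, σ and ε are introduced here for illustration only.

```latex
% Assumption: \mu is a Radon probability measure on K, so \mu(K) = 1.
% Step 1: the uniform norm dominates the L^p norm.
\[
\|g-\sigma\|_{L^p(K,\mu)}
  = \Bigl(\int_K |g-\sigma|^p \, d\mu\Bigr)^{1/p}
  \le \|g-\sigma\|_{\infty}\,\mu(K)^{1/p}
  = \|g-\sigma\|_{\infty},
  \qquad g\in C(K),\ \sigma\in\Sigma_X(\Psi).
\]
% Step 2: given h \in L^p(K,\mu) and \varepsilon > 0, pick g \in C(K) with
% \|h-g\|_{L^p(K,\mu)} < \varepsilon/2, then \sigma with \|g-\sigma\|_\infty < \varepsilon/2.
\[
\|h-\sigma\|_{L^p(K,\mu)}
  \le \|h-g\|_{L^p(K,\mu)} + \|g-\sigma\|_{\infty}
  < \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon .
\]
```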

What carries the argument

The class Σ_X(Ψ), consisting of finite sums of the squashing function Ψ applied to affine functionals built from elements of the continuous dual X*.
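To make the shape of an element of Σ_X(Ψ) concrete, here is a minimal Python sketch under simplifying assumptions that are not in the paper: X is taken to be ℝ^d, so every continuous linear functional is a dot product x ↦ ⟨a, x⟩, and Ψ is the logistic sigmoid. The names `logistic`, `dual_functional` and `sigma_element` are hypothetical helpers chosen for this illustration.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def logistic(t: float) -> float:
    """Logistic sigmoid, one common continuous squashing function (an assumption here)."""
    # Branch on the sign so that math.exp never overflows.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def dual_functional(a: Vector) -> Callable[[Vector], float]:
    """In X = R^d (assumed), a continuous linear functional is x -> <a, x>."""
    return lambda x: sum(ai * xi for ai, xi in zip(a, x))


def sigma_element(omegas, functionals, biases, psi=logistic):
    """Return the map x -> sum_j omega_j * psi(f_j(x) + b_j)."""
    def g(x):
        return sum(w * psi(f(x) + b) for w, f, b in zip(omegas, functionals, biases))
    return g


# Two-term example: 2 * psi(x_1) - 1 * psi(x_2).
g = sigma_element(
    omegas=[2.0, -1.0],
    functionals=[dual_functional([1.0, 0.0]), dual_functional([0.0, 1.0])],
    biases=[0.0, 0.0],
)
print(g([0.0, 0.0]))  # 2 * 0.5 - 1 * 0.5 = 0.5
```

In a general topological vector space the only change is what `dual_functional` returns: any element of X* can be plugged in, since the outer sum and the scalar activation do not depend on the structure of X.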

If this is right

  • Uniform approximation holds for every compact K in any topological vector space whose dual separates points.
  • The approximation property transfers to L^p spaces with respect to every Radon probability measure on K.
  • The result applies whenever the squashing function Ψ is continuous.
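The simplest instance of the uniform approximation claim is X = ℝ with K = [0, 1] and the single functional f(x) = x. The sketch below uses the classical staircase construction with steep logistic sigmoids; this is an illustrative technique chosen here, not the paper's proof, and the names `staircase_network`, `sup_error` and the target `h` are hypothetical. The constant term is written as a multiple of Ψ(0), which corresponds to using the zero functional.

```python
import math


def logistic(t):
    """Logistic sigmoid (assumed squashing function), written to avoid overflow."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def staircase_network(h, n, steepness=50.0):
    """Element of Sigma for X = R: a constant plus n steep sigmoids.

    The j-th sigmoid switches on near (j - 0.5) / n and adds the increment
    h(j / n) - h((j - 1) / n), so the sum tracks h on the grid {j / n}.
    """
    jumps = [h(j / n) - h((j - 1) / n) for j in range(1, n + 1)]
    centers = [(j - 0.5) / n for j in range(1, n + 1)]
    scale = steepness * n
    # Constant term h(0) expressed as omega * psi(0 * x + 0) with psi(0) = 0.5.
    omega0 = 2.0 * h(0.0)

    def g(x):
        steps = sum(w * logistic(scale * (x - c)) for w, c in zip(jumps, centers))
        return omega0 * logistic(0.0) + steps

    return g


def sup_error(h, g, grid_size=2001):
    """Maximum of |h - g| over an evenly spaced grid on [0, 1]."""
    points = (i / (grid_size - 1) for i in range(grid_size))
    return max(abs(h(x) - g(x)) for x in points)


h = lambda x: math.sin(2.0 * math.pi * x)
errors = {n: sup_error(h, staircase_network(h, n)) for n in (8, 64)}
for n, err in errors.items():
    print(n, err)
```

For a Lipschitz target the grid error of this construction shrinks roughly like 1/n, which is the qualitative behaviour the density statement predicts; it says nothing about rates in the general setting.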

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The construction may be specialized to common infinite-dimensional spaces such as Banach or Fréchet spaces used in functional analysis.
  • Finite-dimensional truncations of the dual functionals could be tested numerically to observe convergence rates toward the infinite-dimensional case.
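As a toy version of the truncation experiment suggested above, the following sketch assumes X = ℓ² (a choice made here, not taken from the paper) and a functional f(x) = ∑ a_i x_i with a_i = x_i = 1/i, for which the exact value π²/6 is known. The helper `truncated_functional` is hypothetical; it keeps only the first d coordinates.

```python
import math


def truncated_functional(a, d):
    """Return f_d(x) = sum_{i=1}^{d} a(i) * x(i), the d-term truncation of f."""
    def f_d(x):
        return sum(a(i) * x(i) for i in range(1, d + 1))
    return f_d


# Assumed example: a_i = 1/i and x_i = 1/i, both square-summable sequences.
a = lambda i: 1.0 / i
x = lambda i: 1.0 / i
exact = math.pi ** 2 / 6  # sum of 1/i^2 over all i >= 1

errors = {d: abs(exact - truncated_functional(a, d)(x)) for d in (10, 100, 1000)}
for d, err in errors.items():
    print(d, err)
```

Here the truncation error is the tail ∑_{i>d} 1/i², which is below 1/d. Since Ψ is applied after the functional, a small error in f_j(x) translates into a small error in Ψ(f_j(x) + b_j) whenever Ψ is uniformly continuous on the relevant range.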

Load-bearing premise

The continuous dual of the topological vector space separates points.

What would settle it

A concrete continuous function on some compact K in a space whose dual separates points that cannot be approximated uniformly to within any prescribed epsilon by any element of Σ_X(Ψ).

Original abstract

We prove density results for neural-network classes on compact sets \(K\subset X\), where \(X\) is a topological vector space whose continuous dual \(X^*\) separates points. Let \(\Psi:\mathbb R\to\mathbb R\) be a continuous squashing function. We show that the class \[ \Sigma_X(\Psi) = \left\{ \sum_{j=1}^{N}\omega_j\Psi(f_j(x)+b_j): N\in\mathbb N,\ \omega_j,b_j\in\mathbb R,\ f_j\in X^* \right\} \] is dense in \(C(K)\) with respect to the uniform norm. As a consequence, if \(\mu\) is a Radon probability measure supported on \(K\), then \(\Sigma_X(\Psi)\) is dense in \(L^p(K,\mu)\) for every \(1\le p<\infty\).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proves density results for neural-network classes on compact sets K ⊂ X, where X is a topological vector space whose continuous dual X* separates points. For a continuous squashing function Ψ: ℝ → ℝ, it shows that the class Σ_X(Ψ) = { ∑_{j=1}^N ω_j Ψ(f_j(x) + b_j) : N ∈ ℕ, ω_j, b_j ∈ ℝ, f_j ∈ X* } is dense in C(K) with respect to the uniform norm. As a consequence, if μ is a Radon probability measure supported on K, then Σ_X(Ψ) is dense in L^p(K, μ) for every 1 ≤ p < ∞.

Significance. If the central derivation holds, the result generalizes classical neural-network approximation theorems from finite-dimensional Euclidean spaces to general topological vector spaces under the separation assumption on X*. This is of interest in functional analysis and infinite-dimensional approximation theory. The direct existence proof with no fitted parameters or self-referential definitions is a strength, as is the clean reduction to the L^p case via standard measure-theoretic arguments.

major comments (1)
  1. §3, proof of Theorem 3.2: the argument that the algebra generated by {f|_K : f ∈ X*} is dense in C(K) via Stone-Weierstrass requires an explicit verification that this algebra is closed under pointwise multiplication and separates points on K; while the separation of points follows from the hypothesis on X*, the multiplication closure step is only sketched and should be written out to confirm it does not rely on additional structure of X.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript and for the constructive comment, which helps improve the clarity of the proof. We address the point below and have incorporated the suggested expansion into the revised version.

Point-by-point responses
  1. Referee: §3, proof of Theorem 3.2: the argument that the algebra generated by {f|_K : f ∈ X*} is dense in C(K) via Stone-Weierstrass requires an explicit verification that this algebra is closed under pointwise multiplication and separates points on K; while the separation of points follows from the hypothesis on X*, the multiplication closure step is only sketched and should be written out to confirm it does not rely on additional structure of X.

    Authors: We agree with the referee that an explicit verification strengthens the argument. In the revised manuscript, we have expanded the relevant paragraph in the proof of Theorem 3.2 to state explicitly that the set A of all finite sums of finite products of elements from {f|_K : f ∈ X*} forms a subalgebra of C(K): it is closed under pointwise addition and scalar multiplication by definition, and closed under pointwise multiplication because the product of two such polynomials is again a finite sum of products of the generating functionals. The algebra contains the constant functions once the empty product, that is, the constant function 1, is counted among the finite products (the zero functional alone yields only the zero function), and it separates points on K because X* separates points on X and hence on K. This verification uses only the algebraic operations on the restrictions and the definition of the algebra generated by a set; it does not invoke any further topological or linear structure on X beyond the given hypotheses.

    revision: yes
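The multiplication-closure step discussed in this exchange can be written as a single identity. The notation below (coefficients c_α, d_β and functionals f_{α,i}, g_{β,k} in X*) is introduced here for illustration and is not taken from the manuscript.

```latex
% Two elements of the algebra A generated by the restrictions f|_K, f \in X^*:
\[
p = \sum_{\alpha} c_\alpha \prod_{i=1}^{m_\alpha} f_{\alpha,i}\big|_K ,
\qquad
q = \sum_{\beta} d_\beta \prod_{k=1}^{n_\beta} g_{\beta,k}\big|_K .
\]
% Distributing the product gives again a finite sum of finite products of generators:
\[
p\,q = \sum_{\alpha,\beta} c_\alpha d_\beta
       \Bigl(\prod_{i=1}^{m_\alpha} f_{\alpha,i}\big|_K\Bigr)
       \Bigl(\prod_{k=1}^{n_\beta} g_{\beta,k}\big|_K\Bigr) \in A .
\]
```

Only the ring structure of real-valued functions on K is used, which is consistent with the claim that no additional structure of X is needed for this step.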

Circularity Check

0 steps flagged

No circularity in direct density proof

full rationale

The paper establishes density of Σ_X(Ψ) in C(K) via a direct existence argument under the explicit hypothesis that X* separates points on X. This separation ensures the relevant topologies coincide on compact K, enabling standard approximation via the algebra generated by the functionals or explicit constructions with the continuous squashing map Ψ. No step reduces by construction to a fitted parameter, self-definition, or load-bearing self-citation; the logical chain is self-contained against external benchmarks such as Stone-Weierstrass and basic topological vector space facts.
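The remark that the relevant topologies coincide on compact K can be justified by a standard compactness argument, sketched below. Here τ denotes the original topology of X and τ_w the weak topology induced by X*; both symbols are introduced for this sketch, and the paper may phrase the step differently.

```latex
% The weak topology \tau_w is coarser than \tau, so the identity map is continuous:
\[
\mathrm{id}\colon (K,\tau|_K)\longrightarrow (K,\tau_w|_K).
\]
% (K, \tau|_K) is compact, and (K, \tau_w|_K) is Hausdorff because X^* separates points.
% A continuous bijection from a compact space onto a Hausdorff space is a homeomorphism:
\[
\tau|_K = \tau_w|_K
\quad\Longrightarrow\quad
C(K,\tau|_K) = C(K,\tau_w|_K).
\]
```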

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The result rests on two standard domain assumptions from functional analysis plus the definition of a squashing function; no free parameters or new entities are introduced in the abstract.

axioms (2)
  • domain assumption: The continuous dual X* separates points on X.
    Invoked in the abstract as the key hypothesis on the topological vector space X that enables the density.
  • domain assumption: Ψ is a continuous squashing function.
    Stated as the activation function; the precise definition of squashing is assumed known from prior literature.
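For orientation, the definition of a squashing function commonly used in the prior literature (for example in Hornik, Stinchcombe, and White, reference [5] below) is recalled here. Whether the paper adopts exactly this definition is an assumption; the abstract only adds that Ψ is continuous.

```latex
% Squashing function in the sense of Hornik, Stinchcombe, and White (1989):
\[
\Psi\colon \mathbb{R}\to[0,1] \ \text{non-decreasing},
\qquad
\lim_{t\to-\infty}\Psi(t)=0,
\qquad
\lim_{t\to+\infty}\Psi(t)=1 .
\]
% Example: the logistic function \Psi(t) = 1/(1+e^{-t}) is a continuous squashing function.
```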

pith-pipeline@v0.9.0 · 5682 in / 1246 out tokens · 56350 ms · 2026-05-22T01:46:56.523049+00:00 · methodology

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

  1. [1]

    V. I. Bogachev, Measure Theory, Vols. I–II, Springer, Berlin–Heidelberg, 2007

  2. [2]

    Á. Capel and J. Ocáriz, Approximation with neural networks in variable Lebesgue spaces, arXiv:2007.04166, 2020

  3. [3]

    G. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals and Systems 2 (1989), 303–314

  4. [4]

    G. B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed., Wiley, New York, 1999

  5. [5]

    K. Hornik, M. Stinchcombe, and H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (1989), no. 5, 359–366

  6. [6]

    K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural Networks 4 (1991), no. 2, 251–257

  7. [7]

    V. E. Ismailov, Universal approximation theorem for neural networks with inputs from a topological vector space, Information Processing Letters 193 (2026), Article 106623

  8. [8]

    M. Izuki, T. Noi, Y. Sawano, and H. Tanaka, Some density theorems in neural network with variable exponent, Mediterranean Journal of Mathematics 22 (2025), Article 180

  9. [9]

    M. Leshno, V. Ya. Lin, A. Pinkus, and S. Schocken, Multilayer feedforward networks with a nonpolynomial activation function can approximate any function, Neural Networks 6 (1993), no. 6, 861–867

  10. [10]

    J. Park and I. W. Sandberg, Approximation and radial-basis-function networks, Neural Computation 5 (1993), no. 2, 305–316

  11. [11]

    A. Pinkus, Approximation theory of the MLP model in neural networks, Acta Numerica 8 (1999), 143–195

  12. [12]

    S. Saini, A universal approximation theorem for neural networks with outputs in locally convex spaces, arXiv:2603.07242, 2026

  13. [13]

    J. R. Munkres, Topology, 2nd ed., Prentice Hall, Upper Saddle River, NJ, 2000

  14. [14]

    W. Rudin, Principles of Mathematical Analysis, 3rd ed., McGraw–Hill, New York, 1976

  15. [15]

    W. Rudin, Functional Analysis, 2nd ed., McGraw–Hill, New York, 1991

MOHAMMAD JAVAD BAGHBANBASHI, DEPARTMENT OF MATHEMATICS, INSTITUTE FOR ADVANCED STUDIES IN BASIC SCIENCES (IASBS), ZANJAN, 45137-66731, IRAN. Email address: baghban.mj@iasbs.ac.ir

ARASH GHORBANALIZADEH, DEPARTMENT OF MATHEMATICS, INSTITUTE FOR ADVANCED STUDIES IN BASIC SCIENCES (IASBS), ZANJAN, 45137-667...