Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 2(4):303–314

George Cybenko · 1989

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · other: 1

citation-polarity summary

background: 1 · unclear: 1

representative citing papers

Any-Dimensional Invariant Universality

cs.LG · 2026-05-22 · unverdicted · novelty 8.0

A systematic approach maps any-dimensional invariant functions to a unique function on an infinite-dimensional limit space admitting a topology with compact sets where universality holds, with examples of non-universal architectures and fixes.

Approximation of Maximally Monotone Operators: A Graph Convergence Perspective

cs.LG · 2026-05-12 · unverdicted · novelty 7.0

Any maximally monotone operator can be approximated in local graph convergence by continuous encoder-decoder networks, with structure-preserving versions that retain maximal monotonicity via resolvent parameterizations.

Learning on the Temporal Tangent Bundle for Physics-Informed Neural Networks

math.NA · 2026-04-11 · unverdicted · novelty 7.0

Parameterizing the temporal derivative in PINNs and reconstructing via Volterra integral yields 100-200x lower errors on advection, Burgers, and Klein-Gordon equations while proving equivalence to the original PDE.

Isotropic Activation Functions Enable Deindividuated Neurons and Adaptive Topologies

cs.NE · 2026-02-26 · unverdicted · novelty 7.0

Isotropic activation functions derived from reparameterisation symmetries and SVD diagonalisation enable function-preserving neuron removal and addition in dense networks, supporting up to 50% sparsification and real-time topology adaptation.

Training Transformers for KV Cache Compressibility

cs.LG · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

Training transformers with KV sparsification during continued pretraining produces representations that admit better post-hoc KV cache compression, improving quality under memory budgets for long-context tasks.

Structural Correspondence and Universal Approximation in Diagonal plus Low-Rank Neural Networks

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

Diagonal plus Low-Rank (DLoR) neural networks achieve universal approximation for general activations by additive or multiplicative decompositions of full-rank transformations.

Solving and learning advective multiscale Darcian dynamics with the Neural Basis Method

math.NA · 2026-02-19 · unverdicted · novelty 6.0

The Neural Basis Method uses a predefined neural basis space and operator residual metric to deliver accurate single solves and fast parametric learning for multiscale Darcian dynamics.

citing papers explorer

Showing 7 of 7 citing papers.

Any-Dimensional Invariant Universality

cs.LG · 2026-05-22 · unverdicted · none · ref 30

Approximation of Maximally Monotone Operators: A Graph Convergence Perspective

cs.LG · 2026-05-12 · unverdicted · none · ref 21

Learning on the Temporal Tangent Bundle for Physics-Informed Neural Networks

math.NA · 2026-04-11 · unverdicted · none · ref 9

Isotropic Activation Functions Enable Deindividuated Neurons and Adaptive Topologies

cs.NE · 2026-02-26 · unverdicted · none · ref 21

Training Transformers for KV Cache Compressibility

cs.LG · 2026-05-07 · unverdicted · none · ref 14 · 2 links

Structural Correspondence and Universal Approximation in Diagonal plus Low-Rank Neural Networks

cs.LG · 2026-05-07 · unverdicted · none · ref 14

Solving and learning advective multiscale Darcian dynamics with the Neural Basis Method

math.NA · 2026-02-19 · unverdicted · none · ref 18
