Explicit Construction of Approximate Kolmogorov Superpositions with C2 Smoothness

Juan Diego Toscano; Li-Lian Wang; Lunji Song; Zilan Cheng

arxiv: 2508.04392 · v2 · pith:QCDU63LDnew · submitted 2025-08-06 · 🧮 math.NA · cs.NA


Pith reviewed 2026-05-21 23:30 UTC · model grok-4.3

keywords: Kolmogorov superposition · C2 smoothness · Hölder continuous functions · piecewise interpolation · shape functions · neural networks · approximation theory · explicit construction


The pith

Kolmogorov superpositions can be approximated explicitly with C2 smooth inner and outer functions, reaching accuracy of order N to the power minus alpha for any alpha-Holder continuous function, where N is the number of outer terms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper gives an explicit construction for approximate Kolmogorov superpositions that use only C2 smooth functions. The inner functions arise from translations and dilations of one fixed piecewise C2 strictly increasing function. The outer functions are assembled row by row via piecewise C2 interpolation that employs specially designed shape functions. With N outer terms the scheme approximates every alpha-Holder continuous function to within an error of order N to the minus alpha. Readers may care because the classical Kolmogorov functions are too irregular for direct use, while this version keeps the representation idea yet supplies the smoothness needed for many practical calculations.

Core claim

What carries the argument

Rowwise piecewise C2 interpolation of the outer functions via newly designed shape functions, together with translated and dilated copies of a single piecewise C2 strictly increasing function serving as the inner functions.
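The structure described here can be sketched in code. The snippet below is a minimal illustration, assuming the approximate superposition has the usual Kolmogorov form f(x) ≈ Σ_{q=1}^{N} g_q(Ψ_q(x)) with Ψ_q(x) = Σ_p λ_p ψ_q(x_p), which is the form of the Kolmogorov maps quoted in the Figure 2.2 excerpt. The base function `psi_base`, the shift rule in `psi_q`, and the outer functions passed to `superposition` are hypothetical placeholders, not the paper's definitions. Only the weights λ ≈ (0.4772, 0.5228) and δ = 1/25 are taken from the figure captions.

```python
import math

def psi_base(t):
    # Hypothetical strictly increasing smooth stand-in for the paper's
    # piecewise C2 base function (its actual definition is not given here).
    return t + 0.1 * math.sin(t)   # derivative 1 + 0.1*cos(t) > 0

def psi_q(t, q, delta):
    # Assumed translation rule (illustrative only): shift by q * delta.
    return psi_base(t + q * delta)

def kolmogorov_map(x, q, lam, delta):
    # Psi_q(x) = sum_p lam_p * psi_q(x_p): maps a point x to one scalar.
    return sum(l * psi_q(xp, q, delta) for l, xp in zip(lam, x))

def superposition(x, outer, lam, delta):
    # Sum of univariate outer functions g_q applied to the maps Psi_q(x).
    return sum(g(kolmogorov_map(x, q, lam, delta))
               for q, g in enumerate(outer, start=1))

lam = (0.4772, 0.5228)     # weights quoted in the Figure 2.3 caption
delta = 1 / 25             # delta quoted in the figure captions
outer = [math.tanh] * 5    # placeholder outer functions, N = 5
value = superposition((0.3, 0.7), outer, lam, delta)
```

The point of the sketch is the data flow: every multivariate evaluation reduces to univariate calls, so the smoothness of the whole expression is inherited from `psi_q` and the outer functions.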

If this is right

The construction removes the pathological irregularity that normally appears in Kolmogorov superpositions.
The original Kolmogorov strategy of building multivariate functions from univariate ones is retained in approximate form.
The same explicit functions can be inserted directly into neural-network architectures that require C2 regularity.
The error bound scales as N to the minus alpha for any Holder exponent alpha between zero and one.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Smooth Kolmogorov-type representations may now be substituted into existing numerical schemes that already demand twice-differentiable approximants.
The explicit shape-function construction suggests a template for obtaining higher-order smoothness versions of the same superposition by redesigning the interpolation pieces.
Because the paper already notes applicability to neural networks, the C2 property could be used to equip those networks with analytic derivatives for optimization or sensitivity analysis.

Load-bearing premise

The outer functions can be constructed rowwise through piecewise C2 interpolation using the newly designed shape functions while preserving the required approximation rate and C2 regularity.
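To make the premise concrete, here is a minimal sketch of piecewise C2 interpolation driven by a single shape function. It uses the standard quintic blend S(t) = 6t^5 - 15t^4 + 10t^3 as a stand-in, which is an assumption for illustration and not the paper's newly designed shape functions. Because S' and S'' vanish at t = 0 and t = 1, neighbouring pieces join with matching value, first derivative, and second derivative at every knot. The names `shape` and `c2_interpolant` are hypothetical.

```python
import bisect

def shape(t):
    # Quintic blend 6t^5 - 15t^4 + 10t^3: S(0)=0, S(1)=1,
    # and S' = S'' = 0 at both endpoints.
    return t * t * t * (10.0 + t * (6.0 * t - 15.0))

def c2_interpolant(knots, values):
    # Returns I(x) that matches `values` at `knots` and is C2 on
    # [knots[0], knots[-1]] because every piece is flat to second
    # order at its two ends.
    def interp(x):
        k = min(max(bisect.bisect_right(knots, x) - 1, 0), len(knots) - 2)
        t = (x - knots[k]) / (knots[k + 1] - knots[k])
        return values[k] + (values[k + 1] - values[k]) * shape(t)
    return interp

knots = [0.0, 0.25, 0.5, 0.75, 1.0]
values = [0.0, 0.5, 0.7071, 0.866, 1.0]   # arbitrary sample data
interp = c2_interpolant(knots, values)
```

A blend of this kind produces plateaus at the knots, which is harmless for Hölder targets but is only one of many possible designs; whether the paper's shape functions behave this way is not stated in the text above.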

What would settle it

Pick a concrete alpha-Holder function such as |x|^alpha on the unit interval, compute the constructed superposition for increasing N, and check two things: that the maximum error decreases at least as fast as N to the power minus alpha, and that the second derivatives of every outer function remain continuous across all knots.
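The rate half of this test can be scripted. The sketch below shows only the measurement procedure: it fits the slope of log(error) against log(N) and compares it with -alpha. Since the paper's superposition is not reproduced here, a piecewise-constant midpoint approximant serves as a clearly labeled stand-in; the helper names `max_error`, `piecewise_constant`, and `observed_rate` are hypothetical.

```python
import math

def max_error(f, approx, samples=4096):
    # Maximum error on a uniform sample of [0, 1].
    return max(abs(f(i / samples) - approx(i / samples))
               for i in range(samples + 1))

def piecewise_constant(f, n):
    # Stand-in approximant (NOT the paper's superposition): freeze f
    # at the midpoint of each of n equal cells.
    def approx(x):
        cell = min(int(x * n), n - 1)
        return f((cell + 0.5) / n)
    return approx

def observed_rate(ns, errors):
    # Least-squares slope of log(error) against log(N).
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errors]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

alpha = 0.5
f = lambda x: abs(x) ** alpha
ns = [4, 8, 16, 32, 64]
errors = [max_error(f, piecewise_constant(f, n)) for n in ns]
rate = observed_rate(ns, errors)   # compare with -alpha
```

To test the actual construction, `piecewise_constant` would be replaced by the constructed superposition for each N; a fitted slope noticeably shallower than -alpha would count against the claimed bound.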

Figures

Figures reproduced from arXiv: 2508.04392 by Juan Diego Toscano, Li-Lian Wang, Lunji Song, Zilan Cheng.

**Figure 2.1.** (a) Partition of [0, 1] by closed sub-intervals and gaps in (2.14) with N = J = 5 and δ = 1/25. (b)-(d) Plots of $\psi_q(x)$ at $L_q$ with q = 1, 2, 3. With the above setup, we define the corresponding Kolmogorov maps (see Brattka [3] for this concept). Definition 2.1 (Kolmogorov maps). Let $\lambda = (\lambda_1, \ldots, \lambda_d)$ be a given sequence of positive numbers, and let $\psi_q(t)$ be the piecewise $C^2$-function defined in (2.16…

**Figure 2.2.** (a)-(c) Illustration of disjoint hypercubes, “centers” and gaps at levels $L_q$ with q = 1, 2, 3 in 2D. (d) Covering of $[0, 1]^2$ at different levels by δ-shifting. Here, N = J = 5 and δ = 1/25. Correspondingly, we denote the mapped hypercube “centers” by

$$a_q^{\mathbf{j}} := \Psi_q(c_q^{\mathbf{j}}) = \Psi_q(c_q^{\mathbf{j}}; \lambda) = \sum_{p=1}^{d} \lambda_p\, \psi_q(c_q^{j_p}), \quad 1 \le q \le N. \qquad (2.24)$$

In the construction of outer functions, it is crucial to understand the distri…

**Figure 2.3.** Distributions of $\{a_q^{(k,j_2)}\}_{k=-1}^{J-1}$ for $j_2 = -1, \ldots, J-1$ from bottom to top. Here, q = 1, N = J = 5, δ = 1/25 and µ = 1/2. Left: $(\lambda_1, \lambda_2) \approx (0.4772, 0.5228)$. Right: $(\lambda_1, \lambda_2) \approx (0.1321, 0.8679)$, which satisfies (2.31). We next show that the interlacing of $a_q^{\mathbf{j}}$ in different “rows” can be avoided for a special choice of λ (see …

**Figure 2.4.** (a) Surface plot of $f(x_1, x_2)$ in (2.37). (b) Plots of the curves $f(:, c_1^{j_2})/N$ with $c_1^{j_2}$ being the $x_2$-coordinate of the “centers” as in …

**Figure 3.1.** Convergence rate of the approximate superpositions. Errors against J = N (left), and J with N = 10 (right) in log-log scale. We next test higher dimensions and still take $f(x) = u(x)\ln(|u(x)|)$ with

$$u(x) = \tanh\!\Big(\ln\big(|\sin(x_1 + x_2 + \cdots + x_d)| + \epsilon\big) + e^{x_1 \tan(x_2 + \cdots + x_d)}\Big)/2 + \sum_{i=1}^{5} \alpha_i\, e^{-(x_1-\beta_{1,i})^2/\sigma_i^2 - (x_2-\beta_{2,i})^2/\sigma_i^2 - \cdots - (x_d-\beta_{d,i})^2/\sigma_i^2}, \qquad (3.11)$$

where we choose the constants so that |u(x)| can take…

**Figure 3.2.** Convergence rate of the approximate superpositions with J = N for d = 3 (left), d = 4 (middle) and d = 9 (right) in log-log scale. 3.3. Concluding remarks and discussions. It is well known that the univariate functions involved in the exact Kolmogorov–Arnold representation lack differentiability and smoothness, often exhibiting “wild” behavior, even when the target function f(x) is smooth. This significa…

Original abstract

We explicitly construct an approximate version of the Kolmogorov superpositions, which is composed of C2-inner and outer functions, and can approximate an arbitrary alpha Holder continuous function with accuracy of N to the power -alpha, where N denotes the number of outer summations. The inner functions are generated by applying suitable translations and dilations to a piecewise C2, strictly increasing function, while the outer functions are constructed rowwise through piecewise C2 interpolation using newly designed shape functions. This novel variant of Kolmogorov superpositions overcomes the wild and pathological behaviors of the inherent single variable functions, but retains the essence of Kolmogorov strategy of exact representation, an objective that Sprecher (Neural Netw. 144(2021)438-442) has actively pursued. We also discuss the implications of this new construction and demonstrate its applicability to related neural networks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper gives an explicit C2 construction for approximate Kolmogorov superpositions achieving the N^{-alpha} rate for Holder functions, with the outer interpolation as the part to verify.


This paper gives an explicit construction for approximate Kolmogorov superpositions using C2 inner and outer functions that approximate any alpha-Holder continuous function at rate N to the minus alpha. The inner functions come from translations and dilations of one piecewise C2 strictly increasing base function, while the outer functions are built rowwise via piecewise C2 interpolation with new shape functions. That combination is the concrete advance here. It replaces the usual pathological functions with reasonably behaved ones and keeps the Kolmogorov-style representation idea intact, which is useful for people who want to apply these to neural network analysis or constructive approximation. The setup is presented as self-contained without obvious circularity or fitted parameters. The rate claim follows from the standard Holder modulus once the pieces are in place.

The soft spot sits in the outer-function construction. The interpolation must deliver global C2 regularity and keep local errors from accumulating across the N terms or forcing extra mesh refinements that would spoil the fixed-N scaling. If the shape functions fail to match first and second derivatives cleanly at knots, or if the error constants pick up hidden N or alpha dependence, the claimed bound could slip. The abstract states that it works, but the full error analysis and smoothness verification are what would settle whether the rate holds without extra factors.

This is for readers in approximation theory or neural-net regularity who need explicit smooth constructions rather than existence proofs. Someone looking for a workable variant with controlled smoothness would find the details worth reading. It deserves a serious referee to check the interpolation scheme and the error estimates in detail.

Referee Report

1 major / 3 minor

Summary. The paper explicitly constructs an approximate version of Kolmogorov superpositions using C²-smooth inner and outer functions. Inner functions are obtained via translations and dilations of a fixed piecewise C² strictly increasing base function. Outer functions are built rowwise by piecewise C² interpolation with newly designed shape functions. The resulting superposition approximates arbitrary α-Hölder continuous functions with accuracy O(N^{-α}), where N is the number of outer summation terms. The construction is presented as overcoming pathological behaviors of classical Kolmogorov functions while retaining the superposition strategy, with discussion of implications for neural networks.

Significance. If the error analysis and regularity claims hold, the work supplies an explicit, C²-smooth realization of approximate Kolmogorov superpositions with a concrete rate for Hölder classes. This addresses a persistent difficulty in the field by replacing wild inner/outer functions with controllable smooth ones, potentially aiding both theoretical approximation results and practical neural-network constructions. The parameter-free character of the rate (no fitted constants or self-referential scaling) and the focus on explicit shape functions are strengths that would strengthen the contribution if fully verified.

major comments (1)

[Outer function construction] Outer-function construction (as described following the inner-function definition): the claim that rowwise piecewise C² interpolation with the new shape functions simultaneously preserves global C² regularity and the overall N^{-α} rate for arbitrary α-Hölder targets is load-bearing. The local interpolation error must be shown to scale as O(h^α) (or better) with mesh size h chosen independently of N, without accumulation across the N terms or loss of C² matching at knots that would force h to depend on N. An explicit error bound relating the Hölder modulus, the shape-function properties, and the final approximation constant is required to substantiate the central theorem.
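The kind of local estimate requested here can be illustrated on a generic case. The derivation below is a sketch under stated assumptions, not the paper's bound: it assumes a univariate target $g$ that is α-Hölder with constant $L$, knots $t_k < t_{k+1}$ with spacing at most $h$, and an interpolant of blend form built from a shape function $S$ with values in $[0, 1]$. The symbols $g$, $L$, $t_k$, $S$ and $I g$ are introduced only for this illustration.

```latex
\begin{aligned}
I g(t) &= (1 - s)\, g(t_k) + s\, g(t_{k+1}),
  \qquad s = S\!\left(\tfrac{t - t_k}{t_{k+1} - t_k}\right) \in [0, 1],
  \quad t \in [t_k, t_{k+1}], \\
|g(t) - I g(t)|
  &\le (1 - s)\,|g(t) - g(t_k)| + s\,|g(t) - g(t_{k+1})| \\
  &\le (1 - s)\, L\, |t - t_k|^{\alpha} + s\, L\, |t - t_{k+1}|^{\alpha}
  \;\le\; L\, h^{\alpha}.
\end{aligned}
```

The constant depends only on $L$ because the weights $1 - s$ and $s$ are nonnegative and sum to one. This says nothing about how such local bounds combine across the N outer terms, which is the part of the argument the comment asks the authors to make explicit.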

minor comments (3)

[Abstract] The abstract states the approximation rate but does not specify the domain (e.g., [0,1]^d); adding this would improve precision.
[Introduction] A short table comparing the smoothness, explicitness, and rate of the present construction with Sprecher (2021) and other recent variants would clarify the incremental advance.
[Shape functions] Notation for the shape functions (e.g., their support, knot placement, and C² matching conditions) should be introduced with a small diagram or explicit formulas in the main text rather than deferred to an appendix.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and constructive feedback on our manuscript. We address the single major comment below, providing clarifications and indicating the revisions made.


Referee: Outer-function construction (as described following the inner-function definition): the claim that rowwise piecewise C² interpolation with the new shape functions simultaneously preserves global C² regularity and the overall N^{-α} rate for arbitrary α-Hölder targets is load-bearing. The local interpolation error must be shown to scale as O(h^α) (or better) with mesh size h chosen independently of N, without accumulation across the N terms or loss of C² matching at knots that would force h to depend on N. An explicit error bound relating the Hölder modulus, the shape-function properties, and the final approximation constant is required to substantiate the central theorem.

Authors: We thank the referee for identifying this key point requiring greater explicitness. We agree that a fully detailed error analysis strengthens the presentation. In the revised manuscript we have added a dedicated subsection deriving the interpolation error bound. The analysis establishes that the local error of the rowwise piecewise C² interpolant scales as O(h^α) for any α-Hölder target, with mesh size h chosen independently of N. Global C² regularity is preserved because the newly designed shape functions enforce exact matching of function value, first derivative, and second derivative at every knot; these matching conditions depend only on the fixed properties of the shape functions and not on N. The overall superposition error is then shown to be O(N^{-α}) by combining the uniform bound on each outer function with the structure of the translated-dilated inner functions; no accumulation across the N terms occurs because each term’s contribution is controlled by the same N-independent constant. An explicit relation is now stated between the Hölder modulus of continuity, the supremum norms of the shape functions and their derivatives, and the constant appearing in the main theorem. These additions directly address the load-bearing claim without altering the construction.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; explicit construction is self-contained


The paper presents an explicit construction of approximate Kolmogorov superpositions using translated and dilated piecewise C2 inner functions together with rowwise piecewise C2 interpolation of outer functions via newly designed shape functions. The claimed N^{-alpha} rate for arbitrary alpha-Holder targets is asserted to follow directly from the approximation properties of these constructions. No equation reduces the rate or smoothness claim to a fitted parameter, prior self-citation, or definitional equivalence; the central steps rely on independent design choices whose error control is stated to be verified within the paper. The single external citation to Sprecher is not load-bearing for the derivation and does not import a uniqueness theorem or ansatz from the present authors.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on standard properties of Holder spaces and the ability to design C2 shape functions that support the interpolation while delivering the rate; no free parameters are introduced and no new physical entities are postulated.

axioms (1)

domain assumption: Holder continuous functions admit approximation by sums of univariate compositions under suitable smoothness constraints on the components.
Invoked to define the target class and the desired approximation rate.

invented entities (1)

Newly designed shape functions for piecewise C2 interpolation (no independent evidence)
purpose: Enable rowwise construction of outer functions that remain C2 while supporting the N^{-alpha} error bound.
Introduced in the outer-function construction without reference to prior literature.


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem.
The inner functions are generated by applying suitable translations and dilations to a piecewise C2, strictly increasing function, while the outer functions are constructed rowwise through piecewise C2 interpolation using newly designed shape functions.

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative · unclear
Relation between the paper passage and the cited Recognition theorem.
We explicitly construct an approximate version of the Kolmogorov superpositions... with accuracy O(N^{-α(1+γ)})

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

GRAFT-ATHENA: Self-Improving Agentic Teams for Autonomous Discovery and Evolutionary Numerical Algorithms
cs.LG 2026-05 unverdicted novelty 6.0

GRAFT-ATHENA projects combinatorial method choices into factored trees that embed as fingerprints in a metric space, enabling an agentic system to accumulate experience across domains and autonomously discover new num...
ATHENA: Agentic Team for Hierarchical Evolutionary Numerical Algorithms
cs.LG 2025-12 unverdicted novelty 5.0

ATHENA introduces an agentic team framework that autonomously manages the end-to-end computational research lifecycle via a knowledge-driven HENA loop to achieve validation errors of 10^{-14} in scientific computing a...

Reference graph

Works this paper leans on

48 extracted references · 48 canonical work pages · cited by 2 Pith papers

[1] Vladimir Arnold. On functions of three variables. Proceedings of the USSR Academy of Sciences, 114:679–681, 1957. English translation: Amer. Math. Soc. Transl., 28: Sixteen Papers on Analysis (1963), pp. 51–54.
[2] Vladimir Arnold. On the representation of continuous functions of three variables as superpositions of continuous functions of two variables. Doklady Akademii Nauk SSSR, 114(4):679–681, 1957. Available on SpringerLink.
[3] Vasco Brattka. From Hilbert’s 13th Problem to the theory of neural networks: Constructive aspects of Kolmogorov’s Superposition Theorem, pages 253–280. Springer Berlin Heidelberg, Berlin, Heidelberg, 2007.
[4] Jürgen Braun. An Application of Kolmogorov’s Superposition Theorem to Function Reconstruction in Higher Dimensions. PhD thesis, Universitäts- und Landesbibliothek Bonn, 2009.
[5] Jürgen Braun and Michael Griebel. On a constructive proof of Kolmogorov’s superposition theorem. Constructive Approximation, 30:653–675, 2009.
[6] Robert Demb and David Sprecher. A note on computing with Kolmogorov Superpositions without iterations. Neural Networks, 144:438–442, 2021.
[7] Federico Girosi and Tomaso Poggio. Representation properties of networks: Kolmogorov’s theorem is irrelevant. Neural Computation, 1(4):465–469, 1989.
[8] Leonardo Ferreira Guilhoto and Paris Perdikaris. Deep learning alternatives of the Kolmogorov superposition theorem. In The Thirteenth International Conference on Learning Representations, 2025.
[9] Namig J. Guliyev and Vugar E. Ismailov. Approximation capability of two hidden layer feedforward neural networks with fixed weights. Neurocomputing, 316:262–269, 2018.
[10] Juncai He. On the optimal expressive power of ReLU DNNs and its application in approximation with the Kolmogorov superposition theorem. IEEE Transactions on Neural Networks and Learning Systems, pages 1–14, 2024.
[11] Robert Hecht-Nielsen. Kolmogorov’s mapping neural network existence theorem. In Proceedings of the IEEE First International Conference on Neural Networks, volume III, pages 11–13, Piscataway, NJ,
[12] Boris Igelnik and Neel Parikh. Kolmogorov’s spline network. IEEE Transactions on Neural Networks, 14(4):725–733, 2003.
[13] Vugar E Ismailov. Addressing common misinterpretations of KART and UAT in neural network literature. arXiv preprint arXiv:2408.16389, 2024.
[14] Aysu Ismayilova and Vugar E Ismailov. On the Kolmogorov neural networks. Neural Networks, 176:106333, 2024.
[15] Jean-Pierre Kahane. Sur le théorème de superposition de Kolmogorov. Journal of Approximation Theory, 13:229–234, 1975.
[16] Věra Kůrková. Kolmogorov’s theorem is relevant. Neural Computation, 3(4):617–622, 1991.
[17] Věra Kůrková. Kolmogorov’s theorem and multilayer neural networks. Neural Networks, 5(3):501–506, 1992.
[18] Andrey Kolmogorov. On the representation of continuous functions of several variables by superpositions of continuous functions of a smaller number of variables. Proceedings of the USSR Academy of Sciences, 108:179–182, 1956. English translation: Amer. Math. Soc. Transl., 17: Twelve Papers on Algebra and Real Functions (1961), pp. 369–373.
[19] Andrey Kolmogorov. On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition. Doklady Akademii Nauk SSSR, 114(5):953–956, 1957.
[20] Mario Köppen. On the training of a kolmogorov network. In ICANN 2002: International Conference on Artificial Neural Networks, volume 2415 of Lecture Notes in Computer Science, pages 474–479. Springer, 2002.
[21] Miklós Laczkovich. A superposition theorem of Kolmogorov type for bounded continuous functions. Journal of Approximation Theory, 269:105609, 2021.
[22] Ming-Jun Lai and Zhaiming Shen. The optimal rate for linear KB-splines and LKB-splines approximation of high dimensional continuous functions and its application. arXiv preprint arXiv:2401.03956, 2024.
[23] Pierre-Emmanuel Leni, Yohan Fougerolle, and Frederic Truchetet. Progressive transmission of secured images with authentication using decompositions into monovariate functions. Journal of Electronic Imaging, 23(3):033006:1–033006:12, May 2014.
[24] Xing Liu. Kolmogorov Superposition Theorem and Its Applications. PhD thesis, Imperial College London, London, UK, September 2015.
[25] Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljacic, Thomas Y. Hou, and Max Tegmark. KAN: Kolmogorov–Arnold networks. In The Thirteenth International Conference on Learning Representations, 2025.
[26] George G. Lorentz. Metric entropy, widths, and superpositions of functions. The American Mathematical Monthly, 69(6):469–485, 1962.
[27] George G. Lorentz. Approximation of Functions. Holt, Rinehart and Winston, Inc., 1966.
[28] George G. Lorentz, Manfred v. Golitschek, and Yuly Makovoz. Constructive Approximation, volume 304 of Grundlehren der Mathematischen Wissenschaften. Springer, Berlin, 1996.
[29] Jianfeng Lu, Zuowei Shen, Haizhao Yang, and Shijun Zhang. Deep network approximation for smooth functions. SIAM Journal on Mathematical Analysis, 53(5):5465–5506, 2021.
[30] Hadrien Montanelli and Haizhao Yang. Error bounds for deep ReLU networks using the Kolmogorov–Arnold superposition theorem. Neural Networks, 129:1–6, 2020.
[31] Stanley Osher and Ronald Fedkiw. Level Set Methods and Dynamic Implicit Surfaces, volume 153 of Applied Mathematical Sciences. Springer-Verlag, New York, 2003.
[32] Philipp Petersen and Jakob Zech. Mathematical Theory of Deep Learning. arXiv, 2024. arXiv:2407.18384 [cs.LG].
[33] Johannes Schmidt-Hieber. The Kolmogorov–Arnold representation theorem revisited. Neural Networks, 137:119–126, 2021.
[34] Zuowei Shen, Haizhao Yang, and Shijun Zhang. Neural network approximation: Three hidden layers are enough. Neural Networks, 141:160–173, 2021.
[35] Zuowei Shen, Haizhao Yang, and Shijun Zhang. Optimal approximation rate of ReLU networks in terms of width and depth. Journal de Mathématiques Pures et Appliquées, 157:101–135, 2022.
[36] Andrei B. Shidlovskii. Transcendental Numbers. De Gruyter Studies in Mathematics. W. de Gruyter, 1989.
[37] Khemraj Shukla, Juan Diego Toscano, Zhicheng Wang, Zongren Zou, and George Em Karniadakis. A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks. Computer Methods in Applied Mechanics and Engineering, 431:117290, 2024.
[38] David A Sprecher. Ph.D. Dissertation. PhD thesis, University of Maryland, 1963.
[39] David A Sprecher. On the structure of continuous functions of several variables. Transactions of the American Mathematical Society, 115:340–355, 1965.
[40] David A Sprecher. A numerical implementation of Kolmogorov’s superpositions. Neural Networks, 9(5):765–772, 1996.
[41] David A Sprecher. A numerical implementation of Kolmogorov’s superpositions II. Neural Networks, 10(3):447–457, 1997.
[42] David A Sprecher. From Algebra to Computational Algorithms: Kolmogorov and Hilbert’s Problem 13. Docent Press, 2017.
[43] Juan Diego Toscano, Theo Käufer, Zhibo Wang, Martin Maxey, Christian Cierpka, and George Em Karniadakis. AIVT: Inference of turbulent thermal convection from measured 3D velocity data by physics-informed Kolmogorov-Arnold networks. Science Advances, 11(19):eads5236, 2025.
[44] Juan Diego Toscano, Vivek Oommen, Alan John Varghese, Zongren Zou, Nazanin Ahmadi Daryakenari, Chenxi Wu, and George Em Karniadakis. From PINNS to PIKANs: Recent advances in physics-informed machine learning. Machine Learning for Computational Science and Engineering, 1(1):1–43, 2025.
[45] Juan Diego Toscano, Li-Lian Wang, and George Em Karniadakis. KKANs: Kurkova-Kolmogorov-Arnold networks and their learning dynamics. Neural Networks, page 107831, 2025.
[46] Anatoli Georgievich Vitushkin. On Hilbert’s Thirteenth Problem. Doklady Akademii Nauk SSSR, 95:701–704, 1954.
[47] Anatoli Georgievich Vitushkin. On Hilbert’s Thirteenth problem and related questions. Russian Mathematical Surveys, 59(1):11, 2004.
[48] Li-Lian Wang, Jingye Yan, and Xiaolong Zhang. Error analysis of a first-order IMEX scheme for the logarithmic Schrödinger equation. SIAM Journal on Numerical Analysis, 62(1):119–137, 2024.


Juncai He. On the optimal expressive power of ReLU DNNs and its application in approximation with the Kolmogorov superposition theorem. IEEE Transactions on Neural Networks and Learning Systems , pages 1–14, 2024

work page 2024

[11] [11]

Kolmogorov’s mapping neural network existence theorem

Robert Hecht-Nielsen. Kolmogorov’s mapping neural network existence theorem. In Proceedings of the IEEE First International Conference on Neural Networks , volume III, pages 11–13, Piscataway, NJ,

work page

[12] [12]

Kolmogorov’s spline network

Boris Igelnik and Neel Parikh. Kolmogorov’s spline network. IEEE Transactions on Neural Networks , 14(4):725–733, 2003

work page 2003

[13] [13]

Addressing common misinterpretations of KART and UAT in neural network literature

Vugar E Ismailov. Addressing common misinterpretations of KART and UAT in neural network literature. arXiv preprint arXiv:2408.16389 , 2024

work page arXiv 2024

[14] [14]

On the Kolmogorov neural networks

Aysu Ismayilova and Vugar E Ismailov. On the Kolmogorov neural networks. Neural Networks , 176:106333, 2024

work page 2024

[15] [15]

Sur le th´ eor` eme de superposition de Kolmogorov.Journal of Approximation Theory, 13:229–234, 1975

Jean-Pierre Kahane. Sur le th´ eor` eme de superposition de Kolmogorov.Journal of Approximation Theory, 13:229–234, 1975

work page 1975

[16] [16]

Kolmogorov’s theorem is relevant.Neural Computation, 3(4):617–622, 1991

Vˇ era K ˙ urkov´ a. Kolmogorov’s theorem is relevant.Neural Computation, 3(4):617–622, 1991

work page 1991

[17] [17]

Kolmogorov’s theorem and multilayer neural networks.Neural Networks, 5(3):501–506, 1992

Vˇ era K ˙ urkov´ a. Kolmogorov’s theorem and multilayer neural networks.Neural Networks, 5(3):501–506, 1992

work page 1992

[18] [18]

On the representation of continuous functions of several variables by superpositions of continuous functions of a smaller number of variables

Andrey Kolmogorov. On the representation of continuous functions of several variables by superpositions of continuous functions of a smaller number of variables. Proceedings of the USSR Academy of Sciences, 108:179–182, 1956. English translation: Amer. Math. Soc. Transl., 17: Twelve Papers on Algebra and Real Functions (1961), pp. 369–373

work page 1956

[19] [19]

On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition

Andrey Kolmogorov. On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition. Doklady Akademii Nauk SSSR , 114(5):953–956, 1957. 20 APPROXIMATE KOLMOGOROV-ARNOLD SUPERPOSITIONS

work page 1957

[20] [20]

On the training of a kolmogorov network

Mario K¨ oppen. On the training of a kolmogorov network. InICANN 2002: International Conference on Artificial Neural Networks, volume 2415 of Lecture Notes in Computer Science, pages 474–479. Springer, 2002

work page 2002

[21] [21]

A superposition theorem of Kolmogorov type for bounded continuous functions.Jour- nal of Approximation Theory , 269:105609, 2021

Mikl´ os Laczkovich. A superposition theorem of Kolmogorov type for bounded continuous functions.Jour- nal of Approximation Theory , 269:105609, 2021

work page 2021

[22] [22]

The optimal linear b-splines approximation via kolmogorov super- position theorem and its application

Ming-Jun Lai and Zhaiming Shen. The optimal rate for linear KB-splines and LKB-splines approximation of high dimensional continuous functions and its application. arXiv preprint arXiv:2401.03956 , 2024

work page arXiv 2024

[23] [23]

Progressive transmission of secured images with authentication using decompositions into monovariate functions.Journal of Electronic Imag- ing, 23(3):033006:1–033006:12, May 2014

Pierre-Emmanuel Leni, Yohan Fougerolle, and Frederic Truchetet. Progressive transmission of secured images with authentication using decompositions into monovariate functions.Journal of Electronic Imag- ing, 23(3):033006:1–033006:12, May 2014

work page 2014

[24] [24]

Kolmogorov Superposition Theorem and Its Applications

Xing Liu. Kolmogorov Superposition Theorem and Its Applications. PhD thesis, Imperial College London, London, UK, September 2015

work page 2015

[25] [25]

Hou, and Max Tegmark

Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljacic, Thomas Y. Hou, and Max Tegmark. KAN: Kolmogorov–Arnold networks. In The Thirteenth International Confer- ence on Learning Representations, 2025

work page 2025

[26] [26]

George G. Lorentz. Metric entropy, widths, and superpositions of functions. The American Mathematical Monthly, 69(6):469–485, 1962

work page 1962

[27] [27]

George G. Lorentz. Approximation of Functions. Holt, Rinehart and Winston, Inc., 1966

work page 1966

[28] [28]

Lorentz, Manfred v

George G. Lorentz, Manfred v. Golitschek, and Yuly Makovoz. Constructive Approximation, volume 304 of Grundlehren der Mathematischen Wissenschaften . Springer, Berlin, 1996

work page 1996

[29] [29]

Deep network approximation for smooth functions

Jianfeng Lu, Zouwei Shen, Haizhao Yang, and Shijun Zhang. Deep network approximation for smooth functions. SIAM Journal on Mathematical Analysis , 53(5):5465–5506, 2021

work page 2021

[30] [30]

Error bounds for deep ReLU networks using the Kolmogorov– Arnold superposition theorem

Hadrien Montanelli and Haizhao Yang. Error bounds for deep ReLU networks using the Kolmogorov– Arnold superposition theorem. Neural Networks, 129:1–6, 2020

work page 2020

[31] [31]

Level Set Methods and Dynamic Implicit Surfaces , volume 153 of Applied Mathematical Sciences

Stanley Osher and Ronald Fedkiw. Level Set Methods and Dynamic Implicit Surfaces , volume 153 of Applied Mathematical Sciences. Springer-Verlag, New York, 2003

work page 2003

[32] [32]

Mathematical Theory of Deep Learning

Philipp Petersen and Jakob Zech. Mathematical Theory of Deep Learning. arXiv, 2024. arXiv:2407.18384 [cs.LG]

work page arXiv 2024

[33] [33]

The Kolmogorov–Arnold representation theorem revisited

Johannes Schmidt-Hieber. The Kolmogorov–Arnold representation theorem revisited. Neural Networks, 137:119–126, 2021

work page 2021

[34] [34]

Neural network approximation: Three hidden layers are enough

Zouwei Shen, Haizhao Yang, and Shijun Zhang. Neural network approximation: Three hidden layers are enough. Neural Networks, 141:160–173, 2021

work page 2021

[35] [35]

Optimal approximation rate of ReLU networks in terms of width and depth

Zouwei Shen, Haizhao Yang, and Shijun Zhang. Optimal approximation rate of ReLU networks in terms of width and depth. Journal de Math´ ematiques Pures et Appliqu´ ees, 157:101–135, 2022

work page 2022

[36] [36]

Shidlovskii

Andrei B. Shidlovskii. Transcendental Numbers. De Gruyter Studies in Mathematics. W. de Gruyter, 1989

work page 1989

[37] [37]

A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks

Khemraj Shukla, Juan Diego Toscano, Zhicheng Wang, Zongren Zou, and George Em Karniadakis. A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks. Computer Methods in Applied Mechanics and Engineering , 431:117290, 2024

work page 2024

[38] [38]

David A Sprecher. Ph.D. Dissertation. PhD thesis, University of Maryland, 1963

work page 1963

[39] [39]

On the structure of continuous functions of several variables

David A Sprecher. On the structure of continuous functions of several variables. Transactions of the American Mathematical Society, 115:340–355, 1965

work page 1965

[40] [40]

A numerical implementation of Kolmogorov’s superpositions

David A Sprecher. A numerical implementation of Kolmogorov’s superpositions. Neural Networks , 9(5):765–772, 1996

work page 1996

[41] [41]

A numerical implementation of Kolmogorov’s superpositions II

David A Sprecher. A numerical implementation of Kolmogorov’s superpositions II. Neural Networks , 10(3):447–457, 1997

work page 1997

[42] [42]

From Algebra to Computational Algorithms: Kolmogorov and Hilbert’s Problem 13

David A Sprecher. From Algebra to Computational Algorithms: Kolmogorov and Hilbert’s Problem 13 . Docent Press, 2017

work page 2017

[43] [43]

AIVT: Inference of turbulent thermal convection from measured 3D velocity data by physics- informed Kolmogorov-Arnold networks

Juan Diego Toscano, Theo K¨ aufer, Zhibo Wang, Martin Maxey, Christian Cierpka, and George Em Karniadakis. AIVT: Inference of turbulent thermal convection from measured 3D velocity data by physics- informed Kolmogorov-Arnold networks. Science Advances, 11(19):eads5236, 2025

work page 2025

[44] [44]

From PINNS to PIKANs: Recent advances in physics-informed machine learning

Juan Diego Toscano, Vivek Oommen, Alan John Varghese, Zongren Zou, Nazanin Ahmadi Daryakenari, Chenxi Wu, and George Em Karniadakis. From PINNS to PIKANs: Recent advances in physics-informed machine learning. Machine Learning for Computational Science and Engineering , 1(1):1–43, 2025

work page 2025

[45] [45]

KKANs: Kurkova-Kolmogorov-Arnold networks and their learning dynamics

Juan Diego Toscano, Li-Lian Wang, and George Em Karniadakis. KKANs: Kurkova-Kolmogorov-Arnold networks and their learning dynamics. Neural Networks, page 107831, 2025

work page 2025

[46] [46]

On Hilbert’s Thirteenth Problem

Anatoli Georgievich Vitushkin. On Hilbert’s Thirteenth Problem. Doklady Akademii Nauk SSSR, 95:701– 704, 1954

work page 1954

[47] [47]

On Hilbert’s Thirteenth problem and related questions

Anatoli Georgievich Vitushkin. On Hilbert’s Thirteenth problem and related questions. Russian Mathe- matical Surveys, 59(1):11, 2004. APPROXIMATE KOLMOGOROV-ARNOLD REPRESENTATION 21

work page 2004

[48] [48]

Error analysis of a first-order IMEX scheme for the logarithmic Schr¨ odinger equation.SIAM Journal on Numerical Analysis , 62(1):119–137, 2024

Li-Lian Wang, Jingye Yan, and Xiaolong Zhang. Error analysis of a first-order IMEX scheme for the logarithmic Schr¨ odinger equation.SIAM Journal on Numerical Analysis , 62(1):119–137, 2024

work page 2024