Reliable sampling-based RKHS norm estimation via superconvergence
Pith reviewed 2026-05-20 03:32 UTC · model grok-4.3
The pith
Sampling-based estimation using superconvergence delivers reliable RKHS norm values for kernel methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors introduce a sampling-based procedure for estimating the reproducing kernel Hilbert space norm of a function by drawing on superconvergence properties of kernel interpolants. The approach yields reliable estimates for a wide class of target functions provided modest prior knowledge is available to tune the sampling.
What carries the argument
Superconvergence in kernel methods, which produces faster convergence rates for certain error functionals when sampling points satisfy specific conditions.
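As a minimal illustration of the quantity being sampled (not the authors' procedure), the sketch below uses the standard identity that the kernel interpolant s_{f,X} has squared RKHS norm y^T K(X,X)^{-1} y, which never exceeds the squared norm of f. The Matérn-3/2 kernel, the length scale, the target built from three kernel translates, and all function names are assumptions chosen so that the true norm is known in closed form.

```python
import numpy as np

def matern32(a, b, ell=0.3):
    """Matern-3/2 kernel matrix for 1-D point arrays a and b (assumed kernel)."""
    r = np.abs(a[:, None] - b[None, :]) / ell
    return (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

def interpolant_norm_sq(X, y):
    """Squared RKHS norm of the kernel interpolant: y^T K(X, X)^{-1} y."""
    return float(y @ np.linalg.solve(matern32(X, X), y))

# Hypothetical target: a finite combination of kernel translates,
# so its squared RKHS norm is coeffs^T K(centers, centers) coeffs.
centers = np.array([0.13, 0.47, 0.81])
coeffs = np.array([1.0, -0.5, 0.75])
true_norm_sq = float(coeffs @ matern32(centers, centers) @ coeffs)

def f(x):
    return matern32(x, centers) @ coeffs

# Nested uniform grids on [0, 1]; fill distance is half the grid spacing.
grid_sizes = [5, 9, 17, 33]
fill_distances = [0.5 / (m - 1) for m in grid_sizes]
norms_sq = []
for m in grid_sizes:
    X = np.linspace(0.0, 1.0, m)
    norms_sq.append(interpolant_norm_sq(X, f(X)))
gaps = [true_norm_sq - v for v in norms_sq]

for h, v, g in zip(fill_distances, norms_sq, gaps):
    print(f"h = {h:.5f}  interpolant norm^2 = {v:.6f}  gap to true = {g:.2e}")
```

On nested point sets the printed gap should shrink as the fill distance decreases; how fast it shrinks is what the superconvergence results quantify.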
If this is right
- Error bounds for kernel approximations become computable from samples alone.
- Safety certificates for learned controllers can be obtained without direct access to the norm.
- Kernel methods gain practical support for quantitative performance claims in system identification.
- Surrogate modeling applications receive verifiable accuracy statements from data.
Where Pith is reading between the lines
- The estimator could be embedded in iterative learning loops to track norm changes over time.
- Similar sampling ideas might extend to norm estimation for other linear functionals of kernel models.
- High-dimensional test cases would clarify how the number of samples scales with input dimension.
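To make the dimension question concrete, the short sketch below counts how many points a uniform grid on the unit cube needs to reach a given fill distance. The uniform grid, the unit-cube domain, and the function name are assumptions for illustration; the paper may use other point sets with different constants.

```python
import math

def grid_points_for_fill_distance(h, d):
    """Number of points in a uniform grid on [0, 1]^d with fill distance at most h."""
    # A grid with m points per axis (endpoints included) has spacing 1 / (m - 1);
    # the point farthest from the grid is a cell centre, at distance sqrt(d) / (2 (m - 1)).
    m = math.ceil(math.sqrt(d) / (2.0 * h)) + 1
    return m ** d

# Point counts needed for fill distance 0.1 in dimensions 1 to 6.
counts = {d: grid_points_for_fill_distance(0.1, d) for d in range(1, 7)}
for d, n in counts.items():
    print(f"d = {d}: {n} points")
```

The count grows roughly like h^{-d}, so any rate stated in terms of the fill distance translates into a sample budget that increases quickly with the input dimension.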
Load-bearing premise
The target function belongs to a reproducing kernel Hilbert space where superconvergence results hold and reasonable prior knowledge is available to set the sampling parameters correctly.
What would settle it
A concrete function in a qualifying RKHS for which the sampling estimator returns a value substantially different from the true norm even when all parameter choices follow the stated rules.
Original abstract
Kernel methods are one of the cornerstones of learning-based control, modern system identification, surrogate modelling, and related fields. A key advantage of this class of learning and function approximation methods is the availability of quantitative error bounds, which in turn play a key role in guaranteeing the safety of learned controllers and related learning-based algorithms. However, these error bounds rely on a particular property of the target function -- its reproducing kernel Hilbert space (RKHS) norm -- which is usually impossible to obtain in practice. Motivated by this severe shortcoming, we present a novel sampling-based RKHS norm estimation approach with a solid theoretical foundation, leveraging very recent advances in the theory of superconvergence in kernel methods. Our method is applicable to a broad range of practically relevant function classes and requires only reasonable prior knowledge about the target function. Extensive numerical experiments demonstrate the efficacy and practical applicability of the proposed method. By providing a reliable RKHS norm estimation approach, we remove a major obstacle to the practical deployment of learning-based control algorithms.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a sampling-based estimator for the RKHS norm of a target function, grounded in recent superconvergence results for kernel quadrature and interpolation. The approach is presented as theoretically sound for a range of function classes, requiring only reasonable prior knowledge (such as function class membership) to set sampling parameters, and is supported by numerical experiments demonstrating practical performance. The central motivation is to enable quantitative, verifiable error bounds for kernel methods in learning-based control and system identification.
Significance. A reliable, sampling-based RKHS-norm estimator would remove a key practical obstacle to deploying kernel methods with rigorous safety guarantees in control and surrogate modeling. The explicit use of superconvergence to achieve faster rates than standard Monte-Carlo sampling is a promising direction; if the guarantees hold under the stated 'reasonable prior knowledge' assumptions, the work would be a useful contribution to numerical analysis of kernel methods.
major comments (2)
- [Abstract and §3] Abstract and §3 (theoretical development): the claim that superconvergence yields a 'reliable' estimator with only 'reasonable prior knowledge' is load-bearing for the central contribution. Standard superconvergence results for kernel methods require the sampling distribution or regularization parameter to be tuned to the precise Mercer eigenvalue decay rate or Sobolev index; an upper bound alone typically collapses the rate to the standard O(n^{-1/2}) Monte-Carlo regime. The manuscript should state explicitly (via a theorem or corollary) whether the estimator retains a provably faster rate under approximate knowledge of these quantities, or whether the reliability claim is only asymptotic under exact knowledge.
- [§4] §4 (numerical experiments): the reported experiments use well-specified function classes and exact parameter knowledge. To support the practical-applicability claim, at least one table or figure should demonstrate performance under deliberately misspecified smoothness indices or eigenvalue-decay bounds (e.g., using an index 20 % higher than the true value). Without such a stress test, the numerical evidence does not yet address the robustness concern raised by the method's dependence on prior knowledge.
minor comments (2)
- [§2] Notation for the sampling measure and the superconvergence constant should be introduced once and used consistently; currently the same symbol appears to denote both the empirical measure and its continuous counterpart in different paragraphs.
- A short remark comparing the proposed estimator's sample complexity to existing Monte-Carlo or cross-validation approaches for RKHS-norm estimation would help readers gauge the practical gain.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. The comments raise important points about the scope of our theoretical guarantees and the need for robustness checks in the experiments. We address each major comment below and indicate the revisions we will make to the manuscript.
- Referee: [Abstract and §3] Abstract and §3 (theoretical development): the claim that superconvergence yields a 'reliable' estimator with only 'reasonable prior knowledge' is load-bearing for the central contribution. Standard superconvergence results for kernel methods require the sampling distribution or regularization parameter to be tuned to the precise Mercer eigenvalue decay rate or Sobolev index; an upper bound alone typically collapses the rate to the standard O(n^{-1/2}) Monte-Carlo regime. The manuscript should state explicitly (via a theorem or corollary) whether the estimator retains a provably faster rate under approximate knowledge of these quantities, or whether the reliability claim is only asymptotic under exact knowledge.
  Authors: We appreciate this observation, which helps us clarify the assumptions. Our theoretical development in §3 assumes that the user provides a bound on the eigenvalue decay rate that is at least as slow as the true rate (corresponding to a conservative estimate of the smoothness). Under this condition, we can prove that the sampling-based estimator achieves a convergence rate strictly faster than the standard Monte Carlo rate of O(n^{-1/2}). We will add Corollary 3.4 to explicitly state this result and show that for modest misspecifications (e.g., assuming a smoothness index up to 20% higher), the rate remains improved. The abstract will be revised to specify that 'reasonable prior knowledge' means a conservative upper bound on smoothness. We disagree that the claim is only asymptotic under exact knowledge; the faster rate holds under the stated approximate knowledge.
  Revision: yes
- Referee: [§4] §4 (numerical experiments): the reported experiments use well-specified function classes and exact parameter knowledge. To support the practical-applicability claim, at least one table or figure should demonstrate performance under deliberately misspecified smoothness indices or eigenvalue-decay bounds (e.g., using an index 20 % higher than the true value). Without such a stress test, the numerical evidence does not yet address the robustness concern raised by the method's dependence on prior knowledge.
  Authors: We agree that additional experiments under misspecification would strengthen the practical applicability claim. We will add a new figure and accompanying table in §4 that tests the estimator when the assumed Sobolev index is set 20% higher than the true value for the test functions. The results will show the effect on the estimated RKHS norm and the resulting error bounds. This revision will directly address the robustness concern.
  Revision: yes
Circularity Check
No circularity: derivation relies on external superconvergence theory
The paper presents a sampling-based RKHS norm estimator whose theoretical guarantees are explicitly grounded in recent external advances in superconvergence for kernel methods. No equations or steps in the abstract or described construction reduce the target norm estimate to a fitted parameter or self-defined quantity by construction. The method requires reasonable prior knowledge to set sampling parameters, but this is an input assumption rather than a self-referential fit. Self-citations, if present for the superconvergence results, are not load-bearing in a way that collapses the central claim; the derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Superconvergence properties hold for the kernel methods and function classes considered.
- domain assumption: Reasonable prior knowledge about the target function is available to set up the sampling procedure.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We leverage this theoretical result to propose two algorithms for estimating the RKHS norm... Fit model $h \mapsto c_1 - c_1' h^{\beta_1}$ ... to $(h_{X_i}, \|s_{f,X_i}\|^2_{H_k(\Omega)})$
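- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Theorem 2. Under Assumption 1, it holds for $\vartheta \in (1, 2]$: $\forall \vartheta' < \vartheta:\ f \in (H_k(\Omega))_{\vartheta'} \iff \forall \vartheta' < \vartheta\ \exists C > 0\ \forall X \subset \Omega:\ \|f - s_{f,X}\|_{H_k(\Omega)} \le C h_X^{(\vartheta' - 1)\tau}$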
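The fitting step quoted in the first passage can be sketched as follows. Because the interpolant is the orthogonal projection of f in the RKHS, the squared norms satisfy ∥f∥² − ∥s_{f,X}∥² = ∥f − s_{f,X}∥², so a bound of the kind stated in Theorem 2 suggests that the squared interpolant norm approaches its limit at a power-law rate in the fill distance. The code fits h ↦ c1 − c1' h^β by a grid search over β with linear least squares for the two coefficients; this fitting strategy, the function name `fit_power_law_limit`, and the synthetic data are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def fit_power_law_limit(h, v, betas=None):
    """Fit v ~ c1 - c1p * h**beta; returns (c1, c1p, beta).

    Hypothetical helper: grid search over beta, linear least squares for c1, c1p.
    """
    h = np.asarray(h, dtype=float)
    v = np.asarray(v, dtype=float)
    if betas is None:
        betas = np.linspace(0.1, 4.0, 391)  # assumed search range, step 0.01
    best = None
    for beta in betas:
        # For fixed beta the model is linear in (c1, c1p).
        A = np.column_stack([np.ones_like(h), -h**beta])
        coef, *_ = np.linalg.lstsq(A, v, rcond=None)
        sse = float(np.sum((A @ coef - v) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(coef[0]), float(coef[1]), float(beta))
    _, c1, c1p, beta = best
    return c1, c1p, beta

# Synthetic data only (not results from the paper): squared interpolant norms
# that approach the limit 2.0 with exponent 1.5 as the fill distance shrinks.
h_values = np.array([0.5 / (m - 1) for m in (5, 9, 17, 33, 65)])
norm_sq_values = 2.0 - 1.5 * h_values**1.5

c1_hat, c1p_hat, beta_hat = fit_power_law_limit(h_values, norm_sq_values)
print(f"extrapolated limit c1 = {c1_hat:.6f}, exponent beta = {beta_hat:.2f}")
```

In such a fit the intercept c1 is the extrapolated value as h tends to 0, which is how one would read off a squared-norm estimate; whether the paper's two algorithms use exactly this readout is not stated in the excerpt.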
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.