Approximation of Functions: Optimal Sampling and Complexity

David Krieg; Mario Ullrich

arxiv: 2602.02066 · v2 · submitted 2026-02-02 · 🧮 math.NA · cs.IT· cs.NA· math.IT

Approximation of Functions: Optimal Sampling and Complexity

David Krieg , Mario Ullrich This is my paper

Pith reviewed 2026-05-16 08:41 UTC · model grok-4.3

classification 🧮 math.NA cs.ITcs.NAmath.IT

keywords function approximationoptimal samplinginformation-based complexityoptimal recoverynumerical analysisfunction recovery

0 comments

The pith

Finite function evaluations impose strict information-theoretic limits on approximation accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reviews how recovering or approximating a function from only a finite number of its point evaluations is constrained by the sheer amount of data available. Recent results have clarified these fundamental bounds in the settings of optimal recovery and information-based complexity. It surveys sampling strategies and algorithms designed to approach the best possible accuracy. The discussion also covers broader measurement types including nonlinear, adaptive, and random ones, and shows relations between these settings. The aim is to give both an introduction and an up-to-date overview of what is now known.

Core claim

The central claim is that any finite collection of function values creates an information-theoretic barrier to how well a function can be recovered, and that certain sampling rules and recovery procedures come close to this barrier across linear and nonlinear, adaptive and non-adaptive, deterministic and randomized measurements.

What carries the argument

The information-theoretic limit on worst-case or average-case error that is determined solely by the number and type of allowed measurements.

If this is right

For many standard smoothness classes the minimal error decays at a specific rate determined only by the number of evaluations.
Adaptive and non-adaptive sampling achieve the same rates in linear settings but may differ for nonlinear measurements.
Randomized sampling can match the performance of the best deterministic choice in expectation.
The same limits apply whether measurements are point evaluations or other linear functionals.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

These bounds give a benchmark that any practical numerical method or machine-learning surrogate must eventually respect.
The relations between measurement types suggest unified design principles for choosing data in high-dimensional or infinite-dimensional problems.
Direct extensions could include deriving explicit constants for concrete function spaces used in PDE solvers or signal processing.

Load-bearing premise

The results presuppose standard function classes and error measures from optimal recovery without always spelling out the precise norms or smoothness assumptions.

What would settle it

A concrete function class and norm for which no choice of n sampling points (even adaptive or random) yields approximation error matching the claimed lower bound would disprove the stated limits.

read the original abstract

We consider approximation or recovery of functions based on a finite number of function evaluations. This is a well-studied problem in optimal recovery, machine learning, and numerical analysis in general, but many fundamental insights were obtained only recently. We discuss different aspects of the information-theoretic limit that appears because of the limited amount of data available, as well as algorithms and sampling strategies that come as close to it as possible. We also discuss (optimal) sampling in a broader sense, allowing other types of measurements that may be nonlinear, adaptive and random, and present several relations between the different settings in the spirit of information-based complexity. We hope that this article provides both, a basic introduction to the subject and a contemporary summary of the current state of research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This is a clear but high-level survey of recent work on optimal sampling for function approximation that organizes existing insights without adding new theorems.

read the letter

The main thing to know is that this paper is an expository survey on approximating functions from finite evaluations. It pulls together ideas from optimal recovery and information-based complexity but does not present any new theorems or algorithms of its own. The abstract makes this framing explicit, and the content stays at the level of summarizing limits and strategies rather than deriving fresh results. That said, it does a solid job of laying out the information-theoretic bounds that arise from limited data and of showing how different sampling approaches, including nonlinear, adaptive, and random ones, relate to those bounds. The connections drawn between settings are useful for seeing the broader picture across machine learning and numerical analysis. The discussion of near-optimal strategies is the part that feels most current and could save a reader time hunting through scattered papers. The soft spot is the one flagged in the stress-test note. The paper assumes familiarity with standard function classes and norms without always restating the exact spaces or radii for each cited result. This is common in surveys, but it does make the claims harder to check without going back to the references, especially for readers new to the area. It is not a load-bearing flaw, just a limitation on how self-contained the text is. This paper is aimed at researchers or students who already know the basics of approximation theory and want a contemporary map of the literature. It could work in a reading group as background reading. It deserves serious refereeing because the topic is active and a well-organized summary has value even if it needs some added detail on the settings.

Referee Report

2 major / 2 minor

Summary. The manuscript is a survey on the approximation and recovery of functions from a finite number of evaluations. It reviews information-theoretic limits arising from limited data, near-optimal sampling strategies (including nonlinear, adaptive, and random measurements), and relations between these settings in the spirit of information-based complexity. The article positions itself as both a basic introduction and a contemporary summary of recent insights in optimal recovery, machine learning, and numerical analysis.

Significance. If the survey accurately consolidates recent results on information-theoretic limits and sampling strategies, it could serve as a useful overview for researchers working at the intersection of approximation theory, optimal recovery, and information-based complexity, particularly by highlighting connections across linear/nonlinear and adaptive/non-adaptive regimes.

major comments (2)

[Introduction] Introduction and abstract: The text repeatedly invokes 'standard settings from optimal recovery and information-based complexity' without restating the precise function classes (e.g., Sobolev W^{k,p}, analytic functions) or the underlying norms and radii of the unit balls for the cited minimax rates and sampling results. This omission is load-bearing because the information-theoretic limits, role of adaptivity, and linear-vs-nonlinear gaps are class-dependent.
[Relations between settings] Section discussing relations between measurement settings: The claimed relations between linear, nonlinear, adaptive, and random measurements are presented at a high level, but without anchoring each relation to a concrete function space and norm, it is impossible to assess whether the summarized 'near-optimal' strategies hold uniformly or only in specific regimes assumed by the referenced literature.

minor comments (2)

[Abstract] The abstract states that 'many fundamental insights were obtained only recently' but does not provide even one or two concrete examples with citations in the opening paragraphs; adding such signposts would improve readability for non-specialists.
[Notation] Notation for sampling operators and recovery maps is introduced without an early consolidated table or list of symbols, which would help readers track the distinctions between different measurement types.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address each major comment below and will incorporate clarifications in the revised manuscript.

read point-by-point responses

Referee: [Introduction] Introduction and abstract: The text repeatedly invokes 'standard settings from optimal recovery and information-based complexity' without restating the precise function classes (e.g., Sobolev W^{k,p}, analytic functions) or the underlying norms and radii of the unit balls for the cited minimax rates and sampling results. This omission is load-bearing because the information-theoretic limits, role of adaptivity, and linear-vs-nonlinear gaps are class-dependent.

Authors: We agree that explicitly restating the function classes and norms strengthens the presentation. In the revision we will expand the introduction to briefly recall the standard settings (e.g., Sobolev balls in W^{k,p} with the corresponding norm and radius, and analytic functions on the unit disk or strip) together with the associated minimax rates that appear in the cited literature. This will anchor the information-theoretic limits without altering the survey character of the article. revision: yes
Referee: [Relations between settings] Section discussing relations between measurement settings: The claimed relations between linear, nonlinear, adaptive, and random measurements are presented at a high level, but without anchoring each relation to a concrete function space and norm, it is impossible to assess whether the summarized 'near-optimal' strategies hold uniformly or only in specific regimes assumed by the referenced literature.

Authors: We accept the point. The revision will add short, concrete illustrations for each relation, using well-known cases such as Sobolev balls (where linear vs. nonlinear gaps are characterized) and analytic classes (where adaptivity yields no improvement). Each example will cite the precise norm and radius so that the reader can verify the regime in which the stated near-optimality holds. revision: yes

Circularity Check

0 steps flagged

Survey paper summarizes established results without self-referential derivations or fitted predictions

full rationale

This is a survey article that reviews information-theoretic limits, sampling strategies, and relations between measurement settings in optimal recovery and information-based complexity. No original derivations, equations, or predictions are introduced that could reduce to the paper's own inputs by construction. All cited results are drawn from prior external literature, with no load-bearing self-citations, self-definitional steps, or renaming of known results as new insights. The discussion presupposes standard settings but does not attempt to derive or predict quantities from its own fitted parameters or ansatzes.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a review paper; no new free parameters, axioms, or invented entities are introduced.

pith-pipeline@v0.9.0 · 5419 in / 827 out tokens · 27648 ms · 2026-05-16T08:41:35.553484+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We consider approximation or recovery of functions based on a finite number of function evaluations... sampling numbers g_n(F,Y) ... least squares algorithms ... relations between different settings in the spirit of information-based complexity.
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

The sampling numbers are well-studied for many (classical) classes of functions... bounds by best sparse approximation, entropy numbers, Hilbert numbers.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.