Super-resolution meets machine learning: approximation of measures

H. N. Mhaskar

arXiv: 1907.04895 · v1 · pith:LWAJQ7YXnew · submitted 2019-07-10 · math.FA · cs.LG · stat.ML

Pith reviewed 2026-05-24 23:16 UTC · model grok-4.3

keywords: super-resolution, measure approximation, total variation, Fourier coefficients, recuperation operator, machine learning, deconvolution, functional analysis

The pith

An explicit recuperation operator approximates any measure of finite total variation from finitely many of its coefficients, with error bounds that cannot be improved in general.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper treats the task of recovering an approximating measure from a finite collection of its coefficients with respect to an orthonormal system when the only given information is that the target measure has finite total variation. It introduces a distance between the original and recovered measures, supplies an explicit formula for the recuperation operator, and derives explicit bounds on this distance that remain valid even when the support is a continuum. These bounds are shown to be optimal in several distinct senses. The same framework yields a lower bound on approximation quality for finitely supported measures whenever the number of available coefficients falls below the threshold required for classical super-resolution. The setting also covers related problems in machine learning and deconvolution.

Core claim

We define an appropriate notion of a distance between the target measure and its recuperated version, give an explicit expression for the recuperation operator, and estimate the distance between μ and its approximation. We show that these estimates are the best possible in many different ways. We also explain why for a finitely supported measure the approximation quality of its recuperation is bounded from below if the amount of information is smaller than what is demanded in the super-resolution problem.

What carries the argument

The recuperation operator that maps the given finite set of coefficients to an approximating measure while controlling the defined distance between measures.
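
To make the idea of mapping finitely many coefficients to an approximating measure concrete, here is a minimal sketch. It is not the paper's operator: it assumes the circle [0, 2π) with the trigonometric system, and it swaps in the classical Fejér (triangular) filter as a stand-in for whatever kernel the paper uses. The function names `coefficients` and `fejer_density` are hypothetical.

```python
import numpy as np

def coefficients(points, weights, n):
    """Coefficients mu_hat(k) = sum_j w_j * exp(-i k x_j), |k| <= n, of a discrete measure.

    Assumption: trigonometric system on [0, 2*pi); the paper allows general orthonormal systems.
    """
    k = np.arange(-n, n + 1)
    mu_hat = np.exp(-1j * np.outer(k, points)) @ weights
    return k, mu_hat

def fejer_density(k, mu_hat, grid):
    """Density of the Fejer-filtered partial sum, evaluated on `grid`.

    Stand-in filter (assumption): h(k) = 1 - |k| / (n + 1).
    """
    n = int(k.max())
    filt = 1.0 - np.abs(k) / (n + 1.0)
    values = np.exp(1j * np.outer(grid, k)) @ (filt * mu_hat)
    return values.real / (2.0 * np.pi)

# Illustrative target: three point masses with nonnegative weights.
points = np.array([1.0, 1.3, 4.0])
weights = np.array([0.5, 0.2, 0.3])

k, mu_hat = coefficients(points, weights, n=16)
grid = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
density = fejer_density(k, mu_hat, grid)

# Riemann sum on a uniform periodic grid: exact for trigonometric polynomials of this degree.
mass = density.sum() * (2.0 * np.pi / grid.size)
```

With this stand-in filter the reconstructed density is nonnegative whenever the weights are, and its total mass equals that of the target measure. The quality of such an approximation in the paper's distance is a separate question that this sketch does not address.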

If this is right

Error bounds hold uniformly for measures supported on continua as well as on discrete sets.
The derived estimates are sharp and cannot be replaced by strictly smaller quantities that work for all measures of finite total variation.
When the number of coefficients is below the super-resolution threshold, any approximation of a finitely supported measure must incur a positive lower bound on the distance error.
The same recuperation construction applies directly to inverse problems of deconvolution and to certain machine-learning tasks that involve recovering measures.
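
The role of the information threshold can be illustrated numerically. The sketch below is not the paper's argument; it assumes trigonometric coefficients on [0, 2π) and uses the hypothetical helper `coefficients`. It compares two equal point masses at distance `eps` with a single point mass at their midpoint, and shows that their first coefficients nearly coincide when `n * eps` is small.

```python
import numpy as np

def coefficients(points, weights, n):
    """mu_hat(k) = sum_j w_j * exp(-i k x_j) for |k| <= n (trigonometric system, an assumption)."""
    k = np.arange(-n, n + 1)
    return np.exp(-1j * np.outer(k, points)) @ weights

n, eps, a = 16, 1e-3, 2.0

# Two equal spikes at distance eps versus one spike at their midpoint.
two_spikes = coefficients(np.array([a, a + eps]), np.array([0.5, 0.5]), n)
one_spike = coefficients(np.array([a + eps / 2]), np.array([1.0]), n)

# Largest coefficient discrepancy over |k| <= n.
gap = np.max(np.abs(two_spikes - one_spike))

# Elementary bound: |cos(k * eps / 2) - 1| <= (k * eps)^2 / 8.
bound = (n * eps) ** 2 / 8
```

Here `gap` stays below `bound`, so data of this size barely separates the two measures. Any recovery method that sees only these coefficients, with even slight noise, cannot tell them apart reliably.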

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The operator construction could be adapted to other orthonormal systems or kernels that arise in density estimation.
The optimality statements imply information-theoretic limits on how well any method can recover continuous measures from limited coefficient data.
Numerical implementation of the explicit operator would allow direct testing on synthetic continuum-supported measures.
Connections to existing deconvolution algorithms in signal processing may yield hybrid methods that inherit the same optimality guarantees.

Load-bearing premise

The target measure has finite total variation and nothing more is assumed about its support or the separation of its points.
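
For reference, the standard definition behind this premise can be written as follows. The symbols $X$ (the underlying space) and $\phi_k$ (the orthonormal system) are notation assumed here for illustration, and the conjugation convention in the coefficients is likewise an assumption rather than something taken from the paper.

```latex
% Total variation of a signed or complex measure \mu on X:
|\mu|(X) \;=\; \sup\Big\{ \sum_{i} |\mu(A_i)| \;:\; \{A_i\}
    \text{ a countable measurable partition of } X \Big\} \;<\; \infty .

% Coefficients with respect to a bounded orthonormal system \{\phi_k\}
% (conjugation convention assumed):
\hat{\mu}(k) \;=\; \int_X \overline{\phi_k}\, d\mu ,
\qquad
|\hat{\mu}(k)| \;\le\; \sup_{x \in X} |\phi_k(x)| \cdot |\mu|(X) .
```

The last inequality shows why finite total variation alone is enough for every coefficient to be well defined and bounded, provided each $\phi_k$ is bounded.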

What would settle it

Exhibit a specific measure of finite total variation together with its first N coefficients such that every possible recuperation operator produces a distance error strictly larger than the claimed bound.

Original abstract

The problem of super-resolution in general terms is to recuperate a finitely supported measure $\mu$ given finitely many of its coefficients $\hat{\mu}(k)$ with respect to some orthonormal system. The interesting case concerns situations, where the number of coefficients required is substantially smaller than a power of the reciprocal of the minimal separation among the points in the support of $\mu$. In this paper, we consider the more severe problem of recuperating $\mu$ approximately without any assumption on $\mu$ beyond having a finite total variation. In particular, $\mu$ may be supported on a continuum, so that the minimal separation among the points in the support of $\mu$ is $0$. A variant of this problem is also of interest in machine learning as well as the inverse problem of de-convolution. We define an appropriate notion of a distance between the target measure and its recuperated version, give an explicit expression for the recuperation operator, and estimate the distance between $\mu$ and its approximation. We show that these estimates are the best possible in many different ways. We also explain why for a finitely supported measure the approximation quality of its recuperation is bounded from below if the amount of information is smaller than what is demanded in the super-resolution problem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper gives an explicit operator that recovers any measure of finite total variation from finitely many of its Fourier coefficients, and proves matching upper and lower bounds that hold without separation assumptions.

The core contribution is an explicit operator that turns a finite set of Fourier coefficients into an approximation to any measure with finite total variation, together with error bounds that remain valid when the support is a continuum. The author also shows that these rates are sharp in several senses and notes that the usual super-resolution separation requirement still applies if the target is known to be discrete. That combination of explicit construction and optimality statements for the harder continuum case is the part worth attention. The total-variation-only assumption is minimal and the argument avoids hidden discretization steps, which keeps the result clean on its own terms. The lower-bound discussion for the discrete case is consistent with the existing literature rather than contradicting it. The main limitation visible from the abstract is that the full derivations are still needed to confirm that the operator stays computable and that the distance defined between measures does not introduce extra constants that weaken the optimality claim. If those steps check out, the work sits squarely in approximation theory and inverse problems. Readers working on deconvolution or measure recovery will find the explicit operator and the continuum extension useful; others can skip it. The paper is coherent enough, and the claims are stated sharply enough, that it should go to referees rather than be desk-rejected.

Referee Report

0 major / 3 minor

Summary. The paper considers the super-resolution problem of recovering a measure μ with finite total variation (possibly with continuous support and zero minimal separation) from finitely many Fourier coefficients. It defines a suitable distance between μ and its recovered version, supplies an explicit recuperation operator, derives error bounds between μ and the approximation, and establishes that these bounds are sharp in multiple senses. It further shows that for finitely supported measures the approximation quality is bounded away from zero whenever the number of coefficients falls below the threshold required by classical super-resolution.

Significance. If the explicit operator construction and the matching upper/lower bounds hold for arbitrary finite-TV measures, the result would provide a parameter-free approximation scheme linking super-resolution to machine learning and to inverse problems such as deconvolution. The absence of separation assumptions and the sharpness statements constitute the main potential contribution.

minor comments (3)

The abstract states that the estimates are 'the best possible in many different ways,' yet the precise senses of optimality (e.g., which norms or which classes of measures) are not enumerated; a short clarifying sentence or reference to the relevant theorem would improve readability.
Notation for the recuperation operator and the distance functional should be introduced with a displayed equation early in the introduction so that the subsequent claims can be stated more compactly.
The final paragraph on the limitation for finitely supported measures would benefit from an explicit comparison (perhaps in a remark) to the classical minimal-separation condition in the super-resolution literature.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our work and the recommendation for minor revision. The referee's description accurately reflects the paper's contributions on super-resolution for measures of finite total variation without separation assumptions, the explicit recuperation operator, and the optimality of the error bounds.

Circularity Check

0 steps flagged

No significant circularity detected

The paper constructs an explicit recuperation operator from given Fourier coefficients and the total-variation norm, then derives matching upper and lower bounds on a defined distance for arbitrary finite-TV measures. These steps rely on standard properties of orthonormal systems and the definition of total variation; no equation reduces to a fitted parameter renamed as a prediction, no self-citation supplies a load-bearing uniqueness theorem, and the finite-TV premise is used directly without smuggling an ansatz or renaming a known empirical pattern. The derivation is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Ledger extracted from abstract only; full paper may introduce additional parameters or axioms.

axioms (2)

domain assumption: The target measure μ has finite total variation.
Stated explicitly as the sole assumption placed on μ.
standard math: Coefficients are taken with respect to some orthonormal system.
The problem is defined in terms of these coefficients.

pith-pipeline@v0.9.0 · 5752 in / 1269 out tokens · 23147 ms · 2026-05-24T23:16:48.790577+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: We define an appropriate notion of a distance between the target measure and its recuperated version, give an explicit expression for the recuperation operator, and estimate the distance between $\mu$ and its approximation.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear

Paper passage: Theorem 4.1 ... $|\!|\!|\nu_n - \mu|\!|\!|_{G;p} \le c\,\{2^{-n(\beta - q/p')}\,|\mu|(X) + \|P_n\|_1\}$

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

