arXiv: 2503.24209 · v4 · submitted 2025-03-31 · 🧮 math.ST · math.PR · stat.TH

Optimal low-rank posterior mean and distribution approximation in linear Gaussian inverse problems on Hilbert spaces

Pith reviewed 2026-05-22 22:29 UTC · model grok-4.3

classification 🧮 math.ST · math.PR · stat.TH
keywords low-rank approximation · Gaussian posterior · inverse problems · Kullback-Leibler divergence · Hilbert spaces · projected forward model · Bayesian inference

The pith

Under reverse Kullback-Leibler loss, a jointly optimal low-rank approximation of the posterior mean and covariance in Hilbert-space linear Gaussian inverse problems can be obtained by solving the problem with a projected forward model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs optimal low-rank approximations to the Gaussian posterior in linear Gaussian inverse problems set in possibly infinite-dimensional separable Hilbert parameter spaces with finite-dimensional data. It first derives necessary and sufficient conditions for mean-only approximations with fixed covariance to be equivalent to the exact posterior. It then measures their data-averaged error in the Kullback-Leibler, Rényi and Amari α-divergences and the Hellinger distance, and finds the optimal approximations for the Kullback-Leibler and Rényi losses. For the reverse Kullback-Leibler divergence, the separately optimal mean and covariance approximations together give a jointly optimal low-rank approximation, and one such approximation is shown to be the posterior of the Bayesian inverse problem whose forward model has been composed with an optimal projector onto a low-dimensional subspace.

Core claim

For the reverse Kullback-Leibler divergence, the optimal approximations of the mean and of the covariance yield an optimal joint approximation of the mean and covariance. We interpret one such joint approximation in terms of an optimal projector in parameter space, and show that this approximation amounts to solving a Bayesian inverse problem with projected forward model.

What carries the argument

The optimal projector in parameter space: composing the forward model with it yields a posterior that is a jointly optimal low-rank approximation under the reverse Kullback-Leibler divergence.

If this is right

  • One jointly optimal low-rank approximation under the reverse Kullback-Leibler divergence is the posterior of the Bayesian inverse problem whose forward model is composed with the optimal projector.
  • Necessary and sufficient conditions on the approximating means guarantee equivalence to the exact posterior when the covariance is held fixed.
  • Optimal mean approximations with fixed covariance are found for the Kullback-Leibler and Rényi losses; the Amari α-divergences and the Hellinger distance are also used to measure the approximation error.
  • The optimality and equivalence results hold in the infinite-dimensional separable Hilbert setting and extend the corresponding finite-dimensional statements.
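
To make the fixed-covariance setting concrete, here is a minimal finite-dimensional sketch of the Kullback-Leibler divergence between two Gaussians that share a covariance. The function names are hypothetical and the finite-dimensional restriction is an assumption; in infinite dimensions the same quantity is finite only when the difference of the means lies in the Cameron-Martin space of the covariance, which is why equivalence conditions are needed there.

```python
import numpy as np

def kl_shared_covariance(m_approx, m_exact, cov):
    """KL divergence between N(m_approx, cov) and N(m_exact, cov).

    With a common covariance the trace and log-determinant terms cancel,
    leaving half the squared covariance-weighted distance between the means.
    """
    diff = np.asarray(m_approx, dtype=float) - np.asarray(m_exact, dtype=float)
    return 0.5 * float(diff @ np.linalg.solve(cov, diff))

def kl_gaussian(m0, c0, m1, c1):
    """General KL( N(m0, c0) || N(m1, c1) ) in finite dimensions, for comparison."""
    n = len(m0)
    diff = np.asarray(m1, dtype=float) - np.asarray(m0, dtype=float)
    trace_term = np.trace(np.linalg.solve(c1, c0))
    maha_term = diff @ np.linalg.solve(c1, diff)
    _, logdet0 = np.linalg.slogdet(c0)
    _, logdet1 = np.linalg.slogdet(c1)
    return 0.5 * float(trace_term + maha_term - n + logdet1 - logdet0)
```

Because the expression is symmetric in the two means, the forward and reverse Kullback-Leibler divergences coincide whenever the covariance is held fixed.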

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Existing solvers for finite-dimensional inverse problems can be reused directly on the projected model to obtain the optimal low-rank posterior without first computing the full infinite-dimensional posterior.
  • The same projection construction may supply cheap approximate posteriors in related sampling or uncertainty-quantification tasks that already employ projected models.
  • If analogous optimality relations hold for other divergences, the same projector could serve as a universal dimension-reduction device across multiple loss functions.

Load-bearing premise

The forward model is linear and the noise is Gaussian, so the posterior remains exactly Gaussian even in the infinite-dimensional Hilbert setting.
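
For orientation, the conjugate formulas behind this premise can be written as follows. This is the standard textbook form, not necessarily the paper's notation: the symbols $m_{\mathrm{pos}}$, $C_{\mathrm{pos}}$, $C_{\mathrm{pr}}$, $C_{\mathrm{obs}}$, $G$ and the zero prior mean are assumptions introduced here.

```latex
% Assumed setup: x ~ N(0, C_pr) on a separable Hilbert space,
% y = G x + eps with eps ~ N(0, C_obs) on R^n, G linear and bounded.
\begin{aligned}
m_{\mathrm{pos}}(y) &= C_{\mathrm{pr}} G^{*}
  \left( G C_{\mathrm{pr}} G^{*} + C_{\mathrm{obs}} \right)^{-1} y, \\
C_{\mathrm{pos}} &= C_{\mathrm{pr}} - C_{\mathrm{pr}} G^{*}
  \left( G C_{\mathrm{pr}} G^{*} + C_{\mathrm{obs}} \right)^{-1} G C_{\mathrm{pr}}.
\end{aligned}
```

Since the data space is finite-dimensional, the inverse above is that of an $n \times n$ matrix, and the posterior covariance does not depend on the data $y$.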

What would settle it

Compute the jointly optimal low-rank posterior under reverse Kullback-Leibler divergence by direct optimization of mean and covariance; if it differs from the posterior obtained by solving the inverse problem with the corresponding optimal projected forward model, the claimed equivalence fails.
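
A finite-dimensional version of part of this check can be sketched as below. It does not perform the direct optimization; it only compares the covariance of the projected problem with the low-rank update formula of Spantini et al. (2015), which the paper extends. The function names, the zero prior mean, and the particular projector built from the leading generalized eigenvectors are assumptions made for illustration, not the paper's construction.

```python
import numpy as np

def exact_posterior(G, C_pr, C_obs, y):
    """Posterior mean and covariance for x ~ N(0, C_pr), y = G x + noise."""
    S = G @ C_pr @ G.T + C_obs             # covariance of the data y
    K = C_pr @ G.T @ np.linalg.inv(S)      # gain mapping data to parameters
    return K @ y, C_pr - K @ G @ C_pr

def leading_directions(G, C_pr, C_obs, rank):
    """Leading generalized eigenpairs of (H, C_pr^{-1}), H = G^T C_obs^{-1} G."""
    L = np.linalg.cholesky(C_pr)           # C_pr = L L^T
    H = G.T @ np.linalg.solve(C_obs, G)
    evals, evecs = np.linalg.eigh(L.T @ H @ L)   # prior-preconditioned Hessian
    order = np.argsort(evals)[::-1][:rank]
    # Columns of W = L V satisfy H w = lambda C_pr^{-1} w and W^T C_pr^{-1} W = I.
    return evals[order], L @ evecs[:, order]

def projected_posterior(G, C_pr, C_obs, y, rank):
    """Posterior of the problem whose forward map is G composed with a projector."""
    _, W = leading_directions(G, C_pr, C_obs, rank)
    # Projector onto span(W), self-adjoint in the C_pr^{-1}-weighted inner product.
    P = W @ W.T @ np.linalg.inv(C_pr)
    return exact_posterior(G @ P, C_pr, C_obs, y)

def low_rank_update_covariance(G, C_pr, C_obs, rank):
    """Prior covariance minus a rank-r update (finite-dimensional formula)."""
    evals, W = leading_directions(G, C_pr, C_obs, rank)
    return C_pr - W @ np.diag(evals / (1.0 + evals)) @ W.T
```

In this sketch, agreement between `projected_posterior` and `low_rank_update_covariance` for a given rank is the finite-dimensional counterpart of the claimed equivalence for the covariance; a full test would also compare the means against a direct minimization of the data-averaged reverse Kullback-Leibler loss.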

Original abstract

We construct optimal low-rank approximations for the Gaussian posterior distribution in linear Gaussian inverse problems with possibly infinite-dimensional separable Hilbert parameter spaces and finite-dimensional data spaces. We first consider approximate posteriors in which the means vary and the posterior covariance is kept fixed, for all possible realisations of the data simultaneously. We give necessary and sufficient conditions for these approximating posteriors to be equivalent to the exact posterior. For such approximations, we measure the data-averaged approximation error with the Kullback-Leibler, Rényi and Amari α-divergences for α ∈ (0,1), and the Hellinger distance. With the loss in Kullback-Leibler and Rényi divergences, we find the optimal approximations and formulate an equivalent condition for their uniqueness, extending the work in finite dimensions of Spantini et al. (SIAM J. Sci. Comput. 2015). We then consider joint low-rank approximation of the mean and covariance. For the reverse Kullback-Leibler divergence, the optimal approximations of the mean and of the covariance yield an optimal joint approximation of the mean and covariance. We interpret one such joint approximation in terms of an optimal projector in parameter space, and show that this approximation amounts to solving a Bayesian inverse problem with projected forward model. Extensive numerical examples demonstrate some of our theoretical findings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper constructs optimal low-rank approximations to the posterior mean (with fixed covariance) and to the joint mean-covariance pair for the exact Gaussian posterior arising from linear-Gaussian inverse problems on separable Hilbert spaces with finite-dimensional data. It supplies necessary and sufficient conditions for equivalence of the approximating and exact measures, measures the data-averaged error in the KL, Rényi and Amari α-divergences and the Hellinger distance, derives the optimal approximations for the KL and Rényi losses, and shows that, for the reverse KL divergence, the separately optimal mean and covariance approximations are jointly optimal; one such joint approximation is realized by the posterior of the forward model composed with an optimal projector in parameter space. The finite-dimensional results of Spantini et al. are extended, and numerical examples are presented.

Significance. If the derivations are correct, the work supplies a rigorous infinite-dimensional extension of low-rank posterior approximation theory together with an explicit link to projected forward models. This is useful for computational Bayesian inversion on function spaces. Its main strengths are the explicit necessary and sufficient equivalence conditions and the clean reduction of the joint reverse-KL problem to a projected Bayesian inverse problem.

minor comments (3)
  1. [§2] In §2 (or the notation section), the measure used to average the divergences over the data (e.g., whether the expectation is taken with respect to the marginal on the data or the joint distribution) should be stated explicitly before the optimality theorems, as this affects the interpretation of the necessary and sufficient conditions.
  2. [Numerical examples] The numerical examples section would benefit from a short table listing the Hilbert-space dimensions, ranks, and divergence values attained, to make the claimed agreement with theory immediately verifiable.
  3. A reference to the precise statement of the Feldman–Hajek theorem used for equivalence of Gaussian measures on Hilbert space should be added when the necessary-and-sufficient conditions are first stated.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful reading and positive summary of our manuscript, as well as the recommendation for minor revision. The referee's description accurately captures the main contributions. No specific major comments appear in the report.

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

The paper derives optimal low-rank approximations for Gaussian posteriors in linear-Gaussian Hilbert-space inverse problems by applying standard properties of Gaussian measures (Feldman-Hajek equivalence) and common divergences (KL, Rényi, Hellinger) to the exact posterior; the finite-dimensional results of Spantini et al. are invoked only as an external benchmark that is extended, not as a load-bearing self-citation or self-definition. No equation or claim reduces by construction to a fitted input renamed as prediction, an ansatz smuggled via citation, or a uniqueness theorem imported from the authors' prior work. The joint-optimality statement for reverse KL is obtained directly from separate optimality conditions within the linear-Gaussian setting, without circular reduction to the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard properties of Gaussian measures on separable Hilbert spaces and finite-dimensional data spaces; no free parameters or invented entities are introduced.

axioms (2)
  • domain assumption The parameter space is a separable Hilbert space and the data space is finite-dimensional.
    Explicitly stated in the abstract as the setting for the inverse problems.
  • domain assumption The forward model is linear and the noise is additive Gaussian.
    Required for the posterior to remain exactly Gaussian, as presupposed by the title and abstract.

pith-pipeline@v0.9.0 · 5774 in / 1476 out tokens · 73000 ms · 2026-05-22T22:29:32.468647+00:00 · methodology


Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages · 1 internal anchor

  1. [1]

    S. Amari. Information geometry and its applications, volume 194 of Applied Mathematical Sciences. Springer, Tokyo, 2016

  2. [2]

    A. Ben-Israel and T. N. Greville. Generalized Inverses. CMS Books in Mathematics. Springer-Verlag, 2003

  3. [3]

    A. Beskos, F. J. Pinski, J. M. Sanz-Serna, and A. M. Stuart. Hybrid Monte Carlo on Hilbert spaces. Stochastic Processes and their Applications, 121(10):2201–2230, 2011

  4. [4]

    V. Bogachev. Gaussian Measures, volume 62 of Mathematical Surveys and Monographs. American Mathematical Society, 1998

  5. [5]

    H. Brezis. Functional Analysis, Sobolev Spaces and Partial Differential Equations. Springer, 2011

  6. [6]

    T. Bui-Thanh, C. Burstedde, O. Ghattas, J. Martin, G. Stadler, and L. C. Wilcox. Extreme-scale UQ for Bayesian inverse problems governed by PDEs. In 2012 Int. Conf. High Perform. Comput. Netw. Storage Anal., pages 1–11. IEEE, 2012

  7. [7]

    T. Bui-Thanh, O. Ghattas, J. Martin, and G. Stadler. A Computational Framework for Infinite-Dimensional Bayesian Inverse Problems Part I: The Linearized Case, with Application to Global Seismic Inversion. SIAM J. Sci. Comput., 35(6):A2494–A2523, 2013

  8. [8]

    T. Bui-Thanh and Q. P. Nguyen. FEM-based discretization-invariant MCMC methods for PDE-constrained Bayesian inverse problems. Inverse Probl. Imaging, 10(4):943–975, 2016

  9. [9]

    G. Carere and H. C. Lie. Generalised rank-constrained approximations of Hilbert-Schmidt operators on separable Hilbert spaces and applications. arXiv:2408.05104, 2024

  10. [10]

    G. Carere and H. C. Lie. Optimal low-rank posterior covariance approximation in linear Gaussian inverse problems on Hilbert spaces. arXiv:2411.01112, 2025

  11. [11]

    J. B. Conway. A Course in Functional Analysis, volume 96 of Graduate Texts in Mathematics. Springer, 2007

  12. [12]

    S. L. Cotter, G. O. Roberts, A. M. Stuart, and D. White. MCMC Methods for Functions: Modifying Old Algorithms to Make Them Faster. Statist. Sci., 28(3), 2013

  13. [13]

    T. Cui, K. J. H. Law, and Y. M. Marzouk. Dimension-independent likelihood-informed MCMC. J. Comput. Phys., 304:109–137, 2016

  14. [14]

    T. Cui, J. Martin, Y. Marzouk, A. Solonen, and A. Spantini. Likelihood-informed dimension reduction for nonlinear inverse problems. Inverse Problems, 30(11):28, 2014

  15. [15]

    T. Cui and X. T. Tong. A unified performance analysis of likelihood-informed subspace methods. Bernoulli, 28(4), 2022

  16. [16]

    T. Cui, X. T. Tong, and O. Zahm. Prior normalization for certified likelihood-informed subspace detection of Bayesian inverse problems. Inverse Problems, 38(12):124002, 2022

  17. [17]

    G. Da Prato and J. Zabczyk. Stochastic Equations in Infinite Dimensions. Encyclopedia of Mathematics and Its Applications. Cambridge University Press, second edition, 2014

  18. [18]

    H. W. Engl, M. Hanke, and A. Neubauer. Regularization of Inverse Problems, volume 375 of Mathematics and Its Applications. Springer Dordrecht, first edition, 1996

  19. [19]

    L. C. Evans. Partial Differential Equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, second edition, 2010

  20. [20]

    H. P. Flath, L. C. Wilcox, V. Akcelik, J. Hill, B. Van Bloemen Waanders, and O. Ghattas. Fast algorithms for Bayesian uncertainty quantification in large-scale linear inverse problems based on low-rank partial Hessian approximations. SIAM J. Sci. Comput., 33(1):407–432, 2011

  21. [21]

    S. Friedland and A. Torokhti. Generalized Rank-Constrained Matrix Approximations. SIAM J. Matrix Anal. Appl., 29(2):656–659, 2007

  22. [22]

    M. Hairer. An Introduction to Stochastic PDEs. arXiv:0907.4178, 2023

  23. [23]

    T. Hsing and R. Eubank. Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators. Wiley Series in Probability and Statistics. John Wiley & Sons, Ltd, Hoboken, 2015

  24. [24]

    H. König. Eigenvalue Distribution of Compact Operators, volume 16 of Operator Theory: Advances and Applications. Birkhäuser, 1986

  25. [25]

    R. Kretschmann. Nonparametric Bayesian Inverse Problems with Laplacian Noise. PhD thesis, University of Duisburg-Essen, 2019

  26. [26]

    M. T. C. Li, T. Cui, F. Li, Y. Marzouk, and O. Zahm. Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities. arXiv:2406.13036, 2024

  27. [27]

    M. T. C. Li, Y. Marzouk, and O. Zahm. Principal feature detection via ϕ-Sobolev inequalities. Bernoulli, 30(4):2979–3003, 2024

  28. [28]

    H. Q. Minh. Regularized Divergences Between Covariance Operators and Gaussian Measures on Hilbert Spaces. J. Theor. Probab., 34(2):580–643, 2021

  29. [29]

    F. Nielsen. The many faces of information geometry. Notices Amer. Math. Soc., 69(1):36–45, 2022

  30. [30]

    A. Pazy. Semigroups of Linear Operators and Applications to Partial Differential Equations, volume 44 of Applied Mathematical Sciences. Springer, 1983

  31. [31]

    F. J. Pinski, G. Simpson, A. M. Stuart, and H. Weber. Kullback–Leibler approximation for probability measures on infinite dimensional spaces. SIAM J. Math. Anal., 47(6):4091–4122, 2015

  32. [32]

    K. Ray and B. Szabó. Variational Bayes for High-Dimensional Linear Regression With Sparse Priors. J. Amer. Statist. Assoc., 117(539):1270–1281, 2022

  33. [33]

    M. Reed and B. Simon. Methods of Modern Mathematical Physics. I: Functional Analysis. Rev. and Enl. Ed, volume 1 of Methods of Modern Mathematical Physics. Academic Press, 1980

  34. [34]

    Y. Saad. Iterative Methods for Sparse Linear Systems. Society for Industrial and Applied Mathematics, second edition, 2003

  35. [35]

    B. Simon. Notes on infinite determinants of Hilbert space operators. Adv. Math., 24(3):244–273, 1977

  36. [36]

    B. Simon. Trace Ideals and Their Applications, volume 120 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, second edition, 2005

  37. [37]

    D. Sondermann. Best approximate solutions to matrix equations under rank restrictions. Statistische Hefte, 27(1):57–66, 1986

  38. [38]

    A. Spantini, T. Cui, K. Willcox, L. Tenorio, and Y. Marzouk. Goal-oriented optimal approximations of Bayesian linear inverse problems. SIAM J. Sci. Comput., 39(5):S167–S196, 2017

  39. [39]

    A. Spantini, A. Solonen, T. Cui, J. Martin, L. Tenorio, and Y. Marzouk. Optimal low-rank approximations of Bayesian linear inverse problems. SIAM J. Sci. Comput., 37(6):A2451–A2487, 2015

  40. [40]

    A. M. Stuart. Inverse problems: A Bayesian perspective. Acta Numer., 19:451–559, 2010

  41. [41]

    C. Thomas-Agnan. Computing a family of reproducing kernels for statistical applications. Numer. Algor., 13(1):21–32, 1996

  42. [42]

    T. van Erven and P. Harremoës. Rényi Divergence and Kullback-Leibler Divergence. IEEE Trans. Inform. Theory, 60(7):3797–3820, 2014

  43. [43]

    O. Zahm, T. Cui, K. Law, A. Spantini, and Y. Marzouk. Certified dimension reduction in nonlinear Bayesian inverse problems. Math. Comp., 91(336):1789–1835, 2022