
arxiv: 2508.06179 · v2 · submitted 2025-08-08 · 🧮 math.ST · stat.TH

Consistency of variational inference for Besov priors in non-linear inverse problems

Pith reviewed 2026-05-19 00:57 UTC · model grok-4.3

classification 🧮 math.ST stat.TH
keywords variational inference · Besov priors · nonlinear inverse problems · PDE · posterior convergence rates · minimax optimality · wavelet expansions · Darcy flow

The pith

Variational posteriors with Besov priors match exact posterior convergence rates in nonlinear PDE inverse problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that variational inference can replace exact Bayesian posterior sampling in inverse problems for partial differential equations while preserving the same rates of convergence. The setup uses Besov priors built from random wavelet expansions whose coefficients follow p-exponential distributions, which naturally model parameters in Besov spaces B^α_pp. Under general conditions on the PDE forward operator, the variational posteriors attain the same contraction rates as the exact posterior for standard variational families such as Besov-type measures or mean-field families. These rates are minimax optimal over the Besov classes and improve on the rates from Gaussian priors by a polynomial factor. The theory is illustrated on the Darcy flow problem and the inverse potential problem for a subdiffusion equation, where prediction-loss rates are also shown to be optimal.
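As a rough illustration of the prior construction described above, the following Python sketch draws one sample from a truncated random wavelet expansion with p-exponential coefficients. It is a sketch under stated assumptions, not the paper's construction: the Haar basis on [0, 1), the level scaling 2^{-l(α + 1/2 - 1/p)}, the coefficient density proportional to exp(-|x|^p / p), and all function names are choices made here for illustration. Haar wavelets are also too rough to characterise Besov spaces of high smoothness, so the sketch only conveys the shape of the construction.

```python
import numpy as np


def sample_p_exponential(p, size, rng):
    """Draw i.i.d. samples with density proportional to exp(-|x|^p / p).

    Assumed normalisation: p = 2 gives a standard Gaussian, p = 1 a Laplace law.
    """
    # If G ~ Gamma(1/p, 1), then (p * G)^(1/p) has density prop. to exp(-x^p / p) on x > 0.
    magnitude = (p * rng.gamma(shape=1.0 / p, scale=1.0, size=size)) ** (1.0 / p)
    sign = rng.choice([-1.0, 1.0], size=size)
    return sign * magnitude


def haar_wavelet(level, shift, x):
    """L2-normalised Haar wavelet psi_{level, shift} evaluated at points x in [0, 1)."""
    t = (2.0 ** level) * x - shift
    positive = ((0.0 <= t) & (t < 0.5)).astype(float)
    negative = ((0.5 <= t) & (t < 1.0)).astype(float)
    return (2.0 ** (level / 2.0)) * (positive - negative)


def sample_besov_prior(alpha, p, max_level, x, rng):
    """One draw of a truncated wavelet series with p-exponential coefficients.

    The decay 2^{-level * (alpha + 1/2 - 1/p)} is an assumed one-dimensional scaling.
    """
    u = np.zeros_like(x, dtype=float)
    for level in range(max_level):
        scale = 2.0 ** (-level * (alpha + 0.5 - 1.0 / p))
        xi = sample_p_exponential(p, 2 ** level, rng)
        for shift in range(2 ** level):
            u += scale * xi[shift] * haar_wavelet(level, shift, x)
    return u


# Example usage: one draw with p = 1 (Laplace-type coefficients) on a uniform grid.
grid = np.linspace(0.0, 1.0, 256, endpoint=False)
draw = sample_besov_prior(alpha=1.0, p=1.0, max_level=6, x=grid, rng=np.random.default_rng(0))
```

Sampling the magnitude through a Gamma variable avoids rejection sampling and covers every p ≥ 1 with one code path.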

Core claim

Under general conditions on the PDE operator, variational posteriors constructed with Besov priors achieve convergence rates matching those of the exact posterior. These rates are minimax-optimal over the Besov classes B^α_pp with p ≥ 1 and outperform the suboptimal rates obtained with Gaussian priors by a polynomial factor. The results hold for widely used variational families and extend to prediction loss for PDE-constrained regression problems, as verified on the Darcy flow and inverse potential examples.

What carries the argument

The refined prior-mass-and-testing framework that controls the variational approximation error while preserving the contraction rate of the exact posterior.
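To make the object of study concrete, here is a schematic of how a variational posterior and its rate bound are usually written in prior-mass-and-testing arguments, in the style of the general theory of Zhang and Gao (2020). The symbols $\mathcal{Q}$ (variational family), $D_N$ (data), $\varepsilon_N$, $\gamma_N$ and the loss $d$ are notation assumed here for illustration; the paper's exact conditions and constants may differ.

```latex
% Variational posterior: a member of the family Q closest to the exact posterior
\hat{Q} \in \operatorname*{arg\,min}_{Q \in \mathcal{Q}}
  \mathrm{KL}\bigl(Q \,\|\, \Pi(\cdot \mid D_N)\bigr)

% Schematic bound: exact-posterior rate plus variational approximation error
E_{\theta_0}\,\hat{Q}\bigl[d(\theta,\theta_0)^2\bigr]
  \lesssim \varepsilon_N^2 + \gamma_N^2,
\qquad
\gamma_N^2 = \frac{1}{N}\,\inf_{Q \in \mathcal{Q}}
  E_{\theta_0}\,\mathrm{KL}\bigl(Q \,\|\, \Pi(\cdot \mid D_N)\bigr)
```

Read this way, the rate-matching claim amounts to showing that $\gamma_N \lesssim \varepsilon_N$ for the chosen family, for instance Besov-type measures or mean-field families.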

If this is right

  • Variational posteriors attain minimax-optimal rates over Besov classes B^α_pp.
  • Prediction-loss rates for PDE-constrained regression problems are also minimax optimal.
  • Besov priors improve on Gaussian priors by a polynomial factor in the same inverse-problem setting.
  • The matching rates hold for Besov-type and mean-field variational families under the stated operator conditions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same framework could be tested on other nonlinear operators that satisfy similar general conditions but arise in different application domains.
  • Efficient numerical implementations of the variational optimization step might now be benchmarked directly against exact posterior sampling on the Darcy flow example.
  • The polynomial improvement over Gaussian priors suggests examining whether other wavelet-based or sparsity-promoting priors yield comparable gains in related inverse problems.

Load-bearing premise

The PDE forward operator must satisfy the general conditions that let the prior-mass-and-testing framework bound the error from the variational approximation.
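For concreteness, here is a minimal sketch of what a nonlinear forward operator of Darcy type can look like in one dimension. Everything specific in it is an assumption for illustration: the equation -(e^θ u')' = g on (0, 1) with zero boundary values, the exponential link, the finite-difference discretisation, and the function name `darcy_forward_1d`. The paper's own setting and operator conditions are not reproduced here.

```python
import numpy as np


def darcy_forward_1d(theta, source, n_cells):
    """Solve -(exp(theta) u')' = source on (0, 1) with u(0) = u(1) = 0.

    theta and source are callables on arrays of points; the conductivity
    exp(theta) is sampled at cell midpoints (assumed discretisation).
    Returns the interior nodes and the solution values at those nodes.
    """
    h = 1.0 / n_cells
    nodes = np.linspace(0.0, 1.0, n_cells + 1)
    midpoints = 0.5 * (nodes[:-1] + nodes[1:])
    a = np.exp(theta(midpoints))  # a[j] is the conductivity between node j and node j + 1

    # Standard three-point stencil for the interior unknowns u_1, ..., u_{n-1}.
    main = (a[:-1] + a[1:]) / h**2
    off = -a[1:-1] / h**2
    matrix = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

    interior = nodes[1:-1]
    u = np.linalg.solve(matrix, source(interior))
    return interior, u


# Example usage: constant log-conductivity and unit source.
x, u = darcy_forward_1d(lambda s: np.zeros_like(s), lambda s: np.ones_like(s), 64)
```

With θ = 0 and g = 1 the discrete solution coincides with x(1 - x)/2, and shifting θ by a constant c rescales the solution by e^{-c}, which shows that the map from θ to u is not linear.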

What would settle it

A concrete nonlinear PDE inverse problem that satisfies the stated operator conditions, yet in which the variational posterior with a Besov prior contracts at a rate slower than the exact posterior by more than a polynomial factor.

Original abstract

This study investigates the variational posterior convergence rates of inverse problems for partial differential equations (PDEs) with parameters in Besov spaces $B_{pp}^\alpha$ ($p \geq 1$) which are modeled naturally in a Bayesian manner using Besov priors constructed via random wavelet expansions with $p$-exponentially distributed coefficients. Departing from exact Bayesian inference, variational inference transforms the inference problem into an optimization problem by introducing variational sets. Building on a refined ``prior mass and testing'' framework, we derive general conditions on PDE operators and guarantee that variational posteriors achieve convergence rates matching those of the exact posterior under widely adopted variational families (Besov-type measures or mean-field families). Moreover, our results achieve minimax-optimal rates over $B^{\alpha}_{pp}$ classes, significantly outperforming the suboptimal rates of Gaussian priors (by a polynomial factor). As specific examples, two typical nonlinear inverse problems, the Darcy flow problems and the inverse potential problem for a subdiffusion equation, are investigated to validate our theory. Besides, we show that our convergence rates of ``prediction'' loss for these ``PDE-constrained regression problems'' are minimax optimal.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that under general conditions on the PDE forward operators, variational posteriors with Besov priors achieve convergence rates in nonlinear inverse problems that match those of the exact Bayesian posterior and are minimax optimal over Besov classes B^α_pp. This is shown using a refined prior-mass-and-testing framework, with applications to Darcy flow and subdiffusion problems where prediction losses are also optimal. The variational families considered include Besov-type measures and mean-field families.

Significance. If the central claims hold, this paper would be significant for providing theoretical guarantees on variational inference in nonparametric Bayesian inverse problems with PDE constraints. It improves upon Gaussian prior results by a polynomial factor and extends to nonlinear settings. The strength lies in deriving general conditions on operators and demonstrating optimality for prediction loss in PDE-constrained regression.

major comments (2)
  1. [Applications to nonlinear inverse problems (Section 5)] The abstract and theory section assert that the Darcy flow problem and the inverse potential problem for the subdiffusion equation satisfy the general conditions on PDE operators required for the refined prior-mass-and-testing framework to bound the variational approximation error. However, the manuscript does not include an explicit verification that these specific nonlinear operators meet all the listed hypotheses, such as the local Lipschitz condition or the testing function requirements. This verification is essential because the rate-matching result for variational posteriors depends directly on these conditions holding for the examples.
  2. [Main theoretical result (Theorem 3.1)] The derivation shows that variational posteriors achieve the same rates as exact posteriors under the general conditions, but it would strengthen the paper to include a brief discussion or reference to how the minimax lower bounds are established or matched for the variational case specifically, to confirm the optimality claim is not just inherited but verified.
minor comments (2)
  1. The introduction could benefit from a clearer statement of the main contributions in a bulleted list to improve readability.
  2. [Notation section] Ensure consistent use of the parameter p in Besov spaces B^α_pp throughout the manuscript, particularly in the statements of rates.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments on our manuscript. We address each major comment point by point below, indicating the revisions we plan to incorporate.

Point-by-point responses
  1. Referee: [Applications to nonlinear inverse problems (Section 5)] The abstract and theory section assert that the Darcy flow problem and the inverse potential problem for the subdiffusion equation satisfy the general conditions on PDE operators required for the refined prior-mass-and-testing framework to bound the variational approximation error. However, the manuscript does not include an explicit verification that these specific nonlinear operators meet all the listed hypotheses, such as the local Lipschitz condition or the testing function requirements. This verification is essential because the rate-matching result for variational posteriors depends directly on these conditions holding for the examples.

    Authors: We agree that an explicit verification would strengthen the presentation. Although Section 5 states that the Darcy flow and subdiffusion examples satisfy the general hypotheses of Section 3, we did not provide a line-by-line check. In the revised manuscript we will add a dedicated appendix (or subsection) that verifies each required condition, including the local Lipschitz property of the forward map, the existence of suitable testing functions, and the remaining technical assumptions, for both nonlinear PDE examples. revision: yes

  2. Referee: [Main theoretical result (Theorem 3.1)] The derivation shows that variational posteriors achieve the same rates as exact posteriors under the general conditions, but it would strengthen the paper to include a brief discussion or reference to how the minimax lower bounds are established or matched for the variational case specifically, to confirm the optimality claim is not just inherited but verified.

    Authors: We appreciate the suggestion. The minimax optimality follows because Theorem 3.1 shows that the variational posterior attains the same rate as the exact posterior, and the exact posterior is already known to achieve the minimax rate over B^α_pp (see the references cited in the introduction and Section 2). To make this explicit rather than implicit, we will insert a short remark immediately after Theorem 3.1 that recalls the relevant minimax lower-bound results from the literature and explains why the rate-matching upper bound for the variational posterior directly implies minimax optimality in the variational setting. revision: yes
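The inheritance argument in this response can be written schematically as a two-sided bound. The notation is assumed for illustration ($\varepsilon_N$ for the contraction rate, $\hat{\theta}_N$ for a generic estimator, $d$ for the loss, $R$ for the radius of the Besov ball, $M$ for a large constant), and the rate is deliberately left unspecified because its explicit exponent is not given in this summary.

```latex
% Upper bound (rate matching): the variational posterior contracts at rate eps_N
\sup_{\theta_0 \in B^{\alpha}_{pp}(R)}
  E_{\theta_0}\,\hat{Q}\bigl(\theta : d(\theta,\theta_0) > M\varepsilon_N\bigr)
  \longrightarrow 0

% Lower bound (from the literature): no estimator beats eps_N uniformly
\inf_{\hat{\theta}_N}\ \sup_{\theta_0 \in B^{\alpha}_{pp}(R)}
  E_{\theta_0}\, d(\hat{\theta}_N,\theta_0) \gtrsim \varepsilon_N
```

Because contraction at a given rate typically yields a point estimator converging at the same rate, the two displays together would pin the variational rate at the minimax level. Whether the paper's upper bound is uniform over the ball in this sense is something the promised remark would need to state.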

Circularity Check

0 steps flagged

No circularity: derivation self-contained via abstract framework

Full rationale

The paper derives general conditions on PDE operators from a refined prior-mass-and-testing framework and shows that variational posteriors inherit exact-posterior convergence rates over Besov classes, with minimax optimality following as a consequence. Specific nonlinear examples (Darcy flow, subdiffusion) are presented to validate the abstract transfer rather than to fit parameters that are then renamed as predictions. No self-definitional loops, fitted-input predictions, or load-bearing self-citations appear in the derivation chain; the central rate-matching result is obtained by applying the framework's hypotheses to the variational families, remaining independent of the target conclusions.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard properties of Besov spaces and wavelet expansions together with operator conditions on the PDE forward map; no free parameters or new invented entities are introduced in the abstract.

axioms (2)
  • domain assumption: Besov spaces B^α_pp admit random wavelet expansions with p-exponentially distributed coefficients that serve as priors
    Invoked when constructing the Besov prior via random wavelet expansions (abstract).
  • domain assumption: The PDE forward operator satisfies general conditions that allow the refined prior-mass-and-testing framework to bound variational approximation error
    Stated as the key hypothesis under which the convergence rates are derived (abstract).


