Consistency of variational inference for Besov priors in non-linear inverse problems
Pith reviewed 2026-05-19 00:57 UTC · model grok-4.3
The pith
Variational posteriors with Besov priors match exact posterior convergence rates in nonlinear PDE inverse problems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under general conditions on the PDE operator, variational posteriors constructed with Besov priors achieve convergence rates matching those of the exact posterior. These rates are minimax-optimal over the Besov classes B^α_pp with p ≥ 1 and outperform the suboptimal rates obtained with Gaussian priors by a polynomial factor. The results hold for widely used variational families and extend to prediction loss for PDE-constrained regression problems, as verified on the Darcy flow and inverse potential examples.
What carries the argument
The refined prior-mass-and-testing framework that controls the variational approximation error while preserving the contraction rate of the exact posterior.
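For orientation, the generic shape of a prior-mass-and-testing bound for variational posteriors can be sketched as follows. This is the standard formulation from the variational-Bayes literature (for example Zhang and Gao, 2020), not the paper's refined statement, which this summary does not reproduce. The symbols $\mathcal{Q}$ (variational family), $\Pi(\cdot \mid D_N)$ (exact posterior), $\varepsilon_N$ (exact-posterior rate), $\gamma_N$ (approximation error), and $L$ (loss) are notation assumed here for illustration.

```latex
% Variational posterior: closest element of the family \mathcal{Q} to the exact posterior
\widehat{Q} \in \operatorname*{arg\,min}_{Q \in \mathcal{Q}}
  \mathrm{KL}\bigl(Q \,\|\, \Pi(\cdot \mid D_N)\bigr)

% Generic bound: exact-posterior rate plus variational approximation error
\mathbb{E}_{\theta_0}\, \widehat{Q}\, L(\theta, \theta_0)
  \;\lesssim\; N \bigl(\varepsilon_N^2 + \gamma_N^2\bigr),
\qquad
\gamma_N^2 = \frac{1}{N} \inf_{Q \in \mathcal{Q}}
  \mathbb{E}_{\theta_0}\, \mathrm{KL}\bigl(Q \,\|\, \Pi(\cdot \mid D_N)\bigr)
```

In this form, rate matching amounts to showing that $\gamma_N \lesssim \varepsilon_N$ for the chosen variational family.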
If this is right
- Variational posteriors attain minimax-optimal rates over Besov classes B^α_pp.
- Prediction-loss rates for PDE-constrained regression problems are also minimax optimal.
- Besov priors improve on Gaussian priors by a polynomial factor in the same inverse-problem setting.
- The matching rates hold for Besov-type and mean-field variational families under the stated operator conditions.
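To make "mean-field family" and "variational approximation error" concrete, here is a minimal toy sketch. It assumes a two-dimensional Gaussian standing in for the exact posterior, which is far simpler than the paper's infinite-dimensional setting. For a Gaussian target, the factorized Gaussian minimizing KL(Q || target) keeps the target mean and uses variances equal to the reciprocal diagonal of the target precision matrix. The function names are hypothetical.

```python
import numpy as np

def kl_gaussians(m_q, cov_q, m_p, cov_p):
    """KL( N(m_q, cov_q) || N(m_p, cov_p) ) for multivariate Gaussians."""
    d = m_p.shape[0]
    prec_p = np.linalg.inv(cov_p)
    diff = m_p - m_q
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(prec_p @ cov_q) + diff @ prec_p @ diff
                  - d + logdet_p - logdet_q)

def mean_field_gaussian(m_p, cov_p):
    """Best factorized Gaussian in KL(Q || P): same mean, variances 1 / precision_ii."""
    prec_p = np.linalg.inv(cov_p)
    return m_p.copy(), np.diag(1.0 / np.diag(prec_p))

# Toy "exact posterior" with strong correlation (illustrative values only).
target_mean = np.array([1.0, -0.5])
target_cov = np.array([[1.0, 0.8], [0.8, 1.0]])

mf_mean, mf_cov = mean_field_gaussian(target_mean, target_cov)
approx_error = kl_gaussians(mf_mean, mf_cov, target_mean, target_cov)
print("mean-field variances:", np.diag(mf_cov))
print("KL approximation error:", approx_error)
```

The residual KL divergence is the price of ignoring correlation. The rate-matching claim above says that, under the paper's conditions, the analogous error term does not slow the contraction rate.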
Where Pith is reading between the lines
- The same framework could be tested on other nonlinear operators that satisfy similar general conditions but arise in different application domains.
- Efficient numerical implementations of the variational optimization step might now be benchmarked directly against exact posterior sampling on the Darcy flow example.
- The polynomial improvement over Gaussian priors suggests examining whether other wavelet-based or sparsity-promoting priors yield comparable gains in related inverse problems.
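As a rough illustration of what a "polynomial factor" between rates means, the sketch below uses the classical direct-observation benchmark (Donoho and Johnstone), in which the minimax L2 rate over Besov balls is n^(-α/(2α+d)) while linear procedures are limited to n^(-α'/(2α'+d)) with α' = α - d/p + d/2 when p < 2. This benchmark is an assumption chosen for illustration. The exponents for the paper's inverse problems are not given in this summary and will generally differ. The function names are hypothetical.

```python
def minimax_exponent(alpha, d):
    """Exponent r in the rate n**(-r) for the classical minimax benchmark."""
    return alpha / (2.0 * alpha + d)

def linear_exponent(alpha, p, d):
    """Exponent attainable by linear procedures; equals the minimax one when p >= 2."""
    alpha_lin = alpha - d / p + d / 2.0 if p < 2.0 else alpha
    return alpha_lin / (2.0 * alpha_lin + d)

def polynomial_gap(alpha, p, d):
    """Difference of exponents; the rates differ by the factor n**gap."""
    return minimax_exponent(alpha, d) - linear_exponent(alpha, p, d)

# Illustrative values: smoothness 1, p = 1, dimension 1 (requires alpha > d / p - d / 2).
gap = polynomial_gap(alpha=1.0, p=1.0, d=1)
factor = (10 ** 6) ** gap  # how much slower the linear rate is at n = 10**6
print("exponent gap:", gap, "slowdown factor at n = 1e6:", factor)
```

The gap vanishes at p = 2 and grows as p decreases toward 1, which is the regime where sparsity-promoting priors are expected to help.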
Load-bearing premise
The PDE forward operator must satisfy the general conditions that let the prior-mass-and-testing framework bound the error from the variational approximation.
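One of the operator hypotheses mentioned later in the referee report is a local Lipschitz condition on the forward map. The sketch below shows how such a property could be probed numerically for a one-dimensional Darcy-type problem, -(exp(θ) u')' = f on (0, 1) with zero boundary values. The 1D setting, the finite-difference discretization, the log-conductivity parameterization, and all function names are assumptions made for illustration, and a numerical probe is no substitute for the analytic verification the referee asks for.

```python
import numpy as np

def darcy_forward_1d(theta_mid, f_interior):
    """Solve -(exp(theta) u')' = f on (0, 1), u(0) = u(1) = 0, by finite differences.

    theta_mid: log-conductivity at the n cell midpoints.
    f_interior: source term at the n - 1 interior nodes.
    """
    n = theta_mid.shape[0]
    h = 1.0 / n
    a = np.exp(theta_mid)
    main = (a[:-1] + a[1:]) / h**2   # a_{i-1/2} + a_{i+1/2} at interior node i
    off = -a[1:-1] / h**2            # coupling between neighbouring interior nodes
    stiffness = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.solve(stiffness, f_interior)

def empirical_lipschitz_ratio(theta, perturbation, f_interior):
    """Discrete L2 change in the solution divided by the sup-norm of the perturbation."""
    u0 = darcy_forward_1d(theta, f_interior)
    u1 = darcy_forward_1d(theta + perturbation, f_interior)
    h = 1.0 / theta.shape[0]
    return np.sqrt(h) * np.linalg.norm(u1 - u0) / np.max(np.abs(perturbation))

n = 64
x_mid = (np.arange(n) + 0.5) / n
theta = 0.5 * np.sin(2.0 * np.pi * x_mid)   # illustrative log-conductivity
direction = np.cos(2.0 * np.pi * x_mid)     # illustrative perturbation direction
source = np.ones(n - 1)
ratios = [empirical_lipschitz_ratio(theta, eps * direction, source)
          for eps in (1e-1, 1e-2, 1e-3)]
print("empirical Lipschitz ratios:", ratios)
```

Ratios that stay bounded as the perturbation shrinks are consistent with local Lipschitz behaviour near the chosen θ, but they only test one direction at one point.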
What would settle it
A concrete nonlinear PDE inverse problem that satisfies the paper's operator conditions, yet in which the variational posterior with a Besov prior contracts polynomially slower than the exact posterior.
Original abstract
This study investigates the variational posterior convergence rates of inverse problems for partial differential equations (PDEs) with parameters in Besov spaces $B_{pp}^\alpha$ ($p \geq 1$) which are modeled naturally in a Bayesian manner using Besov priors constructed via random wavelet expansions with $p$-exponentially distributed coefficients. Departing from exact Bayesian inference, variational inference transforms the inference problem into an optimization problem by introducing variational sets. Building on a refined ``prior mass and testing'' framework, we derive general conditions on PDE operators and guarantee that variational posteriors achieve convergence rates matching those of the exact posterior under widely adopted variational families (Besov-type measures or mean-field families). Moreover, our results achieve minimax-optimal rates over $B^{\alpha}_{pp}$ classes, significantly outperforming the suboptimal rates of Gaussian priors (by a polynomial factor). As specific examples, two typical nonlinear inverse problems, the Darcy flow problems and the inverse potential problem for a subdiffusion equation, are investigated to validate our theory. Besides, we show that our convergence rates of ``prediction'' loss for these ``PDE-constrained regression problems'' are minimax optimal.
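The prior construction described in the abstract can be sketched in a few lines. The example below draws a one-dimensional sample from a truncated random wavelet expansion with p-exponential coefficients (density proportional to exp(-|x|^p / p)). The Haar basis, the truncation level, and the level scaling 2^(-l(α + 1/2 - 1/p)) follow one common convention in the Besov-prior literature and are assumptions here; the paper may use a different basis, dimension, or normalization. All function names are hypothetical.

```python
import numpy as np

def sample_p_exponential(p, size, rng):
    """Draw i.i.d. samples with density proportional to exp(-|x|**p / p)."""
    # If G ~ Gamma(1/p, 1), then (p * G)**(1/p) has the required half-density.
    g = rng.gamma(shape=1.0 / p, scale=1.0, size=size)
    signs = rng.choice([-1.0, 1.0], size=size)
    return signs * (p * g) ** (1.0 / p)

def haar(level, k, x):
    """L2-normalized Haar wavelet psi_{level,k} evaluated at points x in [0, 1)."""
    y = (2.0 ** level) * x - k
    positive = ((y >= 0.0) & (y < 0.5)).astype(float)
    negative = ((y >= 0.5) & (y < 1.0)).astype(float)
    return (2.0 ** (level / 2.0)) * (positive - negative)

def sample_besov_prior(alpha, p, max_level, x, rng):
    """Truncated wavelet expansion with p-exponential coefficients (d = 1)."""
    u = np.zeros_like(x, dtype=float)
    for level in range(max_level):
        scale = 2.0 ** (-level * (alpha + 0.5 - 1.0 / p))  # assumed convention
        xi = sample_p_exponential(p, 2 ** level, rng)
        for k in range(2 ** level):
            u += scale * xi[k] * haar(level, k, x)
    return u

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 256, endpoint=False)
draw = sample_besov_prior(alpha=1.5, p=1.0, max_level=6, x=x, rng=rng)
print("sample range:", draw.min(), draw.max())
```

Setting p = 2 gives Gaussian coefficients, while p = 1 gives Laplace coefficients, whose heavier tails allow occasional large wavelet coefficients and hence spatially inhomogeneous draws.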
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that under general conditions on the PDE forward operators, variational posteriors with Besov priors achieve convergence rates in nonlinear inverse problems that match those of the exact Bayesian posterior and are minimax optimal over Besov classes B^α_pp. This is shown using a refined prior-mass-and-testing framework, with applications to Darcy flow and subdiffusion problems where prediction losses are also optimal. The variational families considered include Besov-type measures and mean-field families.
Significance. If the central claims hold, this paper would be significant for providing theoretical guarantees on variational inference in nonparametric Bayesian inverse problems with PDE constraints. It improves upon Gaussian prior results by a polynomial factor and extends to nonlinear settings. The strength lies in deriving general conditions on operators and demonstrating optimality for prediction loss in PDE-constrained regression.
major comments (2)
- [Applications to nonlinear inverse problems (Section 5)] The abstract and theory section assert that the Darcy flow problem and the inverse potential problem for the subdiffusion equation satisfy the general conditions on PDE operators required for the refined prior-mass-and-testing framework to bound the variational approximation error. However, the manuscript does not include an explicit verification that these specific nonlinear operators meet all the listed hypotheses, such as the local Lipschitz condition or the testing function requirements. This verification is essential because the rate-matching result for variational posteriors depends directly on these conditions holding for the examples.
- [Main theoretical result (Theorem 3.1)] The derivation shows that variational posteriors achieve the same rates as exact posteriors under the general conditions, but it would strengthen the paper to include a brief discussion or reference to how the minimax lower bounds are established or matched for the variational case specifically, to confirm the optimality claim is not just inherited but verified.
minor comments (2)
- The introduction could benefit from a clearer statement of the main contributions in a bulleted list to improve readability.
- [Notation section] Ensure consistent use of the parameter p in Besov spaces B^α_pp throughout the manuscript, particularly in the statements of rates.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments on our manuscript. We address each major comment point by point below, indicating the revisions we plan to incorporate.
Referee: [Applications to nonlinear inverse problems (Section 5)] The abstract and theory section assert that the Darcy flow problem and the inverse potential problem for the subdiffusion equation satisfy the general conditions on PDE operators required for the refined prior-mass-and-testing framework to bound the variational approximation error. However, the manuscript does not include an explicit verification that these specific nonlinear operators meet all the listed hypotheses, such as the local Lipschitz condition or the testing function requirements. This verification is essential because the rate-matching result for variational posteriors depends directly on these conditions holding for the examples.
Authors: We agree that an explicit verification would strengthen the presentation. Although Section 5 states that the Darcy flow and subdiffusion examples satisfy the general hypotheses of Section 3, we did not provide a line-by-line check. In the revised manuscript we will add a dedicated appendix (or subsection) that verifies each required condition, including the local Lipschitz property of the forward map, the existence of suitable testing functions, and the remaining technical assumptions, for both nonlinear PDE examples.
Revision: yes
Referee: [Main theoretical result (Theorem 3.1)] The derivation shows that variational posteriors achieve the same rates as exact posteriors under the general conditions, but it would strengthen the paper to include a brief discussion or reference to how the minimax lower bounds are established or matched for the variational case specifically, to confirm the optimality claim is not just inherited but verified.
Authors: We appreciate the suggestion. The minimax optimality follows because Theorem 3.1 shows that the variational posterior attains the same rate as the exact posterior, and the exact posterior is already known to achieve the minimax rate over B^α_pp (see the references cited in the introduction and Section 2). To make this explicit rather than implicit, we will insert a short remark immediately after Theorem 3.1 that recalls the relevant minimax lower-bound results from the literature and explains why the rate-matching upper bound for the variational posterior directly implies minimax optimality in the variational setting.
Revision: yes
Circularity Check
No circularity: derivation self-contained via abstract framework
The paper derives general conditions on PDE operators from a refined prior-mass-and-testing framework and shows that variational posteriors inherit exact-posterior convergence rates over Besov classes, with minimax optimality following as a consequence. Specific nonlinear examples (Darcy flow, subdiffusion) are presented to validate the abstract transfer rather than to fit parameters that are then renamed as predictions. No self-definitional loops, fitted-input predictions, or load-bearing self-citations appear in the derivation chain; the central rate-matching result is obtained by applying the framework's hypotheses to the variational families, remaining independent of the target conclusions.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Besov spaces B^α_pp admit random wavelet expansions with p-exponentially distributed coefficients that serve as priors.
- Domain assumption: The PDE forward operator satisfies general conditions that allow the refined prior-mass-and-testing framework to bound the variational approximation error.
Author affiliation (from the paper): School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, 710049, China. Email address: incredit1@stu.xjtu.edu.cn