
arxiv: 2605.19610 · v1 · pith:5DV3262Nnew · submitted 2026-05-19 · 📊 stat.ML · cs.LG

Posterior Contraction of Lévy Adaptive B-spline Regression in Besov Spaces

Pith reviewed 2026-05-20 02:10 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords posterior contraction · Besov spaces · Bayesian nonparametric regression · adaptive B-splines · Lévy process · LARK model · nonparametric estimation

The pith

The LABS posterior contracts around true functions in Besov spaces at nearly minimax-optimal rates while adapting to unknown smoothness.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the Lévy Adaptive B-spline (LABS) regression posterior contracts around the true regression function when that function belongs to a Besov class. The contraction rates match the minimax-optimal rates up to a logarithmic factor, and the procedure adapts automatically, without knowing the smoothness level in advance. A reader would care because the result supplies missing theoretical support for a flexible Bayesian nonparametric model that can capture irregular and locally varying features. The setting is univariate random design with Gaussian noise. Simulations on standard test functions illustrate that the method performs well in practice.

Core claim

Within the nonparametric regression model with univariate random design and Gaussian errors, the LABS posterior, formed by placing a Lévy process prior on the number of B-spline terms and their knot locations and degrees, contracts around the true function at rates that match the minimax rate up to a logarithmic factor whenever the true function lies in a Besov space, and the contraction holds simultaneously for all smoothness indices in a given range.
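
For orientation, a "nearly minimax" contraction statement of this kind usually takes the form below. This is a sketch under stated assumptions rather than the paper's theorem: the distance d, the constant M, and the logarithmic exponent κ are placeholders not taken from the paper, and n^{-s/(2s+1)} is the classical univariate minimax rate for smoothness s.

```latex
\[
  \Pi\bigl( f : d(f, f_0) > M \varepsilon_n \,\big|\, (X_1, Y_1), \dots, (X_n, Y_n) \bigr)
  \;\longrightarrow\; 0
  \quad \text{in } P_{f_0}\text{-probability},
  \qquad
  \varepsilon_n = n^{-s/(2s+1)} (\log n)^{\kappa}.
\]
```

Adaptation means the prior is built without reference to s, yet the display holds for every s in the admissible range.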

What carries the argument

The Lévy Adaptive B-spline (LABS) model, which embeds B-splines of varying degrees with independently chosen knots into the Lévy Adaptive Regression Kernel framework.
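
To make the construction concrete, here is a minimal sketch of a random function built as a finite sum of B-splines with varying degrees and independently drawn knots. It is an illustration only: the Poisson number of terms, the uniform knots and degrees, the Gaussian coefficients, and the names `bspline`, `draw_terms`, and `evaluate` are assumptions made for this example, not the prior specified in the paper.

```python
import math
import random


def bspline(x, knots):
    """B-spline of degree len(knots) - 2 on the given knots (Cox-de Boor recursion)."""
    if len(knots) == 2:
        # Degree 0: indicator of the half-open interval [knots[0], knots[1]).
        return 1.0 if knots[0] <= x < knots[1] else 0.0
    left_width = knots[-2] - knots[0]
    right_width = knots[-1] - knots[1]
    value = 0.0
    if left_width > 0:
        value += (x - knots[0]) / left_width * bspline(x, knots[:-1])
    if right_width > 0:
        value += (knots[-1] - x) / right_width * bspline(x, knots[1:])
    return value


def poisson(rng, mean):
    """Poisson draw by Knuth's multiplication method (fine for small means)."""
    threshold, count, product = math.exp(-mean), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def draw_terms(rng, mean_terms=5.0, max_degree=3):
    """Draw (coefficient, knots) pairs; every distribution here is an assumption."""
    terms = []
    for _ in range(poisson(rng, mean_terms)):
        degree = rng.randint(0, max_degree)          # each term gets its own degree
        knots = sorted(rng.random() for _ in range(degree + 2))  # and its own knots
        terms.append((rng.gauss(0.0, 1.0), knots))
    return terms


def evaluate(terms, x):
    """Value of the random function: sum of coefficient times B-spline."""
    return sum(beta * bspline(x, knots) for beta, knots in terms)


rng = random.Random(0)
terms = draw_terms(rng)
curve = [evaluate(terms, i / 100) for i in range(101)]
```

Because each term carries its own knots, low-degree terms can place a jump or kink locally while higher-degree terms stay smooth elsewhere, which is the flexibility the summary attributes to LABS.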

If this is right

  • The method can be applied to estimate functions whose smoothness is unknown in advance while still attaining near-optimal rates.
  • The same contraction holds for functions exhibiting irregular or locally structured behavior that Besov spaces can capture.
  • The theoretical guarantee fills the previous gap for posterior contraction of LARK-type models in Besov spaces.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar Lévy-process priors on spline knots could be tested for adaptation in other function classes such as Sobolev or Hölder spaces with comparable rates.
  • The univariate result suggests examining whether the LABS construction extends to multivariate random designs while preserving automatic adaptation.
  • If the logarithmic factor can be removed by a modest change in the prior, the procedure would achieve exact minimax rates.

Load-bearing premise

The Lévy process prior on spline degrees and knot locations must satisfy the specific technical conditions required to prove contraction in Besov spaces under the univariate random-design Gaussian-error model.

What would settle it

A simulation or analytic calculation that exhibits a posterior contraction rate strictly slower than the claimed nearly minimax rate for some function known to lie in a Besov space with fixed smoothness would refute the main result.
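
One rough way to run such a check numerically is to compare the empirical decay of the estimation error with the exponent s/(2s+1). The sketch below uses the sample sizes from the paper's simulations, but the smoothness value and the error values are hypothetical stand-ins generated from the rate itself. A finite-sample slope can only hint at a problem; it cannot by itself refute an asymptotic result.

```python
import math


def minimax_exponent(s):
    """Exponent of the classical univariate minimax rate n^(-s / (2s + 1))."""
    return s / (2.0 * s + 1.0)


def loglog_slope(ns, errors):
    """Least-squares slope of log(error) against log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errors]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


ns = [128, 1024, 8192]   # sample sizes used in the paper's simulations
s = 0.5                  # hypothetical smoothness, chosen only for illustration
# Stand-in errors on the root-MSE scale; real values would come from simulation runs.
errors = [n ** -minimax_exponent(s) for n in ns]
slope = loglog_slope(ns, errors)
```

With real simulation output in place of `errors`, a slope clearly shallower than `-minimax_exponent(s)`, beyond what a logarithmic factor could explain, would be the warning sign described above. If errors are recorded as MSE rather than root MSE, the reference slope doubles.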

Figures

Figures reproduced from arXiv: 2605.19610 by Jaeyong Lee, Jeunghun Oh, Sewon Park.

Figure 1: Four test functions from Donoho and Johnstone (1994) used in the simulation.
  2. Bumps: f_2(x) = Σ_j h_j K((x − t_j)/w_j), K(t) = (1 + |t|)^{−4},
     (t_j) = (0.1, 0.13, 0.15, 0.23, 0.25, 0.40, 0.44, 0.65, 0.76, 0.78, 0.81),
     (h_j) = (4, 5, 3, 4, 5, 4.2, 2.1, 4.3, 3.1, 2.1, 4.2),
     (w_j) = (0.005, 0.005, 0.006, 0.01, 0.01, 0.03, 0.01, 0.01, 0.005, 0.008, 0.005).
  3. HeaviSine: f_3(x) = 4 sin(4πx) − sgn(x − 0.3) − sgn(0.7…
Figure 2: MSE boxplots for n = 128 based on 100 simulation replicates.
Figure 3: MSE boxplots for n = 1024 based on 100 simulation replicates.
… shows that for n = 128, LABS achieves the lowest log MSE across all test functions and RSNR levels. For n = 1024 and n = 8192, as shown in …
Figure 4: MSE boxplots for n = 8192 based on 100 simulation replicates.
Original abstract

We investigate the asymptotic properties of the Lévy Adaptive B-spline (LABS) regression model, a Bayesian nonparametric method that incorporates B-spline kernels into the Lévy Adaptive Regression Kernel (LARK) model. LABS applies splines of varying degrees with independently defined knots, yielding a flexible model class capable of adapting to irregular and locally structured features of the true function. Within the nonparametric regression framework with univariate random design and Gaussian errors, we establish that the LABS posterior contracts around the true function in Besov classes at nearly minimax-optimal rates, up to a logarithmic factor, while adapting automatically to unknown smoothness. This study contributes to filling a gap in the literature, where theoretical results on posterior contraction of the LARK model in Besov spaces remain scarce. Simulation experiments on standard test functions in Besov spaces, including Blocks, Bumps, HeaviSine, and Doppler, complement the theoretical results and demonstrate the practical utility of LABS.
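
The Bumps test function named in the abstract can be sketched directly from the definition quoted in the Figure 1 caption. The constants below are copied from that caption. The sampling helper `simulate`, its uniform design on [0, 1], and the noise level `sigma` are assumptions added for illustration; the paper's own noise calibration (its RSNR levels) is not reproduced here.

```python
import random

# Constants as quoted in the Figure 1 caption.
T = (0.1, 0.13, 0.15, 0.23, 0.25, 0.40, 0.44, 0.65, 0.76, 0.78, 0.81)
H = (4, 5, 3, 4, 5, 4.2, 2.1, 4.3, 3.1, 2.1, 4.2)
W = (0.005, 0.005, 0.006, 0.01, 0.01, 0.03, 0.01, 0.01, 0.005, 0.008, 0.005)


def kernel(t):
    """K(t) = (1 + |t|)^(-4)."""
    return (1.0 + abs(t)) ** -4


def bumps(x):
    """f_2(x) = sum_j h_j * K((x - t_j) / w_j)."""
    return sum(h * kernel((x - t) / w) for t, h, w in zip(T, H, W))


def simulate(n, sigma, rng):
    """Random-design sample: x uniform on [0, 1] (an assumption), y = f_2(x) + Gaussian noise."""
    sample = []
    for _ in range(n):
        x = rng.random()
        sample.append((x, bumps(x) + rng.gauss(0.0, sigma)))
    return sample


sample = simulate(128, sigma=0.5, rng=random.Random(1))
```

Each peak sits at a location t_j with height h_j and width w_j, so the function is sharply localized in a way that fixed-knot smoothers tend to blur.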

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript establishes that the posterior in the Lévy Adaptive B-spline (LABS) regression model contracts around the true regression function in Besov spaces at nearly minimax-optimal rates (up to a logarithmic factor) while automatically adapting to unknown smoothness. The setting is univariate nonparametric regression with random design and Gaussian errors; the prior places a Lévy process on spline degrees and knot locations within the LARK framework. Theoretical results are complemented by simulations on standard test functions (Blocks, Bumps, HeaviSine, Doppler).

Significance. If the central claims hold, the work is significant because it supplies the first posterior-contraction guarantees for the LABS model in Besov spaces, addressing a documented gap in the LARK literature. The automatic adaptation to smoothness and the nearly optimal rates (up to a logarithmic factor) are competitive with other adaptive Bayesian nonparametric methods. The simulation study on irregular test functions provides useful empirical corroboration.

major comments (2)
  1. [Theorem 3.1] Theorem 3.1 (and the surrounding discussion in §3.2): the nearly-minimax contraction rate is obtained by appealing to a general Ghosal–van der Vaart-type theorem whose two load-bearing conditions—sufficient prior mass on Kullback–Leibler neighborhoods of f0 in the Besov norm and entropy bounds on a suitable sieve—are only sketched. The manuscript invokes earlier LARK results without re-verifying that the specific Lévy intensity, B-spline kernel, and random-design measure continue to satisfy the required lower bounds on prior mass for every smoothness level s; this verification is central to the adaptation claim.
  2. [§3.3] §3.3, entropy bound for the sieve: the growth of the covering numbers induced by the random degree and knot locations must remain o(n ε_n²) uniformly over Besov balls of varying smoothness. The current argument does not explicitly control the effective dimension as a function of the Lévy intensity parameters when s is unknown, which is required to close the proof for the adaptive rate.
minor comments (2)
  1. [Introduction] The introduction would benefit from a concise table or paragraph contrasting the LABS rates with those already available for other adaptive spline or wavelet priors in Besov spaces.
  2. [Simulations] Figure captions for the simulation study should explicitly label the smoothness parameters and the competing methods so that the visual comparison is self-contained.
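
For reference, the conditions at issue in the two major comments have the following standard shape in Ghosal–van der Vaart-type theorems. This is a generic sketch: the sieve F_n, the metric d, the Kullback–Leibler-type neighborhood B_n, and the constants c_1 and c_2 are notation assumed here, not the paper's exact statement.

```latex
\begin{align*}
  \log N(\varepsilon_n, \mathcal{F}_n, d) &\le c_1\, n \varepsilon_n^2
    && \text{(entropy of the sieve)}, \\
  \Pi(\mathcal{F} \setminus \mathcal{F}_n) &\le \exp\bigl(-(c_2 + 4)\, n \varepsilon_n^2\bigr)
    && \text{(negligible prior mass outside the sieve)}, \\
  \Pi\bigl(B_n(f_0, \varepsilon_n)\bigr) &\ge \exp\bigl(-c_2\, n \varepsilon_n^2\bigr)
    && \text{(prior mass near the truth)}.
\end{align*}
```

For an adaptive result, all three must hold with the rate ε_n attached to each admissible smoothness level while the prior itself stays free of s.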

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback on our manuscript. The comments highlight areas where additional detail would strengthen the presentation of the theoretical results. We have revised the manuscript to address these points explicitly while preserving the original contributions.

Point-by-point responses
  1. Referee: [Theorem 3.1] Theorem 3.1 (and the surrounding discussion in §3.2): the nearly-minimax contraction rate is obtained by appealing to a general Ghosal–van der Vaart-type theorem whose two load-bearing conditions—sufficient prior mass on Kullback–Leibler neighborhoods of f0 in the Besov norm and entropy bounds on a suitable sieve—are only sketched. The manuscript invokes earlier LARK results without re-verifying that the specific Lévy intensity, B-spline kernel, and random-design measure continue to satisfy the required lower bounds on prior mass for every smoothness level s; this verification is central to the adaptation claim.

    Authors: We agree that explicit verification of the prior-mass condition for the specific LABS components is necessary to rigorously support automatic adaptation across all smoothness levels. In the revised manuscript we have added a dedicated appendix subsection that re-derives the lower bound on prior mass in Kullback–Leibler neighborhoods directly for the Lévy intensity, B-spline kernel, and random-design measure. The argument shows that the intensity parameters can be chosen independently of s so that the required mass holds uniformly over the Besov ball, thereby justifying the appeal to the general Ghosal–van der Vaart theorem for the adaptive rate.
    revision: yes

  2. Referee: [§3.3] §3.3, entropy bound for the sieve: the growth of the covering numbers induced by the random degree and knot locations must remain o(n ε_n²) uniformly over Besov balls of varying smoothness. The current argument does not explicitly control the effective dimension as a function of the Lévy intensity parameters when s is unknown, which is required to close the proof for the adaptive rate.

    Authors: We appreciate the referee’s observation that the entropy control must be uniform in the unknown smoothness. In the revised §3.3 we now explicitly bound the effective dimension of the random-degree-and-knot sieve in terms of the Lévy intensity parameters. These bounds are shown to be independent of s within the admissible range, ensuring that the metric entropy remains o(n ε_n²) uniformly over the Besov balls. The updated argument therefore closes the proof of the nearly minimax adaptive contraction rate.
    revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation applies external general theorems to specific prior.

full rationale

The paper claims posterior contraction for the LABS model in Besov spaces by verifying the standard conditions (prior mass on KL neighborhoods and entropy bounds) required by general Bayesian nonparametric contraction theorems from the literature. These verifications are performed for the Lévy-driven B-spline prior under the given random-design Gaussian model and constitute independent technical work rather than any self-definition, fitted-input renaming, or reduction of the target rate to quantities defined by the result itself. No load-bearing step equates the claimed nearly minimax rate (up to a logarithmic factor) to an input by construction, and any references to prior LARK work serve as background rather than as an unverified self-citation chain that forces the conclusion.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard domain assumptions for nonparametric regression and Besov-space function classes; no free parameters or new invented entities are introduced.

axioms (1)
  • domain assumption: Univariate random design with Gaussian errors and the Lévy-process prior construction on B-splines satisfy the conditions needed for posterior contraction in Besov spaces.
    Invoked throughout the asymptotic analysis described in the abstract.

pith-pipeline@v0.9.0 · 5707 in / 1173 out tokens · 35488 ms · 2026-05-20T02:10:46.757451+00:00 · methodology


Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 2 internal anchors

  1. Belitser, E., and Serra, P. (2014), 'Adaptive Priors Based on Splines with Random Knots', Bayesian Analysis, 9, 859–882
  2. Biller, C. (2000), 'Adaptive Bayesian Regression Splines in Semiparametric Generalized Linear Models', Journal of Computational and Graphical Statistics, 9, 122–140
  3. Chipman, H.A., George, E.I., and McCulloch, R.E. (2010), 'BART: Bayesian Additive Regression Trees', The Annals of Applied Statistics, 4, 266–298
  4. Cleveland, W.S. (1979), 'Robust Locally Weighted Regression and Smoothing Scatterplots', Journal of the American Statistical Association, 74, 829–836
  5. de Boor, C. (1978), A Practical Guide to Splines, New York: Springer
  6. de Jonge, R., and van Zanten, J.H. (2012), 'Adaptive Estimation of Multivariate Functions Using Conditionally Gaussian Tensor-Product Spline Priors', Electronic Journal of Statistics, 6, 1984–2001
  7. Denison, D.G.T., Mallick, B.K., and Smith, A.F.M. (1998), 'Bayesian MARS', Statistics and Computing, 8, 337–346
  8. DiMatteo, I., Genovese, C.R., and Kass, R.E. (2001), 'Bayesian Curve-Fitting with Free-Knot Splines', Biometrika, 88, 1055–1071
  9. Donoho, D.L., and Johnstone, I.M. (1994), 'Ideal Spatial Adaptation by Wavelet Shrinkage', Biometrika, 81, 425–455
  10. Donoho, D.L., and Johnstone, I.M. (1998), 'Minimax Estimation via Wavelet Shrinkage', The Annals of Statistics, 26, 879–921
  11. Ghosal, S., Ghosh, J.K., and van der Vaart, A.W. (2000), 'Convergence Rates of Posterior Distributions', The Annals of Statistics, 28, 500–531
  12. Ghosal, S., and van der Vaart, A. (2007), 'Convergence Rates of Posterior Distributions for Non-IID Observations', The Annals of Statistics, 35, 192–223
  13. Ghosal, S., and van der Vaart, A. (2017), Fundamentals of Nonparametric Bayesian Inference, Cambridge: Cambridge University Press
  14. Giné, E., and Nickl, R. (2021), Mathematical Foundations of Infinite-Dimensional Statistical Models, Cambridge: Cambridge University Press
  15. Green, P.J. (1995), 'Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination', Biometrika, 82, 711–732
  16. Huang, T.-M. (2004), 'Convergence Rates for Posterior Distributions and Adaptive Estimation', The Annals of Statistics, 32, 1556–1593
  17. Johnstone, I.M., and Silverman, B.W. (2005), 'Empirical Bayes Selection of Wavelet Thresholds', The Annals of Statistics, 33, 1700–1752
  18. Karatzoglou, A., Smola, A., and Hornik, K. (2024), kernlab: Kernel-Based Machine Learning Lab (R package version 0.9-33). Available at https://CRAN.R-project.org/package=kernlab
  19. Lee, K., and Lee, J. (2022), 'Asymptotic Properties for Bayesian Neural Network in Besov Space', in Advances in Neural Information Processing Systems, 35, pp. 5641–5653
  20. Lindstrom, M.J. (2002), 'Bayesian Estimation of Free-Knot Splines Using Reversible Jumps', Computational Statistics & Data Analysis, 41, 255–269
  21. Liu, Z., Wang, Y., Vaidya, S., Ruehle, F., Halverson, J., Soljačić, M., Hou, T.Y., and Tegmark, M. (2025), 'KAN: Kolmogorov-Arnold Networks', arXiv preprint arXiv:2404.19756
  22. Müller, P., Erkanli, A., and West, M. (1996), 'Bayesian Curve Fitting Using Multivariate Normal Mixtures', Biometrika, 83, 67–79
  23. Park, S., Oh, H.-S., and Lee, J. (2023), 'Lévy Adaptive B-Spline Regression via Overcomplete Systems', Statistica Sinica, 33, 2715–2737
  24. R Core Team (2025), R: A Language and Environment for Statistical Computing, Vienna, Austria: R Foundation for Statistical Computing. Available at https://www.R-project.org/
  25. Schumaker, L. (2007), Spline Functions: Basic Theory (3rd ed.), Cambridge: Cambridge University Press
  26. Shen, W., and Ghosal, S. (2015), 'Adaptive Bayesian Procedures Using Random Series Priors', Scandinavian Journal of Statistics, 42, 1194–1213
  27. Johnstone, I., and Silverman, B.W. (2005), 'EbayesThresh: R Programs for Empirical Bayes Thresholding', Journal of Statistical Software, 12, 1–38
  28. Suzuki, T. (2018), 'Adaptivity of Deep ReLU Network for Learning in Besov and Mixed Smooth Besov Spaces: Optimal Rate and Curse of Dimensionality', arXiv preprint arXiv:1810.08033
  29. Tsybakov, A.B. (2003), Introduction à l'estimation non paramétrique, Berlin: Springer
  30. Wang, X. (2020), fANCOVA: Nonparametric Analysis of Covariance (R package version 0.6-1). Available at https://CRAN.R-project.org/package=fANCOVA
  31. Williams, C.K.I., and Rasmussen, C.E. (2006), Gaussian Processes for Machine Learning, Cambridge, MA: MIT Press
  32. Wolpert, R.L., Clyde, M.A., and Tu, C. (2011), 'Stochastic Expansions Using Continuous Dictionaries: Lévy Adaptive Regression Kernels', The Annals of Statistics, 39, 1916–1962
  33. Xie, F., and Xu, Y. (2020), 'Adaptive Bayesian Nonparametric Regression Using a Kernel Mixture of Polynomials with Application to Partial Linear Models', Bayesian Analysis, 15, 159–186