Laplace Approximations for Mixed-Effects and Gaussian Process Quantile Regression
Pith reviewed 2026-05-21 01:47 UTC · model grok-4.3
The pith
Laplace approximations for quantile regression use Fisher information or expected loss curvature instead of the observed Hessian.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The obstacle of a vanishing observed Hessian in Laplace approximations for the asymmetric Laplace likelihood can be overcome without smoothing by using the Fisher information for correctly specified models and the population curvature of the expected loss under misspecification. This basis allows development of a Laplace approximation framework for quantile regression in mixed-effects and Gaussian process models, with practical curvature estimators such as the triangular kernel curvature estimator that are asymptotically valid.
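For intuition, the scalar location case makes the two curvatures explicit. The notation below (check loss $\rho_\tau$, scale $\sigma$, and the density $f$ and distribution function $F$ of $Y$) is assumed here for illustration and is not taken from the paper.

```latex
\begin{align*}
\rho_\tau(u) &= u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr), \qquad
R(\theta) = \mathbb{E}\,\rho_\tau(Y - \theta), \\
R'(\theta) &= F(\theta) - \tau, \qquad R''(\theta) = f(\theta), \\
p(y \mid \mu, \sigma) &= \frac{\tau(1-\tau)}{\sigma}
  \exp\!\Bigl\{-\rho_\tau\!\Bigl(\frac{y-\mu}{\sigma}\Bigr)\Bigr\}, \qquad
\mathcal{I}(\mu) = \mathbb{E}\Bigl[\bigl(\partial_\mu \log p(Y \mid \mu, \sigma)\bigr)^2\Bigr]
  = \frac{\tau(1-\tau)}{\sigma^2}.
\end{align*}
```

Under correct specification the two quantities agree in this toy case: the curvature of the scaled expected loss $\mathbb{E}\,\rho_\tau((Y-\theta)/\sigma)$ at $\theta = \mu$ is $f(\mu)/\sigma = \tau(1-\tau)/\sigma^2$, which equals the Fisher information, whereas the second derivative of the sample loss is zero wherever it exists.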
What carries the argument
Replacement of the observed Hessian with the curvature from the Fisher information or the expected loss in the quadratic expansion for the Laplace approximation.
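A generic way to write this substitution for a latent Gaussian model is sketched below. The symbols are assumptions made for illustration rather than the paper's notation: $b \in \mathbb{R}^m$ is the latent vector with prior $\mathcal{N}(0, \Sigma)$, $\eta_i(b)$ is the predictor for observation $i$, $L(b)$ is the summed check loss, and $W$ is the $m \times m$ replacement curvature in $b$ (from the Fisher information or from the Hessian of the expected loss).

```latex
\begin{align*}
\hat b &= \arg\min_b \Bigl\{ L(b) + \tfrac12\, b^\top \Sigma^{-1} b \Bigr\}, \qquad
L(b) = \sum_{i=1}^n \rho_\tau\!\Bigl(\frac{y_i - \eta_i(b)}{\sigma}\Bigr), \\
p(b \mid y) &\approx \mathcal{N}\bigl(\hat b,\ (W + \Sigma^{-1})^{-1}\bigr), \\
\log p(y) &\approx \log p(y \mid \hat b) - \tfrac12\, \hat b^\top \Sigma^{-1} \hat b
  - \tfrac12 \log\det\bigl(I + \Sigma W\bigr).
\end{align*}
```

This is the standard Laplace form with $\nabla^2 L(\hat b)$, which is zero almost everywhere for the check loss, swapped for $W$. With the observed Hessian the covariance would collapse to the prior covariance $\Sigma$, which is why the substitution matters.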
If this is right
- The methods enable scalable and numerically stable posterior inference for latent Gaussian quantile regression models.
- Approximations achieve accuracy comparable to MCMC at lower computational cost.
- Marginal likelihoods can be estimated for model selection in these settings.
- The framework applies to both correctly specified and misspecified models.
Where Pith is reading between the lines
- This could allow Laplace methods for other non-smooth or non-differentiable loss functions in Bayesian generalized linear models.
- Connections to robust statistics suggest similar curvature-based justifications might apply to other M-estimators.
- Future work could test the approach in high-dimensional Gaussian process quantile regression for spatial data.
Load-bearing premise
Local quadratic behavior of the expected loss or Fisher information provides a valid curvature for the Laplace approximation despite the observed Hessian vanishing almost everywhere.
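The premise can be illustrated numerically in a toy setting. The sketch below assumes a scalar location parameter and standard normal data, neither of which comes from the paper, and uses only the Python standard library. It contrasts the observed curvature of the sample check loss with the curvature of the expected loss, for which a closed form is available under the normal assumption.

```python
import math
import random


def check_loss_derivative(theta, ys, tau):
    """Derivative in theta of sum_i rho_tau(y_i - theta), valid when theta is not a data point."""
    return sum((1.0 if y < theta else 0.0) - tau for y in ys)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def expected_check_loss(theta, tau):
    """Closed form of E[rho_tau(Y - theta)] for Y ~ N(0, 1) (toy assumption)."""
    return normal_pdf(theta) + theta * (normal_cdf(theta) - tau)


random.seed(0)
tau, theta, n = 0.5, 0.3, 1000
ys = [random.gauss(0.0, 1.0) for _ in range(n)]

# Observed curvature: the derivative is piecewise constant, so its change
# across a tiny interval that contains no data point is exactly zero.
eps = 1e-9
observed_curvature = (
    check_loss_derivative(theta + eps, ys, tau)
    - check_loss_derivative(theta - eps, ys, tau)
) / (2.0 * eps)

# Expected curvature: central second difference of the population loss,
# which should be close to the density of Y at theta.
d = 1e-3
expected_curvature = (
    expected_check_loss(theta + d, tau)
    - 2.0 * expected_check_loss(theta, tau)
    + expected_check_loss(theta - d, tau)
) / d**2

print(observed_curvature, expected_curvature, normal_pdf(theta))
```

The observed curvature carries no information about the spread of the posterior, while the expected curvature recovers the density at `theta`, which is the quantity the premise relies on.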
What would settle it
Compare the approximate posterior means and variances from the proposed method against those obtained from long-run MCMC in a simulated mixed-effects quantile regression dataset with known parameters.
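A much-reduced version of such a check can be sketched as follows. Everything here is a simplifying assumption rather than the paper's experiment: a scalar location model instead of a mixed-effects model, a flat prior, known scale, data drawn from the asymmetric Laplace distribution itself (correct specification), and brute-force grid evaluation of the posterior standing in for long-run MCMC.

```python
import math
import random


def check_loss(u, tau):
    return u * (tau - (1.0 if u < 0 else 0.0))


def sample_asymmetric_laplace(rng, mu, sigma, tau):
    # Below mu with probability tau (exponential rate (1 - tau) / sigma),
    # above mu otherwise (exponential rate tau / sigma).
    if rng.random() < tau:
        return mu - rng.expovariate((1.0 - tau) / sigma)
    return mu + rng.expovariate(tau / sigma)


rng = random.Random(1)
tau, sigma, mu_true, n = 0.3, 1.0, 0.0, 2000
ys = sorted(sample_asymmetric_laplace(rng, mu_true, sigma, tau) for _ in range(n))

# Laplace approximation with Fisher-information curvature tau(1 - tau) / sigma^2.
mu_hat = ys[int(tau * n)]  # a sample tau-quantile, which minimises the summed check loss
var_laplace = sigma**2 / (n * tau * (1.0 - tau))

# Reference posterior on a grid of 801 points spanning +/- 10 approximate sds.
half_width = 10.0 * math.sqrt(var_laplace)
grid = [mu_hat - half_width + 2.0 * half_width * k / 800 for k in range(801)]
log_post = [-sum(check_loss((y - m) / sigma, tau) for y in ys) for m in grid]
top = max(log_post)
weights = [math.exp(lp - top) for lp in log_post]
total = sum(weights)
mean_grid = sum(w * m for w, m in zip(weights, grid)) / total
var_grid = sum(w * (m - mean_grid) ** 2 for w, m in zip(weights, grid)) / total

print("mean:", mu_hat, "vs", mean_grid)
print("variance:", var_laplace, "vs", var_grid)
```

Agreement in this toy case says nothing about the mixed-effects or Gaussian process settings, where the latent dimension grows with the data; it only shows the shape of the comparison.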
Original abstract
Laplace approximations are a standard tool for computationally efficient inference in latent Gaussian models, but they fail for quantile regression with the asymmetric Laplace likelihood because the observed Hessian vanishes almost everywhere. We show that this obstacle can be overcome without smoothing the likelihood: the relevant local curvature is given not by the observed Hessian, but by the Fisher information when the model is correctly specified and by the population curvature of the expected loss under misspecification. On this basis, we develop a Laplace approximation framework for quantile regression with mixed-effects and Gaussian process models. We propose practical curvature estimators, including the triangular kernel curvature (TKC) estimator, that yield approximations for posterior distributions and marginal likelihoods, and we establish their asymptotic validity. Empirically, the proposed methods are scalable and numerically stable, and for latent Gaussian models, they achieve accuracy comparable to or better than MCMC and variational competitors at substantially lower computational costs. More broadly, the framework clarifies how Laplace approximations can be justified for non-smooth generalized posteriors through local quadratic behavior of the expected loss.
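The abstract does not define the TKC estimator, so the following is only one plausible form of a triangular-kernel curvature estimate, written for the scalar location case. The function names, the scaling, and the bandwidth rule `n ** (-1/5)` are all hypothetical; the paper's estimator may differ. The idea sketched is to estimate the density of the residuals at zero, which is the per-observation curvature of the expected check loss.

```python
import math
import random


def triangular_kernel(u):
    """Triangular kernel: 1 - |u| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(u))


def kernel_curvature(residuals, bandwidth):
    """Kernel estimate of the residual density at zero (hypothetical form)."""
    n = len(residuals)
    return sum(triangular_kernel(r / bandwidth) for r in residuals) / (n * bandwidth)


random.seed(2)
tau, n = 0.5, 50000
ys = sorted(random.gauss(0.0, 1.0) for _ in range(n))

q_hat = ys[int(tau * n)]            # sample tau-quantile
residuals = [y - q_hat for y in ys]
bandwidth = n ** (-1.0 / 5.0)       # hypothetical bandwidth rule, not from the paper

curvature_hat = kernel_curvature(residuals, bandwidth)
target = 1.0 / math.sqrt(2.0 * math.pi)  # N(0, 1) density at its median

print(curvature_hat, target)
```

With covariates, the analogous construction would weight each outer product of the design row by the kernel value of its residual, but that extension is likewise an assumption here.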
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops Laplace approximations for Bayesian inference in mixed-effects and Gaussian process quantile regression models that use the asymmetric Laplace likelihood. It addresses the vanishing observed Hessian by replacing it with the Fisher information matrix under correct specification or the Hessian of the expected loss under misspecification, proposes practical curvature estimators including the triangular kernel curvature (TKC) estimator, derives approximations to the posterior and marginal likelihood, and claims asymptotic validity together with empirical performance comparable to MCMC at lower cost.
Significance. If the central claims on asymptotic validity hold, the work provides a computationally attractive route to posterior approximation and model comparison for quantile regression in latent Gaussian settings where standard Laplace methods break down. The framework's emphasis on population curvature rather than observed Hessian offers a principled way to handle non-smooth generalized posteriors and could improve scalability for hierarchical and spatial quantile models.
major comments (2)
- §3 (framework derivation): the substitution of Fisher information or expected-loss curvature for the observed Hessian is load-bearing for the entire approximation; the manuscript must supply a rigorous bound showing that the Laplace remainder term still vanishes at the usual rate when the mode lies near a kink of the asymmetric Laplace loss and when latent variables are integrated.
- Theorem on asymptotic validity (likely §3.3 or §5): the proof sketch invokes local quadratic behavior of the expected loss, yet it is unclear whether the argument controls the additional error introduced by the TKC estimator's tuning parameters or by the non-differentiability when the posterior mass straddles a kink; an explicit rate for the total approximation error is required.
minor comments (2)
- Abstract and §2: the description of the TKC estimator would benefit from a one-sentence definition of its kernel and bandwidth choice before the empirical comparisons.
- Simulation section: state explicitly the rule for excluding or handling data sets in which the mode coincides with a kink point of the quantile loss.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. The comments highlight important points regarding the rigor of our asymptotic analysis, and we address each one below. We commit to revisions that strengthen the theoretical justification without altering the core contributions of the framework.
Point-by-point responses
- Referee: §3 (framework derivation): the substitution of Fisher information or expected-loss curvature for the observed Hessian is load-bearing for the entire approximation; the manuscript must supply a rigorous bound showing that the Laplace remainder term still vanishes at the usual rate when the mode lies near a kink of the asymmetric Laplace loss and when latent variables are integrated.
Authors: We agree that an explicit rigorous bound on the remainder is necessary to confirm the approximation remains valid near kinks and after marginalization. In the revised manuscript we will add a supporting lemma and proof (placed in an appendix to §3) that establishes that the Laplace remainder is o_p(1) at the standard rate. The argument proceeds by showing that the population curvature (Fisher information or expected-loss Hessian) governs the local quadratic behavior almost surely, that the set of parameter values exactly at a kink has posterior measure zero, and that the integral over latent variables in the mixed-effects and GP cases preserves the rate via a dominated-convergence argument under the Gaussian prior. We will also verify the conditions under which the mode lies sufficiently far from kinks with high probability. Revision: yes.
- Referee: Theorem on asymptotic validity (likely §3.3 or §5): the proof sketch invokes local quadratic behavior of the expected loss, yet it is unclear whether the argument controls the additional error introduced by the TKC estimator's tuning parameters or by the non-differentiability when the posterior mass straddles a kink; an explicit rate for the total approximation error is required.
Authors: We accept that the current sketch leaves the total error rate and the influence of the TKC tuning parameter implicit. In revision we will strengthen the theorem (likely in §3.3) to state an explicit total approximation error of order O_p(n^{-1}) for the log-posterior and O_p(n^{-1/2}) for the marginal likelihood. The proof will include (i) a separate bound showing that the TKC estimation error is o_p(n^{-1/2}) under the bandwidth condition h_n = o(1) with n h_n^2 → ∞, and (ii) a probabilistic control demonstrating that the probability of the posterior mass straddling a kink decays exponentially fast by posterior concentration, rendering its contribution negligible. These additions will be accompanied by a short simulation study confirming the rates in finite samples. Revision: yes.
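As a quick check on what the quoted bandwidth condition admits, consider polynomial bandwidths. The family $h_n = n^{-a}$ is chosen here purely for illustration and is not stated in the rebuttal.

```latex
h_n = n^{-a},\quad 0 < a < \tfrac12
\;\Longrightarrow\;
h_n \to 0 \quad\text{and}\quad n h_n^2 = n^{1-2a} \to \infty;
\qquad
a = \tfrac12 \;\Longrightarrow\; n h_n^2 = 1 \not\to \infty .
```

So a rate such as $h_n = n^{-1/5}$ or $h_n = n^{-1/3}$ satisfies the condition, while $h_n = n^{-1/2}$ shrinks too fast.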
Circularity Check
No significant circularity; derivation relies on standard asymptotic curvature arguments independent of internal fits.
Full rationale
The paper's core claim replaces the vanishing observed Hessian of the asymmetric Laplace likelihood with Fisher information (correct specification) or the Hessian of the expected loss (misspecification) to justify Laplace approximations for quantile regression. This substitution is presented as following from classical results on local quadratic behavior of the log-posterior or expected loss, without defining the curvature in terms of quantities fitted inside the paper itself. The triangular kernel curvature (TKC) estimator and other practical methods are downstream tools for implementing the framework, not load-bearing definitions that reduce predictions to inputs by construction. No self-citation chains, uniqueness theorems from prior author work, or renaming of known results appear as the justification for the central result. The derivation chain remains self-contained against external benchmarks such as standard Laplace theory and asymptotic statistics.
Axiom & Free-Parameter Ledger
free parameters (1)
- triangular kernel curvature (TKC) estimator tuning parameters
axioms (1)
- domain assumption: local quadratic behavior of the expected loss supplies valid curvature for the Laplace approximation when the observed Hessian vanishes
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "the relevant local curvature is given not by the observed Hessian, but by the Fisher information when the model is correctly specified and by the population curvature of the expected loss under misspecification"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.