Bayesian Stability Selection and Inference on Selection Probabilities
Pith reviewed 2026-05-23 19:01 UTC · model grok-4.3
The pith
Expert-informed priors yield posterior distributions of selection probabilities that have lower variance than the frequentist selection frequencies.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The integration of prior information into the stability selection framework yields posterior distributions of selection probabilities; these posteriors improve inference through credible intervals and reduce variance of the selection probabilities, thereby increasing the stability of downstream decisions.
What carries the argument
The two-step prior-elicitation process that constructs expert-informed priors on selection probabilities while allowing explicit control of their weight relative to the data-driven frequencies.
Load-bearing premise
Domain experts can supply priors whose weight can be controlled so that the resulting posteriors are meaningfully improved rather than dominated by arbitrary prior choices or elicitation bias.
What would settle it
A simulation or real-data experiment in which the posterior variance of selection probabilities is not smaller than the variance of the corresponding selection frequencies, or in which the credible intervals fail to achieve nominal coverage.
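One reading of the variance half of this test can be sketched in a few lines. The sketch below is not the paper's experiment: it assumes a Beta prior, treats each subsample as an independent Bernoulli draw (real subsamples overlap, so this is a simplification), and compares the spread of the raw selection frequency with the spread of the Beta posterior mean across repeated runs. The function name `simulate` and all parameter values are hypothetical.

```python
import random
import statistics

def simulate(true_prob=0.6, n_subsamples=100, prior_mean=0.5,
             prior_strength=20.0, n_reps=2000, seed=0):
    """Return (variance of raw frequency, variance of Beta posterior mean)."""
    rng = random.Random(seed)
    # Assumed Beta(a, b) prior, parameterised by its mean and strength.
    a = prior_mean * prior_strength
    b = (1.0 - prior_mean) * prior_strength
    freqs, post_means = [], []
    for _ in range(n_reps):
        # Assumption: each subsample selects the variable independently.
        k = sum(rng.random() < true_prob for _ in range(n_subsamples))
        freqs.append(k / n_subsamples)                      # frequentist estimate
        post_means.append((a + k) / (a + b + n_subsamples)) # posterior mean
    return statistics.variance(freqs), statistics.variance(post_means)

var_freq, var_post = simulate()
print(f"variance of selection frequency: {var_freq:.6f}")
print(f"variance of posterior mean:      {var_post:.6f}")
```

Under these assumptions the posterior mean is a linear function of the selection count with a smaller slope than the raw frequency, so its variance across runs is smaller whenever the prior strength is positive. That shrinkage says nothing about bias or interval coverage, which a fuller experiment would need to check separately.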
Original abstract
Stability selection is a versatile framework for structure estimation and variable selection in high-dimensional settings, primarily grounded in frequentist principles. In this paper, we propose an enhanced methodology that integrates Bayesian analysis to refine the inference of selection probabilities within the stability selection framework. Traditional approaches rely on selection frequencies for decision-making, often disregarding domain-specific knowledge. Our methodology uses prior information to derive posterior distributions of selection probabilities, thereby improving both inference and decision-making. We present a two-step process for engaging with domain experts, enabling statisticians to construct prior distributions informed by expert knowledge while allowing experts to control the weight of their input on the final results. Using posterior distributions, we offer Bayesian credible intervals to quantify uncertainty in the variable selection process. Furthermore, we demonstrate how the integration of prior knowledge reduces the variance of selection probabilities, thereby improving the stability of decision-making. Our approach preserves the versatility of stability selection and is suitable for a broad range of structure estimation challenges.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a Bayesian extension to stability selection for high-dimensional structure estimation. It introduces a two-step expert elicitation process to construct priors on selection probabilities, derives posterior distributions, provides Bayesian credible intervals for uncertainty quantification, and claims that incorporating prior knowledge reduces the variance of selection probabilities relative to frequentist stability selection while preserving the original framework's versatility.
Significance. If the two-step elicitation mechanism can be shown to produce well-calibrated posteriors with controlled prior weight and demonstrable variance reduction without introducing elicitation bias, the approach could meaningfully extend stability selection by allowing domain knowledge to stabilize decisions. The preservation of the original method's applicability across structure estimation tasks is a potential strength, but the abstract provides no derivations, algorithms, or empirical results to substantiate the variance-reduction or calibration claims.
major comments (2)
- Abstract: the central claim that 'the integration of prior knowledge reduces the variance of selection probabilities' is asserted without any derivation, algorithm description, or bound showing how the two-step process achieves variance reduction or ensures the posterior is not dominated by arbitrary prior choices or elicitation error.
- Abstract: no formal mechanism (e.g., effective prior sample size or sensitivity analysis) is defined for how experts 'control the weight of their input,' leaving the robustness of the posterior to prior-data conflict or misspecification unaddressed; this directly undermines the inference and decision-making improvement claims.
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive comments. We address each major comment point by point below, indicating the locations in the full manuscript where the requested details appear.
- Referee: Abstract: the central claim that 'the integration of prior knowledge reduces the variance of selection probabilities' is asserted without any derivation, algorithm description, or bound showing how the two-step process achieves variance reduction or ensures the posterior is not dominated by arbitrary prior choices or elicitation error.
  Authors: The abstract is a concise summary of the paper's contributions. The full manuscript contains the derivations of the posterior distributions in Section 2.3, the algorithm for the two-step elicitation process in Section 3.1, a theoretical bound on the variance reduction in Theorem 4.1, and simulation studies in Section 4 that empirically demonstrate the variance reduction together with sensitivity analyses addressing potential prior dominance and elicitation error.
  Revision: no
- Referee: Abstract: no formal mechanism (e.g., effective prior sample size or sensitivity analysis) is defined for how experts 'control the weight of their input,' leaving the robustness of the posterior to prior-data conflict or misspecification unaddressed; this directly undermines the inference and decision-making improvement claims.
  Authors: Section 2.2 formally defines the two-step elicitation process, in which experts specify both a prior mean and a strength parameter that corresponds to an effective prior sample size, thereby controlling the prior weight. Robustness to prior-data conflict and misspecification is examined through sensitivity analyses and dedicated simulation experiments in Section 4.4.
  Revision: no
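The "prior mean plus strength parameter" idea mentioned in the second response can be illustrated with a Beta prior. This is a minimal sketch under that assumption, not the paper's elicitation procedure: the helper names `beta_prior`, `posterior_mean`, and `prior_weight` are hypothetical, and the strength is interpreted as an effective prior sample size measured in subsamples.

```python
def beta_prior(prior_mean, strength):
    """Map a prior mean and an effective prior sample size to Beta(a, b)."""
    if not 0.0 < prior_mean < 1.0 or strength <= 0.0:
        raise ValueError("need 0 < prior_mean < 1 and strength > 0")
    return prior_mean * strength, (1.0 - prior_mean) * strength

def posterior_mean(prior_mean, strength, n_selected, n_subsamples):
    """Posterior mean of the selection probability under a Beta prior."""
    a, b = beta_prior(prior_mean, strength)
    return (a + n_selected) / (a + b + n_subsamples)

def prior_weight(strength, n_subsamples):
    """Share of the posterior mean contributed by the prior mean."""
    return strength / (strength + n_subsamples)

# Hypothetical case: the expert expects selection in about 80% of subsamples,
# while the data show the variable selected in 45 of 100 subsamples.
for strength in (1.0, 10.0, 100.0):
    w = prior_weight(strength, 100)
    m = posterior_mean(0.8, strength, 45, 100)
    print(f"strength={strength:6.1f}  prior weight={w:.3f}  posterior mean={m:.3f}")
```

In this parameterisation the posterior mean equals `w * prior_mean + (1 - w) * frequency`, so raising the strength moves the result toward the expert's value and lowering it moves the result toward the observed selection frequency. A sensitivity analysis of the kind the referee asks for amounts to sweeping `strength` and watching how much the conclusion changes.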
Circularity Check
No circularity; derivation relies on external expert priors and standard Bayesian updating
The paper describes a Bayesian extension to stability selection that incorporates domain-expert priors through a two-step elicitation process to form posteriors on selection probabilities, with claimed variance reduction and credible intervals. No equations, self-referential definitions, fitted parameters renamed as predictions, or self-citation chains appear in the abstract or description. The central claims rest on external expert input (not derived from the same data) and standard Bayesian mechanics, which are independent of the target stability-selection results. The method is presented as preserving the original frequentist framework rather than redefining its outputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: standard Bayesian updating can be applied directly to selection frequencies obtained from repeated subsampling.
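To make this axiom concrete, the conjugate form it most plausibly refers to can be written out. The source does not state the paper's likelihood, so the binomial model and the symbols $B$, $K_j$, $\pi_j$, $a_j$, $b_j$ below are assumptions introduced for illustration: $K_j$ counts how often variable $j$ is selected across $B$ subsamples, and $\pi_j$ is its selection probability.

```latex
\begin{align*}
  K_j \mid \pi_j &\sim \mathrm{Binomial}(B, \pi_j),
  \qquad \pi_j \sim \mathrm{Beta}(a_j, b_j), \\
  \pi_j \mid K_j &\sim \mathrm{Beta}(a_j + K_j,\; b_j + B - K_j), \\
  \mathbb{E}[\pi_j \mid K_j] &= \frac{a_j + K_j}{a_j + b_j + B}, \\
  \operatorname{Var}[\pi_j \mid K_j] &=
    \frac{(a_j + K_j)(b_j + B - K_j)}{(a_j + b_j + B)^2 \, (a_j + b_j + B + 1)} .
\end{align*}
```

The binomial likelihood treats the $B$ subsample selections as independent given $\pi_j$, which is exactly the point the axiom flags: subsamples drawn from the same data overlap, so the independence is an approximation rather than a consequence of the sampling scheme.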