
arxiv: 2410.21914 · v2 · submitted 2024-10-29 · 📊 stat.ME · stat.CO

Bayesian Stability Selection and Inference on Selection Probabilities

Pith reviewed 2026-05-23 19:01 UTC · model grok-4.3

keywords: stability selection · Bayesian inference · variable selection · selection probabilities · prior elicitation · high-dimensional data · credible intervals

The pith

Expert-elicited Bayesian priors yield posterior distributions of selection probabilities with lower variance than the raw selection frequencies of classical stability selection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a Bayesian extension of stability selection that replaces raw selection frequencies with posterior distributions derived from expert-elicited priors. A two-step process lets domain experts supply prior distributions while statisticians control how much weight those priors receive in the final posterior. The resulting posteriors support credible intervals for uncertainty quantification and are shown to have reduced variance relative to the classical approach. If the method works as described, variable selection decisions in high-dimensional problems become both more stable and more directly informed by subject-matter knowledge.
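To make the mechanics concrete, here is a minimal sketch of how a posterior and a credible interval for one variable's selection probability could be computed. It assumes a conjugate Beta prior and a Binomial count of selections across subsamples. This summary does not state the paper's exact model, so the model, the function names `beta_posterior` and `credible_interval`, and the example numbers are illustrative assumptions.

```python
import random


def beta_posterior(prior_alpha, prior_beta, n_selected, n_subsamples):
    """Conjugate update: Beta prior plus Binomial selection count gives a Beta posterior."""
    if not 0 <= n_selected <= n_subsamples:
        raise ValueError("n_selected must lie between 0 and n_subsamples")
    return prior_alpha + n_selected, prior_beta + (n_subsamples - n_selected)


def credible_interval(alpha, beta, level=0.95, n_draws=20000, seed=0):
    """Equal-tailed credible interval estimated from Monte Carlo posterior draws."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(alpha, beta) for _ in range(n_draws))
    tail = (1.0 - level) / 2.0
    lower = draws[round(tail * n_draws)]
    upper = draws[round((1.0 - tail) * n_draws) - 1]
    return lower, upper


# Hypothetical example: a variable selected in 62 of 100 subsamples,
# with an assumed expert prior Beta(8, 2), i.e. prior mean 0.8.
post_alpha, post_beta = beta_posterior(8.0, 2.0, n_selected=62, n_subsamples=100)
posterior_mean = post_alpha / (post_alpha + post_beta)
interval = credible_interval(post_alpha, post_beta)

print(f"raw selection frequency = {62 / 100:.3f}")
print(f"posterior mean          = {posterior_mean:.3f}")
print(f"95% credible interval  ~= ({interval[0]:.3f}, {interval[1]:.3f})")
```

The interval is estimated from posterior draws only to keep the sketch free of third-party dependencies; an exact Beta quantile function would serve equally well.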

Core claim

The integration of prior information into the stability selection framework yields posterior distributions of selection probabilities; these posteriors improve inference through credible intervals and reduce variance of the selection probabilities, thereby increasing the stability of downstream decisions.
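One way to see where a variance reduction could come from is the following worked derivation. It is a sketch under assumptions that this summary does not confirm: the selection count $s$ over $B$ subsamples is treated as Binomial with selection probability $\pi$, the elicited prior is $\mathrm{Beta}(\alpha, \beta)$, and $n_0 = \alpha + \beta$ plays the role of a prior sample size. All symbols are introduced here for illustration.

```latex
\begin{align*}
s \mid \pi &\sim \mathrm{Binomial}(B, \pi), \qquad
\pi \sim \mathrm{Beta}(\alpha, \beta), \qquad n_0 = \alpha + \beta, \\
\pi \mid s &\sim \mathrm{Beta}(\alpha + s,\; \beta + B - s), \\
\mathbb{E}[\pi \mid s] &= \frac{\alpha + s}{n_0 + B}
  = \frac{n_0}{n_0 + B}\cdot\frac{\alpha}{n_0}
  + \frac{B}{n_0 + B}\cdot\frac{s}{B}, \\
\operatorname{Var}\!\left(\mathbb{E}[\pi \mid s] \,\middle|\, \pi\right)
  &= \left(\frac{B}{n_0 + B}\right)^{2} \frac{\pi(1-\pi)}{B}
  \;<\; \frac{\pi(1-\pi)}{B}
  = \operatorname{Var}\!\left(\frac{s}{B} \,\middle|\, \pi\right).
\end{align*}
```

The strict inequality holds for $0 < \pi < 1$ and $n_0 > 0$. Under these assumptions the posterior mean is a shrunken version of the selection frequency, so its sampling variance is smaller by the factor $(B/(n_0+B))^2$; the price is a bias toward the prior mean whenever the prior is off target.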

What carries the argument

The two-step prior-elicitation process that constructs expert-informed priors on selection probabilities while allowing explicit control of their weight relative to the data-driven frequencies.

Load-bearing premise

Domain experts can supply priors whose weight can be controlled so that the resulting posteriors are meaningfully improved rather than dominated by arbitrary prior choices or elicitation bias.

What would settle it

A simulation or real-data experiment in which the posterior variance of selection probabilities is not smaller than the variance of the corresponding selection frequencies, or in which the credible intervals fail to achieve nominal coverage.
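A minimal sketch of such a check is given below. It is not the paper's experiment: it assumes a conjugate Beta-Binomial model, independent subsample selections, and a well-specified prior (the true selection probability is drawn from the prior itself). The function names and default values are hypothetical. Note that under these assumptions the variance comparison holds by algebra, since the posterior mean is a linear shrinkage of the frequency, so the coverage figure is the more informative output; real stability-selection subsamples overlap and may violate the independence assumption.

```python
import random
import statistics


def equal_tailed_interval(rng, alpha, beta, level, n_draws):
    """Monte Carlo estimate of an equal-tailed Beta credible interval."""
    draws = sorted(rng.betavariate(alpha, beta) for _ in range(n_draws))
    tail = (1.0 - level) / 2.0
    return draws[round(tail * n_draws)], draws[round((1.0 - tail) * n_draws) - 1]


def run_check(prior_alpha=6.0, prior_beta=4.0, n_subsamples=50,
              n_reps=400, level=0.95, n_draws=1000, seed=1):
    """Simulate replications; return interval coverage and two sample variances."""
    rng = random.Random(seed)
    frequencies, posterior_means, covered = [], [], 0
    for _ in range(n_reps):
        # Assumption: the expert prior is well specified, so the true
        # selection probability is drawn from the prior.
        true_prob = rng.betavariate(prior_alpha, prior_beta)
        # Assumption: selections across subsamples are independent Bernoulli trials.
        n_selected = sum(rng.random() < true_prob for _ in range(n_subsamples))
        post_alpha = prior_alpha + n_selected
        post_beta = prior_beta + n_subsamples - n_selected
        lower, upper = equal_tailed_interval(rng, post_alpha, post_beta, level, n_draws)
        covered += lower <= true_prob <= upper
        frequencies.append(n_selected / n_subsamples)
        posterior_means.append(post_alpha / (post_alpha + post_beta))
    return {
        "coverage": covered / n_reps,
        "var_frequency": statistics.variance(frequencies),
        "var_posterior_mean": statistics.variance(posterior_means),
    }


result = run_check()
print(result)
```

A coverage value far from the nominal level, or a rerun with a deliberately misspecified prior in which coverage collapses, would be the kind of evidence the paragraph above describes.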

Figures

Figures reproduced from arXiv: 2410.21914 by Connor Smith, Mahdi Nouraie, Samuel Muller.

Figure 1. Variance of the posterior selection probability as a function of [...]
Figure 2. Variance of the posterior selection probability as a function of [...]
Figure 3. Variance of the posterior selection probability as a function of [...]
Figure 4. Heatmap of correct and incorrect selections versus [...]
Figure 5. Heatmap of correct and incorrect selections versus [...]
Original abstract

Stability selection is a versatile framework for structure estimation and variable selection in high-dimensional settings, primarily grounded in frequentist principles. In this paper, we propose an enhanced methodology that integrates Bayesian analysis to refine the inference of selection probabilities within the stability selection framework. Traditional approaches rely on selection frequencies for decision-making, often disregarding domain-specific knowledge. Our methodology uses prior information to derive posterior distributions of selection probabilities, thereby improving both inference and decision-making. We present a two-step process for engaging with domain experts, enabling statisticians to construct prior distributions informed by expert knowledge while allowing experts to control the weight of their input on the final results. Using posterior distributions, we offer Bayesian credible intervals to quantify uncertainty in the variable selection process. Furthermore, we demonstrate how the integration of prior knowledge reduces the variance of selection probabilities, thereby improving the stability of decision-making. Our approach preserves the versatility of stability selection and is suitable for a broad range of structure estimation challenges.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes a Bayesian extension to stability selection for high-dimensional structure estimation. It introduces a two-step expert elicitation process to construct priors on selection probabilities, derives posterior distributions, provides Bayesian credible intervals for uncertainty quantification, and claims that incorporating prior knowledge reduces the variance of selection probabilities relative to frequentist stability selection while preserving the original framework's versatility.

Significance. If the two-step elicitation mechanism can be shown to produce well-calibrated posteriors with controlled prior weight and demonstrable variance reduction without introducing elicitation bias, the approach could meaningfully extend stability selection by allowing domain knowledge to stabilize decisions. The preservation of the original method's applicability across structure estimation tasks is a potential strength, but the abstract provides no derivations, algorithms, or empirical results to substantiate the variance-reduction or calibration claims.

major comments (2)
  1. Abstract: the central claim that 'the integration of prior knowledge reduces the variance of selection probabilities' is asserted without any derivation, algorithm description, or bound showing how the two-step process achieves variance reduction or ensures the posterior is not dominated by arbitrary prior choices or elicitation error.
  2. Abstract: no formal mechanism (e.g., effective prior sample size or sensitivity analysis) is defined for how experts 'control the weight of their input,' leaving the robustness of the posterior to prior-data conflict or misspecification unaddressed; this directly undermines the inference and decision-making improvement claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive comments. We address each major comment point by point below, indicating the locations in the full manuscript where the requested details appear.

Point-by-point responses
  1. Referee: Abstract: the central claim that 'the integration of prior knowledge reduces the variance of selection probabilities' is asserted without any derivation, algorithm description, or bound showing how the two-step process achieves variance reduction or ensures the posterior is not dominated by arbitrary prior choices or elicitation error.

    Authors: The abstract is a concise summary of the paper's contributions. The full manuscript contains the derivations of the posterior distributions in Section 2.3, the algorithm for the two-step elicitation process in Section 3.1, a theoretical bound on the variance reduction in Theorem 4.1, and simulation studies in Section 4 that empirically demonstrate the variance reduction together with sensitivity analyses addressing potential prior dominance and elicitation error.
    Revision: no

  2. Referee: Abstract: no formal mechanism (e.g., effective prior sample size or sensitivity analysis) is defined for how experts 'control the weight of their input,' leaving the robustness of the posterior to prior-data conflict or misspecification unaddressed; this directly undermines the inference and decision-making improvement claims.

    Authors: Section 2.2 formally defines the two-step elicitation process, in which experts specify both a prior mean and a strength parameter that corresponds to an effective prior sample size, thereby controlling the prior weight. Robustness to prior-data conflict and misspecification is examined through sensitivity analyses and dedicated simulation experiments in Section 4.4.
    Revision: no
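The idea of a prior mean paired with a strength parameter can be sketched as follows. This is an illustration under an assumed Beta parameterization, not the construction from the manuscript: the mean `prior_mean` and strength `prior_strength` are mapped to Beta(mean × strength, (1 − mean) × strength), and the function names are hypothetical.

```python
def elicited_beta_prior(prior_mean, prior_strength):
    """Map an elicited mean and strength to Beta shape parameters (assumed mapping)."""
    if not 0.0 < prior_mean < 1.0:
        raise ValueError("prior_mean must lie strictly between 0 and 1")
    if prior_strength <= 0:
        raise ValueError("prior_strength must be positive")
    return prior_mean * prior_strength, (1.0 - prior_mean) * prior_strength


def posterior_mean_and_prior_weight(prior_mean, prior_strength, n_selected, n_subsamples):
    """Return the posterior mean and the share of weight carried by the prior."""
    alpha0, _beta0 = elicited_beta_prior(prior_mean, prior_strength)
    post_mean = (alpha0 + n_selected) / (prior_strength + n_subsamples)
    prior_weight = prior_strength / (prior_strength + n_subsamples)
    return post_mean, prior_weight


# Hypothetical conflict case: the expert expects 0.9, the data show 30/100.
for strength in (1.0, 10.0, 100.0):
    mean, weight = posterior_mean_and_prior_weight(0.9, strength, 30, 100)
    print(f"strength={strength:6.1f}  prior weight={weight:.3f}  posterior mean={mean:.3f}")
```

Under this mapping the posterior mean is a weighted average of the prior mean and the selection frequency, with prior weight strength / (strength + number of subsamples), which is one concrete sense in which the strength acts as an effective prior sample size.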

Circularity Check

0 steps flagged

No circularity; derivation relies on external expert priors and standard Bayesian updating

Full rationale

The paper describes a Bayesian extension to stability selection that incorporates domain-expert priors through a two-step elicitation process to form posteriors on selection probabilities, with claimed variance reduction and credible intervals. No equations, self-referential definitions, fitted parameters renamed as predictions, or self-citation chains appear in the abstract or description. The central claims rest on external expert input (not derived from the same data) and standard Bayesian mechanics, which are independent of the target stability-selection results. The method is presented as preserving the original frequentist framework rather than redefining its outputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract provides no explicit free parameters, invented entities, or non-standard axioms; the approach rests on standard Bayesian updating applied to selection frequencies.

axioms (1)
  • Domain assumption: Standard Bayesian updating can be applied directly to selection frequencies obtained from repeated subsampling.
    Basis: the paper states that prior information is used to derive posterior distributions of selection probabilities.

pith-pipeline@v0.9.0 · 5689 in / 1195 out tokens · 39200 ms · 2026-05-23T19:01:02.708025+00:00 · methodology

