Multiple testing with the horseshoe
Pith reviewed 2026-07-01 06:37 UTC · model grok-4.3
The pith
Posterior decision rules from the horseshoe prior attain the optimal detection boundary while controlling both FDR and FNR in the sparse normal means model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the sparse normal means model, the proposed posterior-based decision rules attain the optimal detection boundary and achieve frequentist asymptotic control of both the false discovery rate and the false negative rate.
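For orientation, the objects named in this claim can be written out as follows. This is a sketch in standard notation rather than the paper's own: the symbols n (number of coordinates), \varphi (vector of 0/1 decisions) and the FNR normalisation are assumptions, since the abstract does not fix them. The horseshoe hierarchy shown is the usual half-Cauchy scale mixture of Carvalho, Polson and Scott (2010).

```latex
\begin{align*}
X_i &= \theta_i + \varepsilon_i, \qquad \varepsilon_i \overset{\text{iid}}{\sim} N(0,1), \quad i = 1,\dots,n, \\
\theta_i \mid \lambda_i, \tau &\sim N(0, \lambda_i^2 \tau^2), \qquad \lambda_i \sim C^{+}(0,1), \\
\mathrm{FDR}(\theta,\varphi) &= E_\theta\!\left[ \frac{\sum_{i=1}^n \mathbf{1}\{\theta_i = 0\}\,\varphi_i}{\max\!\left(1, \sum_{i=1}^n \varphi_i\right)} \right], \\
\mathrm{FNR}(\theta,\varphi) &= E_\theta\!\left[ \frac{\sum_{i=1}^n \mathbf{1}\{\theta_i \neq 0\}\,(1-\varphi_i)}{\max\!\left(1, \#\{i : \theta_i \neq 0\}\right)} \right].
\end{align*}
```

Sparsity means that the number of nonzero \theta_i is small relative to n. Some authors normalise the FNR by the number of non-rejections instead, so the definition should be checked against the paper.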
What carries the argument
Posterior-based decision rules calibrated for FDR control, derived from continuous global-local shrinkage priors such as the horseshoe; the rules convert posterior output into signal decisions that achieve the desired error-rate bounds.
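One common way to turn per-coordinate posterior scores into FDR-calibrated decisions is a cumulative-average rule in the spirit of Müller et al. (2004) and Sun and Cai (2007). The sketch below illustrates that generic rule and is not claimed to be the paper's procedure. The function name `select_signals` and the input `null_scores` (a posterior surrogate for "coordinate i is noise", between 0 and 1) are hypothetical.

```python
from typing import List, Sequence


def select_signals(null_scores: Sequence[float], alpha: float) -> List[int]:
    """Flag the largest set of coordinates whose average null score is <= alpha.

    null_scores[i] is an assumed posterior surrogate for "coordinate i is noise";
    how such a score is built from a continuous shrinkage posterior is exactly
    what the paper addresses and is NOT reproduced here.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")

    # Visit coordinates from most to least signal-like.
    order = sorted(range(len(null_scores)), key=lambda i: null_scores[i])

    selected: List[int] = []
    running_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        running_sum += null_scores[idx]
        # The running mean of ascending scores is nondecreasing,
        # so the first violation marks the end of the rejection set.
        if running_sum / rank > alpha:
            break
        selected.append(idx)
    return sorted(selected)


if __name__ == "__main__":
    scores = [0.01, 0.9, 0.05, 0.4, 0.02]
    print(select_signals(scores, alpha=0.1))
```

The average null score over the selected set estimates the posterior expected false discovery proportion, which is why capping it at alpha targets FDR control under the assumed scores.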
If this is right
- The rules apply across a broad class of continuous shrinkage priors.
- They are implementable using standard posterior sampling algorithms.
- Realised FDR and FNR in simulations track the theoretical targets closely.
- The approach extends directly to high-dimensional regression and Gaussian graphical models.
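To make the "standard posterior sampling" point concrete, here is a minimal sketch of how sampler output might be reduced to one score per coordinate. It assumes MCMC draws of the mean vector are already available and uses a hypothetical tolerance `delta` to call a coordinate "practically zero". Both the score and `delta` are illustrative assumptions, not the statistic used in the paper.

```python
from typing import List, Sequence


def null_scores_from_draws(
    draws: Sequence[Sequence[float]], delta: float
) -> List[float]:
    """Estimate P(|theta_i| <= delta | data) by the fraction of posterior draws.

    draws[t][i] is the t-th posterior draw of coordinate i (assumed given,
    e.g. from any horseshoe sampler); delta is a hypothetical tolerance.
    """
    if not draws:
        raise ValueError("need at least one posterior draw")
    if delta < 0:
        raise ValueError("delta must be non-negative")

    n_coords = len(draws[0])
    counts = [0] * n_coords
    for draw in draws:
        if len(draw) != n_coords:
            raise ValueError("all draws must have the same length")
        for i, value in enumerate(draw):
            if abs(value) <= delta:
                counts[i] += 1
    return [c / len(draws) for c in counts]


if __name__ == "__main__":
    toy_draws = [[0.0, 3.0], [0.1, 2.5], [-0.05, 0.2], [0.3, 2.8]]
    print(null_scores_from_draws(toy_draws, delta=0.25))
```

Because the horseshoe posterior puts no mass at exactly zero, any such score needs a tolerance or another surrogate. This is the difficulty the abstract describes when it notes that continuous priors do not directly yield posterior inclusion probabilities.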
Where Pith is reading between the lines
- The same calibration strategy could be tested in other high-dimensional models already using horseshoe-type priors for estimation, to check whether FDR and FNR control carry over.
- Simultaneous control of FDR and FNR opens the possibility of using these rules as a preprocessing step before downstream tasks that penalise both false positives and missed signals.
Load-bearing premise
The data are generated exactly from the sparse normal means model under the horseshoe prior, with the asymptotic regime of growing dimension and controlled sparsity.
What would settle it
In large-dimensional simulations drawn from the sparse normal means model, if the proportion of false discoveries among the selected signals exceeds the nominal FDR level by a fixed margin as dimension increases, the asymptotic control claim would be contradicted.
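A check of this kind could be set up along the following lines. The sketch assumes unit-variance noise, signals of a common size, and a plain threshold rule as a stand-in for the decision procedure. The stand-in rule and all function names are hypothetical, and the paper's own rule would need to be substituted to test its claim.

```python
import math
import random
from typing import List, Sequence, Tuple


def false_discovery_proportion(
    is_signal: Sequence[bool], rejected: Sequence[bool]
) -> float:
    """Share of rejections that are true nulls (0 when nothing is rejected)."""
    n_rejected = sum(rejected)
    false_rejections = sum(1 for s, r in zip(is_signal, rejected) if r and not s)
    return false_rejections / max(1, n_rejected)


def false_negative_proportion(
    is_signal: Sequence[bool], rejected: Sequence[bool]
) -> float:
    """Share of true signals that were missed (normalisation is an assumption)."""
    n_signals = sum(is_signal)
    missed = sum(1 for s, r in zip(is_signal, rejected) if s and not r)
    return missed / max(1, n_signals)


def simulate_once(
    n: int, n_signals: int, signal_size: float, threshold: float,
    rng: random.Random,
) -> Tuple[float, float]:
    """Draw one sparse normal means data set and score a stand-in threshold rule."""
    is_signal: List[bool] = [i < n_signals for i in range(n)]
    x = [rng.gauss(signal_size if s else 0.0, 1.0) for s in is_signal]
    rejected = [abs(v) > threshold for v in x]  # stand-in, NOT the paper's rule
    return (
        false_discovery_proportion(is_signal, rejected),
        false_negative_proportion(is_signal, rejected),
    )


if __name__ == "__main__":
    rng = random.Random(0)
    n = 2000
    fdp, fnp = simulate_once(n, 20, 5.0, math.sqrt(2 * math.log(n)), rng)
    print(f"FDP={fdp:.3f}  FNP={fnp:.3f}")
```

Averaging the first returned value over many replications and repeating for growing n gives the realised FDR curve that would be compared with the nominal level.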
Original abstract
We study multiple testing under continuous global-local shrinkage priors, with a focus on the horseshoe prior in high-dimensional sparse settings. While such priors provide adaptive shrinkage and computational scalability, they do not induce exact zeros and hence do not directly yield posterior inclusion probabilities, making principled false discovery control nontrivial. We propose posterior-based decision rules for signal detection that are applicable across a broad class of continuous shrinkage priors and are calibrated to control the false discovery rate (FDR) while retaining high power. In the sparse normal means model, we show that the proposed procedures attain the optimal detection boundary and achieve frequentist asymptotic control of both FDR and false negative rate (FNR). The method is readily implementable via standard posterior sampling, and empirical studies indicate that the realised FDR and FNR closely track their theoretical targets. Applications to high-dimensional regression and Gaussian graphical models further illustrate the scope and practical effectiveness of the approach.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes posterior-based decision rules for multiple testing under continuous global-local shrinkage priors (focus on horseshoe) in high-dimensional sparse settings. These rules are calibrated to control FDR while retaining power and, in the sparse normal means model, are shown to attain the optimal detection boundary with asymptotic frequentist control of both FDR and FNR. The approach is implemented via standard posterior sampling and illustrated on regression and graphical models.
Significance. If the asymptotic results hold, the work supplies a principled route to FDR control for shrinkage priors that do not induce exact zeros, extending frequentist multiple-testing theory to a broad class of continuous global-local priors with computational scalability. The combination of optimal detection boundary attainment and explicit FDR/FNR control under the sparse normal means model would be a substantive contribution to high-dimensional inference.
major comments (2)
- [§3.2, Theorem 3.1] The proof that the proposed threshold attains the exact detection boundary appears to rely on a specific tail behavior of the horseshoe posterior; the argument should be checked against the general class of priors stated in the introduction, as the constant in the boundary may depend on the prior parameters.
- [§4.1, Eq. (4.3)] The FDR control statement is asymptotic as n, p → ∞ with p0/p → 0; it is unclear whether the o(1) term is uniform over the sparsity level or requires additional conditions on the signal strength that are not stated in the theorem.
minor comments (3)
- [§2 and §4] Notation for the decision threshold τ_n is introduced in §2 but reused with different subscripts in §4 without explicit redefinition; a single consistent definition would improve readability.
- [Figure 2] Figure 2 caption states 'realised FDR tracks theoretical target' but the plotted curves lack pointwise error bars or replication count; adding this information would strengthen the empirical claim.
- [Abstract and §3] The statement in the abstract that the rules are 'applicable across a broad class' is not accompanied by an explicit list of the required prior conditions until §3; moving a concise list to the introduction would help readers.
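The error bars requested for Figure 2 are typically Monte Carlo standard errors across replications. A minimal sketch of that summary follows, assuming a list of per-replication false discovery proportions is available. The helper name `summarise_replications` and the toy numbers are hypothetical and are not results from the paper.

```python
import math
import statistics
from typing import Sequence, Tuple


def summarise_replications(values: Sequence[float]) -> Tuple[float, float, int]:
    """Return (mean, Monte Carlo standard error, replication count)."""
    n_rep = len(values)
    if n_rep < 2:
        raise ValueError("need at least two replications for a standard error")
    mean = statistics.mean(values)
    # Sample standard deviation divided by sqrt(number of replications).
    std_err = statistics.stdev(values) / math.sqrt(n_rep)
    return mean, std_err, n_rep


if __name__ == "__main__":
    toy_fdp = [0.1, 0.2, 0.3]  # illustrative values only
    mean, std_err, n_rep = summarise_replications(toy_fdp)
    print(f"realised FDR ~ {mean:.3f} +/- {2 * std_err:.3f} ({n_rep} replications)")
```

Reporting the replication count next to the interval lets a reader judge whether a gap between the realised and nominal FDR is larger than simulation noise.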
Simulated Author's Rebuttal
We thank the referee for the careful reading, positive assessment, and recommendation of minor revision. We address each major comment below.
Point-by-point responses
- Referee: [§3.2, Theorem 3.1] The proof that the proposed threshold attains the exact detection boundary appears to rely on a specific tail behavior of the horseshoe posterior; the argument should be checked against the general class of priors stated in the introduction, as the constant in the boundary may depend on the prior parameters.
  Authors: The proof of Theorem 3.1 is developed for the horseshoe prior and uses its specific posterior tail decay. The decision rule construction applies to the broader class of continuous global-local shrinkage priors, but the precise detection boundary constant depends on the prior's tail parameters. We will revise the theorem statement and add a remark clarifying the scope and the dependence of the constant on the prior.
  Revision: yes
- Referee: [§4.1, Eq. (4.3)] The FDR control statement is asymptotic as n, p → ∞ with p0/p → 0; it is unclear whether the o(1) term is uniform over the sparsity level or requires additional conditions on the signal strength that are not stated in the theorem.
  Authors: The o(1) term in the FDR control of Eq. (4.3) is uniform over sparsity levels satisfying p0/p → 0 under the conditions already stated in the theorem; no additional restrictions on signal strength are imposed beyond those needed to attain the detection boundary. We will insert a sentence in the theorem to make the uniformity explicit.
  Revision: yes
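The distinction at issue in this second exchange can be stated compactly. The following is a sketch, assuming a hypothetical class \Theta_n of sparse mean vectors, a decision vector \varphi and a nominal level \alpha; none of these symbols is taken from the paper.

```latex
\begin{align*}
\text{pointwise:} \quad & \limsup_{n\to\infty} \mathrm{FDR}\bigl(\theta^{(n)}, \varphi\bigr) \le \alpha
  \quad \text{for each fixed sequence } \theta^{(n)} \in \Theta_n, \\
\text{uniform:} \quad & \limsup_{n\to\infty} \, \sup_{\theta \in \Theta_n} \mathrm{FDR}(\theta, \varphi) \le \alpha .
\end{align*}
```

The uniform statement implies the pointwise one but not conversely, which is why the referee asks for the mode of convergence to be stated in the theorem.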
Circularity Check
No significant circularity; derivation self-contained
full rationale
The paper establishes frequentist asymptotic results (optimal detection boundary, FDR and FNR control) for posterior decision rules derived from continuous global-local priors in the sparse normal means model. These guarantees are obtained via direct analysis of the model and asymptotic regime, without reducing to fitted quantities renamed as predictions, self-definitional loops, or load-bearing self-citations whose content is unverified. The central claims remain independent of the prior calibration step and are externally falsifiable through the stated modeling assumptions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Observations follow the sparse normal means model with a continuous global-local shrinkage prior