Dirichlet process mixtures of block $g$ priors for model selection and prediction in linear models

Abel Rodriguez; Anupreet Porwal

arxiv: 2411.00471 · v3 · submitted 2024-11-01 · 📊 stat.ME · cs.LG

Dirichlet process mixtures of block g priors for model selection and prediction in linear models

Anupreet Porwal , Abel Rodriguez This is my paper

Pith reviewed 2026-05-23 18:19 UTC · model grok-4.3

classification 📊 stat.ME cs.LG

keywords Dirichlet processblock g priorsmodel selectionlinear modelsshrinkage priorsconsistencyLindley paradoxMCMC

0 comments

The pith

Dirichlet process mixtures of block g priors are consistent for model selection and avoid the conditional Lindley paradox in linear models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Dirichlet process mixtures of block g priors to handle model selection and prediction in linear models. These priors extend mixtures of g priors by letting the data select blocks of coefficients that receive different shrinkage while respecting the full correlation structure among predictors. The authors establish that the resulting priors are consistent in several senses. They specifically show avoidance of the conditional Lindley paradox that arises under some other priors. An MCMC algorithm with little tuning is provided, and simulations plus real data illustrate gains in power for smaller effects when a few large signals are present.

Core claim

Dirichlet process mixtures of block g priors are consistent in various senses and, in particular, avoid the conditional Lindley paradox. They permit differential shrinkage across data-selected blocks of regression coefficients while fully accounting for the correlation structure of the predictors, thereby bridging model-selection and continuous-shrinkage approaches.

What carries the argument

Dirichlet process mixture of block g priors, which clusters coefficients into blocks that share a common shrinkage parameter while each block prior incorporates the predictors' full covariance.

If this is right

The priors achieve consistency for model selection and prediction under standard linear-model assumptions.
They avoid the conditional Lindley paradox highlighted for certain other priors.
In datasets containing a small number of very large effects, the priors yield higher power for smaller significant effects with only a minimal rise in false discoveries.
Posterior inference is feasible via an MCMC algorithm that requires only minimal ad-hoc tuning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The data-driven block construction may reveal latent grouping structure among predictors that is useful beyond prediction accuracy.
The same mixture construction could be tested in settings with missing predictors or non-Gaussian errors to check whether the consistency properties persist.
When predictors exhibit strong multicollinearity, the explicit accounting for correlation inside each block g prior may reduce sensitivity to arbitrary variable ordering.

Load-bearing premise

The Dirichlet process mixture can identify data-selected blocks of parameters that permit differential shrinkage while the block g construction fully accounts for the predictors' correlation structure.

What would settle it

A simulation study or real dataset in which the conditional Lindley paradox appears under standard g-prior mixtures but is absent under the Dirichlet process block version, or in which the prior fails to recover blocks and loses power relative to simpler alternatives.

Figures

Figures reproduced from arXiv: 2411.00471 by Abel Rodriguez, Anupreet Porwal.

**Figure 2.** Figure 2: Scatterplots of random samples from the Dirchlet mixture of block [PITH_FULL_IMAGE:figures/full_fig_p016_2.png] view at source ↗

**Figure 3.** Figure 3: Behavior of log (Ba,0(y)) (left column) and Pr(ξ1 ̸= ξ2 | y) (right column) under the DP mixture of block g priors in our first simulation study. Each thin grey line corresponds to one replicate of the simulation, while the thicker blue line corresponds to the mean curve. Figures in the top row correspond to design matrices generated under η = 0, while the bottom row corresponds to η = 0.5 24 [PITH_FULL_I… view at source ↗

**Figure 4.** Figure 4: F1 scores for model selection procedures based on various priors for our second simulation study. 29 [PITH_FULL_IMAGE:figures/full_fig_p029_4.png] view at source ↗

**Figure 5.** Figure 5: Prediction MSE for η = 0 and η = 0.5. the relative MSE with respect to that under the g-prior for each dataset. Hence, values less than 1 correspond to methods with smaller (better) prediction MSE. Note that, with 31 [PITH_FULL_IMAGE:figures/full_fig_p031_5.png] view at source ↗

**Figure 6.** Figure 6: Joint and marginal posterior distributions for [PITH_FULL_IMAGE:figures/full_fig_p032_6.png] view at source ↗

**Figure 7.** Figure 7: Posterior inclusion probabilities for individual variables and model sizes for var [PITH_FULL_IMAGE:figures/full_fig_p034_7.png] view at source ↗

**Figure 8.** Figure 8: Joint and marginal posterior distributions for [PITH_FULL_IMAGE:figures/full_fig_p035_8.png] view at source ↗

**Figure 9.** Figure 9: Predictive mean squared error (MSE) and and median interval scores (MIS) for [PITH_FULL_IMAGE:figures/full_fig_p036_9.png] view at source ↗

**Figure 10.** Figure 10: Relative mean squared error of the coefficients for [PITH_FULL_IMAGE:figures/full_fig_p053_10.png] view at source ↗

**Figure 11.** Figure 11: Relative mean squared error of the coefficients for [PITH_FULL_IMAGE:figures/full_fig_p054_11.png] view at source ↗

read the original abstract

This paper introduces Dirichlet process mixtures of block $g$ priors for model selection and prediction in linear models. These priors are extensions of traditional mixtures of $g$ priors that allow for differential shrinkage for various (data-selected) blocks of parameters while fully accounting for the predictors' correlation structure, providing a bridge between the literatures on model selection and continuous shrinkage priors. We show that Dirichlet process mixtures of block $g$ priors are consistent in various senses and, in particular, that they avoid the conditional Lindley ``paradox'' highlighted by Som et al. (2016). Further, we develop a Markov chain Monte Carlo algorithm for posterior inference that requires only minimal ad-hoc tuning. Finally, we investigate the empirical performance of the prior in various real and simulated datasets. In the presence of a small number of very large effects, Dirichlet process mixtures of block $g$ priors lead to higher power for detecting smaller but significant effects without only a minimal increase in the number of false discoveries.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper defines DP mixtures over block g-priors to let data pick groups for differential shrinkage while respecting predictor correlations inside blocks.

read the letter

The main new piece is the construction of a Dirichlet process mixture where each component is a block g-prior. This lets the prior adaptively form blocks of coefficients that receive different shrinkage levels, and the g-prior form inside each block keeps the correlation structure of the predictors. That is a direct nonparametric extension of existing g-prior mixtures and is not already in the cited literature. The paper also supplies an MCMC sampler that needs little tuning and reports that the prior performs well on both simulated and real data when a few large effects are present, recovering smaller signals with only a modest rise in false positives. Those are the concrete advances. The theoretical claims are the softer part. The abstract states posterior consistency in various senses and avoidance of the conditional Lindley paradox from Som et al. (2016), but the conditions and proof outlines are not visible here, so it is not yet clear how general those results are or whether the block structure introduces extra restrictions. The empirical summary is also high-level without quantitative tables or variability measures in the abstract, which makes the performance claims harder to judge precisely. This paper is aimed at Bayesian statisticians who work on variable selection and shrinkage in linear models. A reader already familiar with g-priors or DP mixtures in regression would see the most value. It deserves peer review because the prior family is original, the sampler is usable, and the empirical behavior is at least directionally interesting; referees can check the derivations and the detailed comparisons.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces Dirichlet process mixtures of block g priors as a nonparametric extension of g-prior mixtures for Bayesian linear model selection and prediction. The construction permits differential shrinkage across data-selected blocks of coefficients while incorporating the full correlation structure among predictors within blocks. The authors establish posterior consistency in multiple senses and show explicit avoidance of the conditional Lindley paradox of Som et al. (2016). They supply an MCMC algorithm requiring only minimal tuning and report empirical results on simulated and real data, highlighting improved power to detect small effects when a few large effects are present, with only minimal increase in false discoveries.

Significance. If the consistency and paradox-avoidance results hold under the stated conditions, the work supplies a principled bridge between discrete model-selection priors and continuous shrinkage priors. The nonparametric block structure and the MCMC sampler with limited tuning are practical strengths; the empirical demonstration of power gains without substantial false-discovery inflation is a concrete contribution to high-dimensional linear modeling.

major comments (2)

[§3.2, Theorem 3] §3.2, Theorem 3 (consistency for model selection): the proof sketch invokes the block-g construction to control the marginal likelihood ratio, but the argument appears to condition on the realized partition; it is unclear whether the result remains valid when the posterior on the number of blocks is allowed to grow with n, which is the generic behavior of the DP mixture.
[§3.3, Proposition 1] §3.3, Proposition 1 (avoidance of conditional Lindley paradox): the derivation shows that the posterior odds remain bounded away from zero when a block contains both large and small signals, yet the bound depends on the fixed value of the DP concentration parameter; the paper does not state whether the result continues to hold for data-driven choices of this hyperparameter.

minor comments (3)

[§2.1] §2.1: the definition of the block-g prior matrix R_b should explicitly display how the within-block correlation matrix is formed from the design submatrix X_b.
[Table 2] Table 2: the reported posterior inclusion probabilities for the small-effect variables lack standard errors or credible intervals, making it difficult to assess variability across replications.
[§4] The MCMC description in §4 states that only minimal tuning is required, but the supplementary material does not report effective sample sizes or mixing diagnostics for the block-allocation variables.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. The comments highlight important points regarding the scope of our consistency and paradox-avoidance results. We address each major comment below and indicate the revisions that will be incorporated.

read point-by-point responses

Referee: [§3.2, Theorem 3] §3.2, Theorem 3 (consistency for model selection): the proof sketch invokes the block-g construction to control the marginal likelihood ratio, but the argument appears to condition on the realized partition; it is unclear whether the result remains valid when the posterior on the number of blocks is allowed to grow with n, which is the generic behavior of the DP mixture.

Authors: We agree that the current proof of Theorem 3 proceeds conditionally on a fixed partition. To establish unconditional consistency, the argument must also control the posterior mass on partitions whose number of blocks grows too rapidly with n. We will revise §3.2 to include an additional lemma showing that the DP prior (with fixed concentration) places vanishing posterior probability on partitions with more than O(log n) blocks under the stated conditions, thereby extending the marginal-likelihood ratio bound to the unconditional case. This revision will be made. revision: yes
Referee: [§3.3, Proposition 1] §3.3, Proposition 1 (avoidance of conditional Lindley paradox): the derivation shows that the posterior odds remain bounded away from zero when a block contains both large and small signals, yet the bound depends on the fixed value of the DP concentration parameter; the paper does not state whether the result continues to hold for data-driven choices of this hyperparameter.

Authors: The derivation of Proposition 1 indeed treats the DP concentration parameter α as fixed. The lower bound on the posterior odds is monotone in α, so the qualitative avoidance of the conditional Lindley paradox continues to hold for any fixed α in a compact interval away from zero and infinity. We will add an explicit remark after the proposition stating this assumption and noting that fully data-driven selection of α lies outside the present scope. No further technical change is required. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper introduces Dirichlet process mixtures of block g priors as a nonparametric extension of existing g-prior mixtures, with consistency claims and avoidance of the conditional Lindley paradox positioned as theoretical properties of the construction. No load-bearing steps reduce by definition, fitted input, or self-citation chain to the inputs themselves; the derivation relies on external arguments for consistency rather than internal tautologies or renamed empirical patterns. The central claims remain independent of any self-referential fitting or uniqueness imported solely from the authors' prior work.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 1 invented entities

The central contribution is a new prior construction whose properties are asserted but not derived in the abstract; the ledger therefore records the minimal set of modeling choices needed to state the claim.

free parameters (1)

Dirichlet process concentration parameter
Hyperparameter controlling the number of blocks; its value or prior is required for the mixture but not specified in the abstract.

axioms (2)

domain assumption The block g prior correctly encodes the correlation structure of the design matrix
Invoked when extending the classical g-prior to blocks.
standard math Standard consistency results for Dirichlet process mixtures carry over to the block-g setting
Required for the consistency claims.

invented entities (1)

block g prior no independent evidence
purpose: To permit differential shrinkage across data-selected groups of coefficients
New modeling object introduced by the paper.

pith-pipeline@v0.9.0 · 5701 in / 1229 out tokens · 51477 ms · 2026-05-23T18:19:00.162528+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

57 extracted references · 57 canonical work pages · 1 internal anchor

[1]

, " * write output.state after.block = add.period write newline

ENTRY address author booktitle chapter doi edition editor eid howpublished institution journal key month note number organization pages publisher school series title type volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.all := #1 'mid.sentence...

work page
[2]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

work page
[3]

Andrade, J. A. A. & O'Hagan, A. (2011). Bayesian robustness modelling of location and scale parameters. Scandinavian Journal of Statistics 38, 691--711

work page 2011
[4]

Antoniak, C. E. (1974). Mixtures of dirichlet processes with applications to bayesian nonparametric problems. The annals of statistics pp. 1152--1174

work page 1974
[5]

On the Beta Prime Prior for Scale Parameters in High-Dimensional Bayesian Regression Models

Bai, R. & Ghosh, M. (2018). On the beta prime prior for scale parameters in high-dimensional B ayesian regression models. arXiv preprint arXiv:1807.06539

work page internal anchor Pith review Pith/arXiv arXiv 2018
[6]

Bayarri, M. J. , Berger, J. O. , Forte, A. , Garc \' a-Donato, G. et al. (2012). Criteria for B ayesian model choice with application to variable selection. The Annals of Statistics 40, 1550--1577

work page 2012
[7]

Berger, J. O. , Bernardo, J. M. & Sun, D. (2009). The formal definition of reference priors. Annals of Statistics 37, 905--938

work page 2009
[8]

Berger, J. O. & Pericchi, L. R. (1996). The intrinsic B ayes factor for linear models. In Bayesian Statistics 5, Eds. A. P. D. J. M. Bernardo, J. O. Berger & A. F. M. Smith, pp. 25--44. Oxford Univ. Press

work page 1996
[9]

Berger, J. O. , Pericchi, L. R. & Varshavsky, J. A. (1998). Bayes factors and marginal distributions in invariant situations. Sankhy \=a : The Indian Journal of Statistics, Series A pp. 307--321

work page 1998
[10]

Bertoin, J. (2006). Random fragmentation and coagulation processes, volume 102. Cambridge University Press

work page 2006
[11]

, Datta, J

Bhadra, A. , Datta, J. , Polson, N. G. , Willard, B. et al. (2017). The horseshoe+ estimator of ultra-sparse signals. Bayesian Analysis 12, 1105--1131

work page 2017
[12]

, Pati, D

Bhattacharya, A. , Pati, D. , Pillai, N. S. & Dunson, D. B. (2015). Dirichlet-- L aplace priors for optimal shrinkage. Journal of the American Statistical Association 110, 1479--1490

work page 2015
[13]

& MacQueen, J

Blackwell, D. & MacQueen, J. B. (1973). Ferguson distributions via p \'o lya urn schemes. The annals of statistics 1, 353--355

work page 1973
[14]

, Datta, J

Boss, J. , Datta, J. , Wang, X. , Park, S. K. , Kang, J. & Mukherjee, B. (2023). Group inverse-gamma gamma shrinkage for sparse linear models with block-correlated regressors. Bayesian Analysis 1, 1--30

work page 2023
[15]

Bov \'e , D. S. & Held, L. (2011). Hyper- g priors for generalized linear models. Bayesian Analysis 6, 387--410

work page 2011
[16]

& Friedman, J

Breiman, L. & Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American statistical Association 80, 580--598

work page 1985
[17]

Brown, P. J. & Griffin, J. E. (2010). Inference with normal-gamma prior distributions in regression problems. Bayesian Analysis 5, 171--188

work page 2010
[18]

Carvalho, C. M. , Polson, N. G. & Scott, J. G. (2010). The horseshoe estimator for sparse signals. Biometrika 97, 465--480

work page 2010
[19]

Carvalho, C. M. & Scott, J. G. (2009). Objective B ayesian model selection in G aussian graphical models. Biometrika 96, 497--512

work page 2009
[20]

& Moreno, E

Casella, G. & Moreno, E. (2006). Objective bayesian variable selection. Journal of the American Statistical Association 101, 157--167

work page 2006
[21]

, Fouskakis, D

Consonni, G. , Fouskakis, D. , Liseo, B. , Ntzoufras, I. et al. (2018). Prior distributions for objective B ayesian analysis. Bayesian Analysis 13, 627--679

work page 2018
[22]

& Spezzaferri, F

De Santis, F. & Spezzaferri, F. (2001). Consistent fractional B ayes factor for nested normal linear models. Journal of statistical planning and inference 97, 305--321

work page 2001
[23]

, Azevedo, R

Denti, F. , Azevedo, R. , Lo, C. , Wheeler, D. G. , Gandhi, S. P. , Guindani, M. & Shahbaba, B. (2023). A horseshoe mixture model for bayesian screening with an application to light sheet fluorescence microscopy in brain imaging. The Annals of Applied Statistics 17, 2639--2658

work page 2023
[24]

Ferguson, T. S. (1973). A bayesian analysis of some nonparametric problems. The annals of statistics pp. 209--230

work page 1973
[25]

& Drton, M

Finegold, M. & Drton, M. (2014). Robust B ayesian graphical modeling using D irichlet t-distributions. Bayesian Analysis 9, 521--550

work page 2014
[26]

, Garcia-Donato, G

Forte, A. , Garcia-Donato, G. & Steel, M. F. J. (2018). Methods and tools for B ayesian variable selection and model averaging in normal linear regression. International Statistical Review 86, 237--258

work page 2018
[27]

, Ntzoufras, I

Fouskakis, D. , Ntzoufras, I. & Draper, D. (2015). Power-expected-posterior priors for variable selection in G aussian linear models. Bayesian Analysis 10, 75--107

work page 2015
[28]

& Raftery, A

Gneiting, T. & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association 102, 359--378

work page 2007
[29]

Gordy, M. B. (1998). A generalization of generalized B eta distributions. Technical report, Division of Research and Statistics, Division of Monetary Affairs, Federal Reserve

work page 1998
[30]

Green, P. J. (1995). Reversible jump M arkov chain M onte C arlo computation and B ayesian model determination. Biometrika 82, 711--732

work page 1995
[31]

& Brown, P

Griffin, J. & Brown, P. (2005). Alternative prior distributions for variable selection with very many more variables than observations. University of Kent Technical Report

work page 2005
[32]

Hans, C. (2009). Bayesian lasso regression. Biometrika 96, 835--845

work page 2009
[33]

Huang, J. , Ma, S. & Zhang, C.-H. (2008). Adaptive lasso for sparse high-dimensional regression models. Statistica Sinica pp. 1603--1618

work page 2008
[34]

Johnson, V. E. & Rossell, D. (2010). On the use of non-local prior densities in B ayesian hypothesis tests. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72, 143--170

work page 2010
[35]

Johnson, V. E. & Rossell, D. (2012). Bayesian model selection in high-dimensional settings. Journal of the American Statistical Association 107, 649--660

work page 2012
[36]

Kass, R. E. & Wasserman, L. (1995). A reference B ayesian test for nested hypotheses and its relationship to the S chwarz criterion. Journal of the American Statistical Association 90, 928--934

work page 1995
[37]

Lee, S. Y. , Pati, D. & Mallick, B. K. (2020). Continuous shrinkage prior revisited: a collapsing behavior and remedy. arXiv preprint arXiv:2007.02192

work page arXiv 2020
[38]

, Tran, M.-N

Leng, C. , Tran, M.-N. & Nott, D. (2014). Bayesian adaptive lasso. Annals of the Institute of Statistical Mathematics 66, 221--244

work page 2014
[39]

& Pati, D

Li, H. & Pati, D. (2017). Variable selection using shrinkage priors. Computational Statistics & Data Analysis 107, 107--119

work page 2017
[40]

& Clyde, M

Li, Y. & Clyde, M. A. (2018). Mixtures of g-priors in generalized linear models. Journal of the American Statistical Association 113, 1828--1845

work page 2018
[41]

, Paulo, R

Liang, F. , Paulo, R. , Molina, G. , Clyde, M. A. & Berger, J. O. (2008). Mixtures of g-priors for B ayesian variable selection. Journal of the American Statistical Association 103, 410--423

work page 2008
[42]

, Wichura, M

Liu, Y. , Wichura, M. J. & Drton, M. (2012). Rejection sampling for an extended gamma distribution. Unpublished manuscript

work page 2012
[43]

Neal, R. M. (2000). Markov chain sampling methods for dirichlet process mixture models. Journal of computational and graphical statistics 9, 249--265

work page 2000
[44]

O'Hagan, A. (1995). Fractional B ayes factors for model comparison. Journal of the Royal Statistical Society: Series B (Methodological) 57, 99--118

work page 1995
[45]

& Casella, G

Park, T. & Casella, G. (2008). The B ayesian lasso. Journal of the American Statistical Association 103, 681--686

work page 2008
[46]

Polson, N. G. & Scott, J. G. (2012). Local shrinkage rules, l \'e vy processes and regularized regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 74, 287--311

work page 2012
[47]

Polson, N. G. , Scott, J. G. & Windle, J. (2013). Bayesian inference for logistic models using P \'o lya-- G amma latent variables. Journal of the American Statistical Association 108, 1339--1349

work page 2013
[48]

& Raftery, A

Porwal, A. & Raftery, A. E. (2022). Effect of model space priors on statistical inference with model uncertainty. The New England Journal of Statistics in Data Science pp. 1--10

work page 2022
[49]

& Rodr \' guez, A

Porwal, A. & Rodr \' guez, A. (2023). Laplace power-expected-posterior priors for logistic regression. Bayesian Analysis 1, 1--24

work page 2023
[50]

Rodr \' guez, A. (2013). On the jeffreys prior for the multivariate ewens distribution. Statistics & Probability Letters 83, 1539--1546

work page 2013
[51]

Scott, J. G. & Berger, J. O. (2010). Bayes and empirical- B ayes multiplicity adjustment in the variable-selection problem. The Annals of Statistics pp. 2587--2619

work page 2010
[52]

Sethuraman, J. (1994). A constructive definition of dirichlet priors. Statistica sinica pp. 639--650

work page 1994
[53]

Som, A. (2014). Paradoxes and Priors in Bayesian Regression. Ph.D. thesis, The Ohio State University

work page 2014
[54]

, Hans, C

Som, A. , Hans, C. M. & MacEachern, S. N. (2016). A conditional L indley paradox in B ayesian linear models. Biometrika 103, 993--999

work page 2016
[55]

Tipping, M. E. (2001). Sparse B ayesian learning and the relevance vector machine. Journal of Machine Learning Research 1, 211--244

work page 2001
[56]

Zellner, A. (1986). On assessing prior distributions and B ayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, Eds. P. K. Goel & A. Zellner, pp. 233--243. Amsterdam: North-Holland/Elsevier

work page 1986
[57]

& Siow, A

Zellner, A. & Siow, A. (1980). Posterior odds ratios for selected regression hypotheses. Trabajos de Estad \'i stica y de Investigaci \'o w Operativa 31, 585--603

work page 1980

[1] [1]

, " * write output.state after.block = add.period write newline

ENTRY address author booktitle chapter doi edition editor eid howpublished institution journal key month note number organization pages publisher school series title type volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.all := #1 'mid.sentence...

work page

[2] [2]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

work page

[3] [3]

Andrade, J. A. A. & O'Hagan, A. (2011). Bayesian robustness modelling of location and scale parameters. Scandinavian Journal of Statistics 38, 691--711

work page 2011

[4] [4]

Antoniak, C. E. (1974). Mixtures of dirichlet processes with applications to bayesian nonparametric problems. The annals of statistics pp. 1152--1174

work page 1974

[5] [5]

On the Beta Prime Prior for Scale Parameters in High-Dimensional Bayesian Regression Models

Bai, R. & Ghosh, M. (2018). On the beta prime prior for scale parameters in high-dimensional B ayesian regression models. arXiv preprint arXiv:1807.06539

work page internal anchor Pith review Pith/arXiv arXiv 2018

[6] [6]

Bayarri, M. J. , Berger, J. O. , Forte, A. , Garc \' a-Donato, G. et al. (2012). Criteria for B ayesian model choice with application to variable selection. The Annals of Statistics 40, 1550--1577

work page 2012

[7] [7]

Berger, J. O. , Bernardo, J. M. & Sun, D. (2009). The formal definition of reference priors. Annals of Statistics 37, 905--938

work page 2009

[8] [8]

Berger, J. O. & Pericchi, L. R. (1996). The intrinsic B ayes factor for linear models. In Bayesian Statistics 5, Eds. A. P. D. J. M. Bernardo, J. O. Berger & A. F. M. Smith, pp. 25--44. Oxford Univ. Press

work page 1996

[9] [9]

Berger, J. O. , Pericchi, L. R. & Varshavsky, J. A. (1998). Bayes factors and marginal distributions in invariant situations. Sankhy \=a : The Indian Journal of Statistics, Series A pp. 307--321

work page 1998

[10] [10]

Bertoin, J. (2006). Random fragmentation and coagulation processes, volume 102. Cambridge University Press

work page 2006

[11] [11]

, Datta, J

Bhadra, A. , Datta, J. , Polson, N. G. , Willard, B. et al. (2017). The horseshoe+ estimator of ultra-sparse signals. Bayesian Analysis 12, 1105--1131

work page 2017

[12] [12]

, Pati, D

Bhattacharya, A. , Pati, D. , Pillai, N. S. & Dunson, D. B. (2015). Dirichlet-- L aplace priors for optimal shrinkage. Journal of the American Statistical Association 110, 1479--1490

work page 2015

[13] [13]

& MacQueen, J

Blackwell, D. & MacQueen, J. B. (1973). Ferguson distributions via p \'o lya urn schemes. The annals of statistics 1, 353--355

work page 1973

[14] [14]

, Datta, J

Boss, J. , Datta, J. , Wang, X. , Park, S. K. , Kang, J. & Mukherjee, B. (2023). Group inverse-gamma gamma shrinkage for sparse linear models with block-correlated regressors. Bayesian Analysis 1, 1--30

work page 2023

[15] [15]

Bov \'e , D. S. & Held, L. (2011). Hyper- g priors for generalized linear models. Bayesian Analysis 6, 387--410

work page 2011

[16] [16]

& Friedman, J

Breiman, L. & Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American statistical Association 80, 580--598

work page 1985

[17] [17]

Brown, P. J. & Griffin, J. E. (2010). Inference with normal-gamma prior distributions in regression problems. Bayesian Analysis 5, 171--188

work page 2010

[18] [18]

Carvalho, C. M. , Polson, N. G. & Scott, J. G. (2010). The horseshoe estimator for sparse signals. Biometrika 97, 465--480

work page 2010

[19] [19]

Carvalho, C. M. & Scott, J. G. (2009). Objective B ayesian model selection in G aussian graphical models. Biometrika 96, 497--512

work page 2009

[20] [20]

& Moreno, E

Casella, G. & Moreno, E. (2006). Objective bayesian variable selection. Journal of the American Statistical Association 101, 157--167

work page 2006

[21] [21]

, Fouskakis, D

Consonni, G. , Fouskakis, D. , Liseo, B. , Ntzoufras, I. et al. (2018). Prior distributions for objective B ayesian analysis. Bayesian Analysis 13, 627--679

work page 2018

[22] [22]

& Spezzaferri, F

De Santis, F. & Spezzaferri, F. (2001). Consistent fractional B ayes factor for nested normal linear models. Journal of statistical planning and inference 97, 305--321

work page 2001

[23] [23]

, Azevedo, R

Denti, F. , Azevedo, R. , Lo, C. , Wheeler, D. G. , Gandhi, S. P. , Guindani, M. & Shahbaba, B. (2023). A horseshoe mixture model for bayesian screening with an application to light sheet fluorescence microscopy in brain imaging. The Annals of Applied Statistics 17, 2639--2658

work page 2023

[24] [24]

Ferguson, T. S. (1973). A bayesian analysis of some nonparametric problems. The annals of statistics pp. 209--230

work page 1973

[25] [25]

& Drton, M

Finegold, M. & Drton, M. (2014). Robust B ayesian graphical modeling using D irichlet t-distributions. Bayesian Analysis 9, 521--550

work page 2014

[26] [26]

, Garcia-Donato, G

Forte, A. , Garcia-Donato, G. & Steel, M. F. J. (2018). Methods and tools for B ayesian variable selection and model averaging in normal linear regression. International Statistical Review 86, 237--258

work page 2018

[27] [27]

, Ntzoufras, I

Fouskakis, D. , Ntzoufras, I. & Draper, D. (2015). Power-expected-posterior priors for variable selection in G aussian linear models. Bayesian Analysis 10, 75--107

work page 2015

[28] [28]

& Raftery, A

Gneiting, T. & Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association 102, 359--378

work page 2007

[29] [29]

Gordy, M. B. (1998). A generalization of generalized B eta distributions. Technical report, Division of Research and Statistics, Division of Monetary Affairs, Federal Reserve

work page 1998

[30] [30]

Green, P. J. (1995). Reversible jump M arkov chain M onte C arlo computation and B ayesian model determination. Biometrika 82, 711--732

work page 1995

[31] [31]

& Brown, P

Griffin, J. & Brown, P. (2005). Alternative prior distributions for variable selection with very many more variables than observations. University of Kent Technical Report

work page 2005

[32] [32]

Hans, C. (2009). Bayesian lasso regression. Biometrika 96, 835--845

work page 2009

[33] [33]

Huang, J. , Ma, S. & Zhang, C.-H. (2008). Adaptive lasso for sparse high-dimensional regression models. Statistica Sinica pp. 1603--1618

work page 2008

[34] [34]

Johnson, V. E. & Rossell, D. (2010). On the use of non-local prior densities in B ayesian hypothesis tests. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72, 143--170

work page 2010

[35] [35]

Johnson, V. E. & Rossell, D. (2012). Bayesian model selection in high-dimensional settings. Journal of the American Statistical Association 107, 649--660

work page 2012

[36] [36]

Kass, R. E. & Wasserman, L. (1995). A reference B ayesian test for nested hypotheses and its relationship to the S chwarz criterion. Journal of the American Statistical Association 90, 928--934

work page 1995

[37] [37]

Lee, S. Y. , Pati, D. & Mallick, B. K. (2020). Continuous shrinkage prior revisited: a collapsing behavior and remedy. arXiv preprint arXiv:2007.02192

work page arXiv 2020

[38] [38]

, Tran, M.-N

Leng, C. , Tran, M.-N. & Nott, D. (2014). Bayesian adaptive lasso. Annals of the Institute of Statistical Mathematics 66, 221--244

work page 2014

[39] [39]

& Pati, D

Li, H. & Pati, D. (2017). Variable selection using shrinkage priors. Computational Statistics & Data Analysis 107, 107--119

work page 2017

[40] [40]

& Clyde, M

Li, Y. & Clyde, M. A. (2018). Mixtures of g-priors in generalized linear models. Journal of the American Statistical Association 113, 1828--1845

work page 2018

[41] [41]

, Paulo, R

Liang, F. , Paulo, R. , Molina, G. , Clyde, M. A. & Berger, J. O. (2008). Mixtures of g-priors for B ayesian variable selection. Journal of the American Statistical Association 103, 410--423

work page 2008

[42] [42]

, Wichura, M

Liu, Y. , Wichura, M. J. & Drton, M. (2012). Rejection sampling for an extended gamma distribution. Unpublished manuscript

work page 2012

[43] [43]

Neal, R. M. (2000). Markov chain sampling methods for dirichlet process mixture models. Journal of computational and graphical statistics 9, 249--265

work page 2000

[44] [44]

O'Hagan, A. (1995). Fractional B ayes factors for model comparison. Journal of the Royal Statistical Society: Series B (Methodological) 57, 99--118

work page 1995

[45] [45]

& Casella, G

Park, T. & Casella, G. (2008). The B ayesian lasso. Journal of the American Statistical Association 103, 681--686

work page 2008

[46] [46]

Polson, N. G. & Scott, J. G. (2012). Local shrinkage rules, l \'e vy processes and regularized regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 74, 287--311

work page 2012

[47] [47]

Polson, N. G. , Scott, J. G. & Windle, J. (2013). Bayesian inference for logistic models using P \'o lya-- G amma latent variables. Journal of the American Statistical Association 108, 1339--1349

work page 2013

[48] [48]

& Raftery, A

Porwal, A. & Raftery, A. E. (2022). Effect of model space priors on statistical inference with model uncertainty. The New England Journal of Statistics in Data Science pp. 1--10

work page 2022

[49] [49]

& Rodr \' guez, A

Porwal, A. & Rodr \' guez, A. (2023). Laplace power-expected-posterior priors for logistic regression. Bayesian Analysis 1, 1--24

work page 2023

[50] [50]

Rodr \' guez, A. (2013). On the jeffreys prior for the multivariate ewens distribution. Statistics & Probability Letters 83, 1539--1546

work page 2013

[51] [51]

Scott, J. G. & Berger, J. O. (2010). Bayes and empirical- B ayes multiplicity adjustment in the variable-selection problem. The Annals of Statistics pp. 2587--2619

work page 2010

[52] [52]

Sethuraman, J. (1994). A constructive definition of dirichlet priors. Statistica sinica pp. 639--650

work page 1994

[53] [53]

Som, A. (2014). Paradoxes and Priors in Bayesian Regression. Ph.D. thesis, The Ohio State University

work page 2014

[54] [54]

, Hans, C

Som, A. , Hans, C. M. & MacEachern, S. N. (2016). A conditional L indley paradox in B ayesian linear models. Biometrika 103, 993--999

work page 2016

[55] [55]

Tipping, M. E. (2001). Sparse B ayesian learning and the relevance vector machine. Journal of Machine Learning Research 1, 211--244

work page 2001

[56] [56]

Zellner, A. (1986). On assessing prior distributions and B ayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti, Eds. P. K. Goel & A. Zellner, pp. 233--243. Amsterdam: North-Holland/Elsevier

work page 1986

[57] [57]

& Siow, A

Zellner, A. & Siow, A. (1980). Posterior odds ratios for selected regression hypotheses. Trabajos de Estad \'i stica y de Investigaci \'o w Operativa 31, 585--603

work page 1980