arxiv: 2511.23216 · v3 · submitted 2025-11-28 · 📊 stat.ME

Comparing Variable Selection and Model Averaging Methods for Logistic Regression

Pith reviewed 2026-05-17 04:01 UTC · model grok-4.3

classification 📊 stat.ME
keywords logistic regression · variable selection · model averaging · Bayesian model averaging · LASSO · separation · model uncertainty · simulation study

The pith

BMA with g-priors performs best for logistic regression without separation while LASSO is most stable when separation occurs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares 28 methods for variable selection and model averaging in logistic regression to address model uncertainty for binary outcomes. Simulations based on 11 empirical datasets test these methods in scenarios both with and without separation. BMA approaches using g-priors, especially g = max(n, p^2), where n is the sample size and p the number of predictors, perform strongest when separation is absent. Penalized methods such as the LASSO are the most stable when separation occurs, and BMA with the local empirical Bayes (EB-local) prior is competitive in both settings. This offers guidance for researchers dealing with uncertain predictors in logistic models.

Core claim

The authors conduct a preregistered simulation study comparing 28 established methods for variable selection and inference under model uncertainty in logistic regression. They find that Bayesian model averaging methods based on g-priors, particularly g = max(n, p^2), show the strongest overall performance when separation is absent. When separation occurs, penalized likelihood approaches, especially the LASSO, provide the most stable results, while BMA with the local empirical Bayes prior is competitive in both situations.
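As a rough illustration of what model averaging does, the sketch below weights each candidate model's predicted probability by an approximate posterior model probability. It uses the BIC approximation to posterior model probabilities, which is a stand-in chosen for brevity and is not the g-prior machinery the paper evaluates. The function names and the example BIC values and predictions are hypothetical.

```python
import math


def model_weights_from_bic(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumption: equal prior model probabilities and the standard
    approximation weight_k proportional to exp(-BIC_k / 2). This is NOT
    the g-prior approach compared in the paper.
    """
    best = min(bics)
    # Subtract the smallest BIC first so the exponentials stay in range.
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def averaged_probability(weights, probs):
    """Model-averaged predicted probability for one observation."""
    return sum(w * p for w, p in zip(weights, probs))


# Hypothetical BIC values and predicted probabilities for three models.
example_bics = [210.4, 212.1, 218.9]
example_probs = [0.62, 0.55, 0.71]

weights = model_weights_from_bic(example_bics)
print("weights:", [round(w, 3) for w in weights])
print("averaged probability:", round(averaged_probability(weights, example_probs), 3))
```

The averaged prediction is pulled toward the models with the most support, rather than committing to a single selected model.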

What carries the argument

Preregistered simulation study evaluating 28 variable selection and model averaging methods on logistic regression models derived from 11 empirical datasets, distinguishing cases with and without separation.

If this is right

  • BMA with g = max(n, p^2) is recommended when separation is absent.
  • LASSO should be used for stability in the presence of separation.
  • EB-local BMA works competitively across both conditions.
  • These results guide method choice for model uncertainty in logistic regression.
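The g = max(n, p^2) rule in the first bullet is simple arithmetic, and a minimal sketch may make it concrete. The helper name `benchmark_g` and the example (n, p) pairs are hypothetical; how g then enters the prior on the coefficients is left to the paper and is not reproduced here.

```python
def benchmark_g(n: int, p: int) -> int:
    """Return g = max(n, p^2) for sample size n and p candidate predictors."""
    if n <= 0 or p < 0:
        raise ValueError("n must be positive and p non-negative")
    return max(n, p * p)


# Hypothetical (n, p) pairs: g follows n when n >= p^2, otherwise p^2.
for n, p in [(500, 10), (50, 10), (100, 10)]:
    print(f"n={n}, p={p} -> g={benchmark_g(n, p)}")
```

With few observations relative to the number of predictors, the p^2 term takes over, so g never falls below p^2.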

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The performance patterns might generalize to other generalized linear models with uncertain predictors.
  • Further tests on high-dimensional datasets could confirm or refine the recommendations.
  • Hybrid methods blending BMA and penalization could be explored for robustness in mixed conditions.

Load-bearing premise

The 11 empirical datasets and simulation conditions adequately represent the range of real-world logistic regression problems with model uncertainty.

What would settle it

A new dataset or simulation where BMA with g = max(n, p^2) does not lead in performance without separation, or where LASSO is not most stable with separation, would challenge the main findings.

Original abstract

Model uncertainty is a central challenge in statistical models for binary outcomes such as logistic regression, arising when it is unclear which predictors should be included in the model. Many methods have been proposed to address this issue for logistic regression, but their relative performance under realistic conditions remains poorly understood. We therefore conducted a preregistered, simulation-based comparison of 28 established methods for variable selection and inference under model uncertainty, using 11 empirical datasets spanning a range of sample sizes and number of predictors, in cases both with and without separation. We found that Bayesian model averaging (BMA) methods based on g-priors, particularly g = max(n, p^2), show the strongest overall performance when separation is absent. When separation occurs, penalized likelihood approaches, especially the LASSO, provide the most stable results, while BMA with the local empirical Bayes (EB-local) prior is competitive in both situations. These findings offer practical guidance for applied researchers on how to effectively address model uncertainty in logistic regression in modern empirical and machine learning research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript reports a preregistered simulation study comparing 28 variable selection and model averaging methods for logistic regression under model uncertainty. It employs 11 empirical datasets spanning ranges of sample sizes and predictors, along with simulations both with and without separation. The central findings are that BMA methods using g-priors (particularly g = max(n, p^2)) exhibit the strongest overall performance when separation is absent, penalized likelihood approaches such as LASSO are most stable when separation occurs, and BMA with the local empirical Bayes (EB-local) prior remains competitive in both regimes.

Significance. If the chosen datasets and simulation conditions prove representative, the results would supply useful practical guidance for applied researchers and machine-learning practitioners confronting model uncertainty in logistic regression. The preregistered design and explicit separation/non-separation distinction constitute clear strengths that would enhance the credibility of the performance rankings.

major comments (1)
  1. [Abstract] The description of the 11 empirical datasets supplies no information on selection criteria, p/n ratios, or correlation structures covered. Likewise, the precise mechanism and severity of the separation induced in the simulations are unspecified. Because the reported superiority of g = max(n, p^2) BMA (absent separation) and LASSO (with separation) is load-bearing for the central claim, these omissions prevent assessment of whether the performance rankings generalize beyond the specific scenarios examined.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive comments and for recommending major revision. We agree that the abstract would benefit from greater specificity to help readers assess generalizability, and we have revised it accordingly while preserving brevity. Our point-by-point response to the major comment is provided below.

  1. Referee: [Abstract] The description of the 11 empirical datasets supplies no information on selection criteria, p/n ratios, or correlation structures covered. Likewise, the precise mechanism and severity of the separation induced in the simulations are unspecified. Because the reported superiority of g = max(n, p^2) BMA (absent separation) and LASSO (with separation) is load-bearing for the central claim, these omissions prevent assessment of whether the performance rankings generalize beyond the specific scenarios examined.

    Authors: We acknowledge the validity of this observation for the original abstract. In the revised version we have added a concise clause describing the empirical datasets as having been selected to cover a broad range of p/n ratios (approximately 0.05 to 2), varying correlation structures, and sample sizes from small to moderate, drawn from publicly available sources in biomedical and social-science domains. We have also specified that separation was induced via complete separation in a controlled subset of simulation replicates by scaling the true coefficient vector until the maximum likelihood estimator diverged. These additions are intended to give readers immediate context for the reported performance rankings; fuller methodological details, including exact selection criteria and separation severity metrics, remain in the Methods and Simulation Design sections.

    revision: yes
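To see why complete separation makes the maximum likelihood estimator diverge, the toy sketch below uses a single predictor and a logistic model with no intercept. It is an illustration only, not the simulation procedure described in the response; the data, the function names, and the one-dimensional separation check are all assumptions made for this example.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def log_likelihood(beta, x, y):
    """Log-likelihood of a no-intercept logistic model with one predictor."""
    total = 0.0
    for xi, yi in zip(x, y):
        z = beta * xi
        # log P(y=1) = log_sigmoid(z); log P(y=0) = log_sigmoid(-z)
        total += log_sigmoid(z) if yi == 1 else log_sigmoid(-z)
    return total


def is_completely_separated_1d(x, y):
    """True if a single threshold on x splits the two outcome classes."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")
    return max(x0) < min(x1) or max(x1) < min(x0)


# Toy data: every y = 0 case lies below every y = 1 case.
x = [-2.0, -1.0, 1.0, 2.0]
y = [0, 0, 1, 1]

print("completely separated:", is_completely_separated_1d(x, y))
lls = [log_likelihood(b, x, y) for b in (0.1, 1.0, 10.0, 100.0)]
for b, ll in zip((0.1, 1.0, 10.0, 100.0), lls):
    print(f"beta={b:>5}: log-likelihood={ll:.3e}")
```

On these data the log-likelihood keeps increasing toward zero as beta grows, so no finite maximizer exists. Penalized estimators such as the LASSO add a cost on large coefficients, which is why they remain finite in this situation.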

Circularity Check

0 steps flagged

Empirical simulation study with no derivation chain or self-referential reductions

The paper reports a preregistered comparison of 28 variable selection and model averaging methods for logistic regression, evaluated on 11 empirical datasets and targeted simulations (with and without separation). Its claims consist of performance rankings derived from these external benchmarks rather than any mathematical derivation, fitted parameters renamed as predictions, or load-bearing self-citations. No equations, ansatzes, uniqueness theorems, or prior-author results are invoked to support the central findings; the results are therefore self-contained against the independent data sources used. Concerns about dataset representativeness address generalizability, not circularity in any claimed derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

As an empirical comparison study the central claim rests on standard logistic regression assumptions and the representativeness of the chosen datasets and simulation design rather than new free parameters or invented entities.

axioms (1)
  • domain assumption Observations are independent and the logit of the outcome probability is a linear function of the predictors
    Standard modeling assumption invoked for all compared logistic regression methods.
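
Written out, the axiom corresponds to the usual logistic regression model. The notation is an assumption made here for illustration (binary outcome y_i, predictor vector x_i, intercept \beta_0, coefficient vector \beta, sample size n) and may differ from the paper's own symbols:

```latex
\operatorname{logit}\bigl(\Pr(y_i = 1 \mid x_i)\bigr)
  = \log \frac{\Pr(y_i = 1 \mid x_i)}{1 - \Pr(y_i = 1 \mid x_i)}
  = \beta_0 + x_i^{\top} \beta,
  \qquad i = 1, \dots, n,
```

with the y_i independent given the x_i. Model uncertainty then amounts to not knowing which entries of \beta are nonzero.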

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Response-free item difficulty modelling for multiple-choice items with fine-tuned transformers: Component-wise representation and multi-task learning

    cs.CL 2026-05 conditional novelty 5.0

    Fine-tuned transformers with multi-task learning recover substantial wording-derived signal for item difficulty at small sample sizes typical in applied testing.