Robust confidence intervals for generalized linear models
Pith reviewed 2026-05-08 17:57 UTC · model grok-4.3
The pith
Sign-flipping individual score contributions yields asymptotically valid confidence intervals for generalized linear models under variance misspecification.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Inverting hypothesis tests obtained by sign-flipping the individual score contributions produces confidence intervals whose asymptotic coverage remains valid under general variance misspecification in generalized linear models; the resulting intervals achieve reliable coverage in simulations and outperform standard Wald-type intervals when the mean-variance relationship is violated.
What carries the argument
Inversion via bisection of sign-flipping tests applied to individual score contributions from the GLM estimating equations.
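To make the mechanism concrete, here is a minimal Python sketch of a sign-flip score test. It rests on simplifying assumptions that are not taken from the paper: a Poisson log-link model with one covariate, an intercept as the only nuisance parameter, and the basic (unstandardized) statistic. The function names `score_contributions` and `sign_flip_pvalue` are hypothetical.

```python
import numpy as np

def score_contributions(beta0, x, y):
    """Per-observation effective score contributions for H0: beta = beta0.

    Assumed model (illustration only): y_i ~ Poisson(exp(gamma + beta * x_i)).
    """
    eta = beta0 * x
    # Closed-form MLE of the intercept under H0 (Poisson, log link, offset eta).
    gamma_hat = np.log(y.sum() / np.exp(eta).sum())
    mu = np.exp(gamma_hat + eta)
    # Project the covariate on the intercept with weights mu, so that
    # estimating the nuisance parameter is accounted for.
    x_bar_w = np.sum(mu * x) / np.sum(mu)
    return (x - x_bar_w) * (y - mu)

def sign_flip_pvalue(nu, n_flips=999, seed=0):
    """Two-sided Monte Carlo p-value from random sign flips of nu."""
    rng = np.random.default_rng(seed)
    flips = rng.choice([-1.0, 1.0], size=(n_flips, nu.size))
    t_obs = abs(nu.sum())
    t_flip = np.abs(flips @ nu)
    # The identity flip (all +1) is counted once, hence the "+ 1" terms.
    return (1 + np.sum(t_flip >= t_obs)) / (n_flips + 1)

# Usage on simulated data (all values are arbitrary choices for the sketch).
rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = rng.poisson(np.exp(0.2 + 0.5 * x))
p_value = sign_flip_pvalue(score_contributions(0.5, x, y))
```

Fixing the seed keeps the same flips for every candidate `beta0`, which makes the p-value a deterministic function of `beta0`. That is a design choice of this sketch, made so that a later search over `beta0` is well defined.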
If this is right
- In the reported simulations, the intervals keep coverage close to the nominal level when variances are misspecified in ways common to count data.
- They outperform Wald-type intervals in the same misspecified settings, where Wald intervals tend to undercover.
- The procedure applies directly to high-dimensional differential expression analyses without requiring parametric variance modeling.
- Bisection search reliably locates the interval endpoints once the sign-flip test statistic is computable.
Where Pith is reading between the lines
- The same sign-flip inversion could be applied to score-based tests in other estimating-equation settings such as generalized estimating equations for clustered data.
- In practice the method may reduce the need for separate overdispersion parameters or quasi-likelihood adjustments when the goal is interval estimation rather than point estimation.
- If the score contributions can be computed efficiently, the approach scales to the sample sizes typical of modern sequencing experiments without additional tuning parameters.
Load-bearing premise
Inverting sign-flipped versions of the individual score contributions produces tests whose acceptance regions, when collected, yield intervals with the claimed asymptotic coverage even under completely arbitrary heteroskedasticity or overdispersion.
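The step from a valid test to a valid interval can be written out in one line. The notation below is an assumption made for this note, not the paper's: p_n(β_0) denotes the sign-flip p-value for H_0: β = β_0 at sample size n, and C_{1-α} is the set of values that are not rejected.

```latex
\[
C_{1-\alpha} = \{\beta_0 : p_n(\beta_0) > \alpha\},
\qquad
P_\beta\bigl(\beta \in C_{1-\alpha}\bigr)
  = P_\beta\bigl(p_n(\beta) > \alpha\bigr)
  = 1 - P_\beta\bigl(p_n(\beta) \le \alpha\bigr).
\]
\[
\limsup_{n\to\infty} P_\beta\bigl(p_n(\beta) \le \alpha\bigr) \le \alpha
\quad\Longrightarrow\quad
\liminf_{n\to\infty} P_\beta\bigl(\beta \in C_{1-\alpha}\bigr) \ge 1 - \alpha .
\]
```

Coverage therefore depends only on the size of the test at the true parameter value. Whether C_{1-α} is a single interval whose endpoints bisection can locate is a separate question that this identity does not answer.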
What would settle it
A Monte Carlo experiment in which the empirical coverage of the proposed intervals falls materially below the nominal level across repeated samples drawn from a GLM with severe, observation-specific variance inflation would falsify the asymptotic validity result.
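A rough sketch of such an experiment follows. It is not the paper's simulation design: the Poisson-gamma mixture, the variance inflation factor `phi`, the sample size, and all function names are illustrative assumptions. It relies on the fact that, if the interval is defined as the set of non-rejected values, it covers the true β exactly when the test at the true β does not reject, so coverage can be estimated without running the bisection.

```python
import numpy as np

def sign_flip_pvalue_at(beta0, x, y, rng, n_flips=499):
    """Two-sided sign-flip p-value for H0: beta = beta0 (Poisson working model)."""
    eta = beta0 * x
    gamma_hat = np.log(y.sum() / np.exp(eta).sum())  # intercept MLE under H0
    mu = np.exp(gamma_hat + eta)
    x_bar_w = np.sum(mu * x) / np.sum(mu)
    nu = (x - x_bar_w) * (y - mu)                    # effective score contributions
    flips = rng.choice([-1.0, 1.0], size=(n_flips, nu.size))
    t_obs = abs(nu.sum())
    t_flip = np.abs(flips @ nu)
    return (1 + np.sum(t_flip >= t_obs)) / (n_flips + 1)

def estimate_coverage(n_rep=200, n=100, beta=0.3, alpha=0.05, seed=0):
    """Share of replications in which the true beta is not rejected."""
    rng = np.random.default_rng(seed)
    covered = 0
    for _ in range(n_rep):
        x = rng.normal(size=n)
        mu = np.exp(0.5 + beta * x)
        # Observation-specific overdispersion: the gamma factor has mean 1 and
        # variance phi_i, so E[y | x] = mu while Var[y | x] = mu + phi_i * mu**2.
        phi = 0.5 * (1.0 + x**2)
        g = rng.gamma(shape=1.0 / phi, scale=phi)
        y = rng.poisson(mu * g)
        if sign_flip_pvalue_at(beta, x, y, rng) > alpha:
            covered += 1
    return covered / n_rep

coverage = estimate_coverage()
print(f"empirical coverage: {coverage:.3f}")
```

An estimate well below 1 - alpha that persists as `n` grows would count against the validity claim. With 200 replications the Monte Carlo error is roughly one to two percentage points, so small deviations are not informative.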
Original abstract
Reliable uncertainty quantification is a central challenge in the analysis of modern biomedical data, where complex sources of variability often violate standard modeling assumptions. In generalized linear models (GLMs), confidence intervals for regression parameters provide such information, but they typically rely on correct specification of the mean-variance relationship. However, overdispersion, heteroskedasticity, and unobserved biological variability can lead to substantial undercoverage in practice. We propose a method for constructing confidence intervals that remains valid under variance misspecification. The approach is based on the inversion of hypothesis tests obtained by sign-flipping individual score contributions and uses a bisection algorithm to determine the interval bounds. The resulting intervals inherit robustness properties from the underlying tests, and we establish their asymptotic validity under general variance misspecification. Through simulation studies, we show that the proposed method achieves reliable coverage and outperforms standard Wald-type intervals when model assumptions are violated. We illustrate the approach in a differential expression analysis of RNA-sequencing data from a cancer study, where heterogeneous variability is pervasive and parametric methods can yield inconsistent inference. The proposed framework provides a practical and robust alternative to conventional quasi-likelihood or Wald-based methods for interval estimation in GLMs, particularly suited to high-throughput biomedical applications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes constructing robust confidence intervals for regression parameters in generalized linear models by inverting hypothesis tests formed via sign-flipping of individual score contributions, with bounds located by bisection. It claims that the resulting intervals are asymptotically valid under general variance misspecification (heteroskedasticity or overdispersion) while the mean model remains correct, supports this with simulation studies showing reliable coverage that outperforms Wald-type intervals, and illustrates the method on RNA-sequencing differential expression data from a cancer study.
Significance. If the asymptotic validity result holds, the approach supplies a practical, randomization-based alternative to quasi-likelihood or sandwich estimators for interval estimation in GLMs when variance assumptions fail, which is common in high-throughput biomedical data. The sign-flipping construction on scores is a clean way to achieve robustness without estimating extra variance parameters, and the simulation evidence plus real-data example add to its applied value.
major comments (2)
- [Theoretical results / asymptotic validity section] The central claim of asymptotic validity under arbitrary variance misspecification rests on the sign-flipping tests having correct asymptotic size and the inversion preserving coverage. The manuscript asserts this result but does not list the precise regularity conditions (e.g., Lindeberg-type conditions on the score contributions or moment bounds ensuring the sign-flipped and original scores share the same limiting normal distribution). Please add an explicit statement of these conditions in the theoretical section and a brief proof sketch showing that the test inversion yields intervals with the claimed coverage.
- [Simulation studies] In the simulation studies, the data-generating processes used to evaluate coverage under misspecification should be described in sufficient detail (including the exact forms of heteroskedasticity or overdispersion and the range of sample sizes) so that readers can assess whether the reported superior performance is robust to the simulation design choices rather than specific to the chosen scenarios.
minor comments (2)
- [Method description] The bisection algorithm for locating interval bounds is mentioned but its convergence tolerance and implementation details (e.g., handling of discrete score flips) are not specified; a short algorithmic description or pseudocode would improve reproducibility.
- [Method description] Notation for the sign-flipped score vector and the resulting test statistic should be introduced once and used consistently; currently the transition from the score contributions to the randomization distribution is somewhat terse.
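The bisection step raised in the first minor comment could look like the sketch below. It is a generic illustration, not the authors' algorithm: `bisect_bound` and `toy_pvalue` are hypothetical names, the tolerance and iteration cap are arbitrary defaults, and a normal-theory p-value stands in for the sign-flip p-value so that the result can be checked by hand.

```python
import math

def bisect_bound(pvalue, inside, outside, alpha=0.05, tol=1e-6, max_iter=200):
    """Locate one interval endpoint between an accepted and a rejected value.

    `inside` must satisfy pvalue(inside) > alpha (for example the point
    estimate); `outside` must satisfy pvalue(outside) <= alpha.
    """
    if pvalue(inside) <= alpha:
        raise ValueError("`inside` must not be rejected")
    if pvalue(outside) > alpha:
        raise ValueError("`outside` must be rejected; widen the bracket")
    for _ in range(max_iter):
        if abs(outside - inside) < tol:
            break
        mid = 0.5 * (inside + outside)
        if pvalue(mid) > alpha:
            inside = mid    # mid is accepted: the endpoint lies further out
        else:
            outside = mid   # mid is rejected: the endpoint lies further in
    # Returning the rejected side keeps the reported interval slightly wide.
    return outside

def toy_pvalue(beta0, estimate=1.0, se=0.5):
    """Two-sided normal p-value, used only as a stand-in test."""
    z = abs(estimate - beta0) / se
    return math.erfc(z / math.sqrt(2.0))

lower = bisect_bound(toy_pvalue, inside=1.0, outside=-9.0)
upper = bisect_bound(toy_pvalue, inside=1.0, outside=11.0)
```

With a finite number of sign flips the p-value is a step function of the tested value, so the search can only resolve the endpoint up to the width of a step. The tolerance `tol` stops the loop in either case.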
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We appreciate the positive assessment of the method's potential value. Below we address each major comment in turn. We will revise the manuscript accordingly to incorporate the suggested improvements.
- Referee: [Theoretical results / asymptotic validity section] The central claim of asymptotic validity under arbitrary variance misspecification rests on the sign-flipping tests having correct asymptotic size and the inversion preserving coverage. The manuscript asserts this result but does not list the precise regularity conditions (e.g., Lindeberg-type conditions on the score contributions or moment bounds ensuring the sign-flipped and original scores share the same limiting normal distribution). Please add an explicit statement of these conditions in the theoretical section and a brief proof sketch showing that the test inversion yields intervals with the claimed coverage.
  Authors: We agree that the theoretical section would benefit from a more explicit statement of the regularity conditions and a proof sketch. In the revised manuscript, we will add a dedicated subsection outlining the key assumptions, including Lindeberg-type conditions on the score contributions and moment bounds that ensure the sign-flipped scores converge to the same limiting normal distribution as the original scores. We will also provide a brief proof sketch demonstrating that the inversion of the asymptotically valid tests yields intervals with the desired asymptotic coverage under variance misspecification. This will strengthen the presentation without altering the core results.
  Revision: yes
- Referee: [Simulation studies] In the simulation studies, the data-generating processes used to evaluate coverage under misspecification should be described in sufficient detail (including the exact forms of heteroskedasticity or overdispersion and the range of sample sizes) so that readers can assess whether the reported superior performance is robust to the simulation design choices rather than specific to the chosen scenarios.
  Authors: We acknowledge that additional details on the simulation design would enhance reproducibility and allow readers to better evaluate the robustness of our findings. In the revised version, we will expand the description of the data-generating processes, specifying the exact forms of heteroskedasticity and overdispersion used (e.g., variance functions and parameters), as well as the full range of sample sizes considered. We will also clarify how these choices relate to the real-data applications.
  Revision: yes
Circularity Check
No significant circularity in derivation chain
The paper derives robust confidence intervals for GLMs by inverting sign-flipping tests applied to individual score contributions, then establishes asymptotic validity under variance misspecification via standard Lindeberg CLT arguments on the score process and its randomized counterpart. No equation or claim reduces the interval bounds to a fitted parameter by construction, nor does the central result depend on a self-citation chain or imported uniqueness theorem. The construction is self-contained against external benchmarks (randomization tests and sandwich variance ideas) and does not rename known patterns or smuggle ansatzes. This is the normal honest outcome for a paper whose core contribution is a new application of existing randomization techniques.
Axiom & Free-Parameter Ledger
axioms (1)
- [standard math] Standard regularity conditions for asymptotic normality of score statistics in GLMs
Lean theorems connected to this paper
- Cost.FunctionalEquation (washburn_uniqueness_aczel), tagged unclear: RS's cost-uniqueness chain is unrelated to score-test construction.
  Paper passage: S(F_g) = n^{-1/2} X^T W^{1/2} (I−H) F_g V^{-1/2} (Y − μ̂) ... S*(F_g) = S(F_g) / V(S(F_g))^{1/2}
- Foundation.BranchSelection (branch_selection), tagged unclear: α here is a significance level, not the bilinear-branch parameter; no shared structure.
  Paper passage: We fix a confidence level 1−α ... invert the one-sided tests H_0: β = β_0 ... a (1−α)-confidence interval for a parameter β is defined as the set of all parameter values β_0 that would not be rejected by a level-α test
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.