pith. sign in

arxiv: 2605.03060 · v1 · submitted 2026-05-04 · 📊 stat.ME

Robust confidence intervals for generalized linear models

Pith reviewed 2026-05-08 17:57 UTC · model grok-4.3

classification 📊 stat.ME
keywords generalized linear modelsconfidence intervalsrobust inferencesign-flippingvariance misspecificationscore testsRNA-seq
0
0 comments X

The pith

Sign-flipping individual score contributions yields asymptotically valid confidence intervals for generalized linear models under variance misspecification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Generalized linear models are widely used for biomedical data such as RNA-seq counts, yet standard confidence intervals for their parameters often undercover when variances deviate from the assumed mean-variance link through overdispersion or heteroskedasticity. The paper develops intervals by inverting tests formed through sign-flipping of each observation's contribution to the score function, with bounds found by bisection. It proves these intervals retain correct asymptotic coverage for any variance structure. Simulations confirm reliable performance where Wald intervals fail, and the method is demonstrated on differential expression analysis of cancer RNA-sequencing data exhibiting pervasive heterogeneous variability.

Core claim

Inverting hypothesis tests obtained by sign-flipping the individual score contributions produces confidence intervals whose asymptotic coverage remains valid under general variance misspecification in generalized linear models; the resulting intervals achieve reliable coverage in simulations and outperform standard Wald-type intervals when the mean-variance relationship is violated.

What carries the argument

Inversion via bisection of sign-flipping tests applied to individual score contributions from the GLM estimating equations.

If this is right

  • The intervals maintain nominal coverage rates in finite samples when variances are misspecified in ways common to count data.
  • They deliver shorter intervals or higher power than Wald intervals in the same misspecified settings.
  • The procedure applies directly to high-dimensional differential expression analyses without requiring parametric variance modeling.
  • Bisection search reliably locates the interval endpoints once the sign-flip test statistic is computable.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same sign-flip inversion could be applied to score-based tests in other estimating-equation settings such as generalized estimating equations for clustered data.
  • In practice the method may reduce the need for separate overdispersion parameters or quasi-likelihood adjustments when the goal is interval estimation rather than point estimation.
  • If the score contributions can be computed efficiently, the approach scales to the sample sizes typical of modern sequencing experiments without additional tuning parameters.

Load-bearing premise

Inverting sign-flipped versions of the individual score contributions produces tests whose acceptance regions, when collected, yield intervals with the claimed asymptotic coverage even under completely arbitrary heteroskedasticity or overdispersion.

What would settle it

A Monte Carlo experiment in which the empirical coverage of the proposed intervals falls materially below the nominal level across repeated samples drawn from a GLM with severe, observation-specific variance inflation would falsify the asymptotic validity result.

Figures

Figures reproduced from arXiv: 2605.03060 by Andrea Panarotto, Livio Finos, Riccardo De Santis.

Figure 1
Figure 1. Figure 1: Values of fp(β0) on an equispaced grid of values between βˆ obs − 1 and βˆ obs. The n observations have been generated from a logistic model. The figure shows in red the points with an associated p-value lower than α/2, and in black the points that should belong to the confidence set. With n = 50 (left) the p-value function is monotonic. With n = 20 (right), the function is non-monotonic and the bisection … view at source ↗
Figure 2
Figure 2. Figure 2: Coverage probabilities of the confidence intervals built from in view at source ↗
Figure 3
Figure 3. Figure 3: Median of the interval width of the confidence intervals built from view at source ↗
Figure 4
Figure 4. Figure 4: Distributions of the amplitudes of the confidence intervals accord view at source ↗
Figure 5
Figure 5. Figure 5: Amplitude comparison between flip and sandwich-based intervals view at source ↗
Figure 6
Figure 6. Figure 6: Comparison of the overlaps between the flip and sandwich-based view at source ↗
Figure 7
Figure 7. Figure 7: Overlap between the confidence intervals built with Poisson and view at source ↗
read the original abstract

Reliable uncertainty quantification is a central challenge in the analysis of modern biomedical data, where complex sources of variability often violate standard modeling assumptions. In generalized linear models (GLMs), confidence intervals for regression parameters provide such information, but they typically rely on correct specification of the mean-variance relationship. However, overdispersion, heteroskedasticity, and unobserved biological variability can lead to substantial undercoverage in practice. We propose a method for constructing confidence intervals that remains valid under variance misspecification. The approach is based on the inversion of hypothesis tests obtained by sign-flipping individual score contributions and uses a bisection algorithm to determine the interval bounds. The resulting intervals inherit robustness properties from the underlying tests, and we establish their asymptotic validity under general variance misspecification. Through simulation studies, we show that the proposed method achieves reliable coverage and outperforms standard Wald-type intervals when model assumptions are violated. We illustrate the approach in a differential expression analysis of RNA-sequencing data from a cancer study, where heterogeneous variability is pervasive and parametric methods can yield inconsistent inference. The proposed framework provides a practical and robust alternative to conventional quasi-likelihood or Wald-based methods for interval estimation in GLMs, particularly suited to high-throughput biomedical applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes constructing robust confidence intervals for regression parameters in generalized linear models by inverting hypothesis tests formed via sign-flipping of individual score contributions, with bounds located by bisection. It claims that the resulting intervals are asymptotically valid under general variance misspecification (heteroskedasticity or overdispersion) while the mean model remains correct, supports this with simulation studies showing reliable coverage that outperforms Wald-type intervals, and illustrates the method on RNA-sequencing differential expression data from a cancer study.

Significance. If the asymptotic validity result holds, the approach supplies a practical, randomization-based alternative to quasi-likelihood or sandwich estimators for interval estimation in GLMs when variance assumptions fail, which is common in high-throughput biomedical data. The sign-flipping construction on scores is a clean way to achieve robustness without estimating extra variance parameters, and the simulation evidence plus real-data example add to its applied value.

major comments (2)
  1. [Theoretical results / asymptotic validity section] The central claim of asymptotic validity under arbitrary variance misspecification rests on the sign-flipping tests having correct asymptotic size and the inversion preserving coverage. The manuscript asserts this result but does not list the precise regularity conditions (e.g., Lindeberg-type conditions on the score contributions or moment bounds ensuring the sign-flipped and original scores share the same limiting normal distribution). Please add an explicit statement of these conditions in the theoretical section and a brief proof sketch showing that the test inversion yields intervals with the claimed coverage.
  2. [Simulation studies] In the simulation studies, the data-generating processes used to evaluate coverage under misspecification should be described in sufficient detail (including the exact forms of heteroskedasticity or overdispersion and the range of sample sizes) so that readers can assess whether the reported superior performance is robust to the simulation design choices rather than specific to the chosen scenarios.
minor comments (2)
  1. [Method description] The bisection algorithm for locating interval bounds is mentioned but its convergence tolerance and implementation details (e.g., handling of discrete score flips) are not specified; a short algorithmic description or pseudocode would improve reproducibility.
  2. [Method description] Notation for the sign-flipped score vector and the resulting test statistic should be introduced once and used consistently; currently the transition from the score contributions to the randomization distribution is somewhat terse.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We appreciate the positive assessment of the method's potential value. Below we address each major comment in turn. We will revise the manuscript accordingly to incorporate the suggested improvements.

read point-by-point responses
  1. Referee: [Theoretical results / asymptotic validity section] The central claim of asymptotic validity under arbitrary variance misspecification rests on the sign-flipping tests having correct asymptotic size and the inversion preserving coverage. The manuscript asserts this result but does not list the precise regularity conditions (e.g., Lindeberg-type conditions on the score contributions or moment bounds ensuring the sign-flipped and original scores share the same limiting normal distribution). Please add an explicit statement of these conditions in the theoretical section and a brief proof sketch showing that the test inversion yields intervals with the claimed coverage.

    Authors: We agree that the theoretical section would benefit from a more explicit statement of the regularity conditions and a proof sketch. In the revised manuscript, we will add a dedicated subsection outlining the key assumptions, including Lindeberg-type conditions on the score contributions and moment bounds that ensure the sign-flipped scores converge to the same limiting normal distribution as the original scores. We will also provide a brief proof sketch demonstrating that the inversion of the asymptotically valid tests yields intervals with the desired asymptotic coverage under variance misspecification. This will strengthen the presentation without altering the core results. revision: yes

  2. Referee: [Simulation studies] In the simulation studies, the data-generating processes used to evaluate coverage under misspecification should be described in sufficient detail (including the exact forms of heteroskedasticity or overdispersion and the range of sample sizes) so that readers can assess whether the reported superior performance is robust to the simulation design choices rather than specific to the chosen scenarios.

    Authors: We acknowledge that additional details on the simulation design would enhance reproducibility and allow readers to better evaluate the robustness of our findings. In the revised version, we will expand the description of the data-generating processes, specifying the exact forms of heteroskedasticity and overdispersion used (e.g., variance functions and parameters), as well as the full range of sample sizes considered. We will also clarify how these choices relate to the real-data applications. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper derives robust confidence intervals for GLMs by inverting sign-flipping tests applied to individual score contributions, then establishes asymptotic validity under variance misspecification via standard Lindeberg CLT arguments on the score process and its randomized counterpart. No equation or claim reduces the interval bounds to a fitted parameter by construction, nor does the central result depend on a self-citation chain or imported uniqueness theorem. The construction is self-contained against external benchmarks (randomization tests and sandwich variance ideas) and does not rename known patterns or smuggle ansatzes. This is the normal honest outcome for a paper whose core contribution is a new application of existing randomization techniques.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based on abstract only; no free parameters, invented entities, or non-standard axioms are explicitly introduced. The approach relies on standard asymptotic theory for score tests.

axioms (1)
  • standard math Standard regularity conditions for asymptotic normality of score statistics in GLMs
    Invoked to establish asymptotic validity of the inverted tests under variance misspecification.

pith-pipeline@v0.9.0 · 5511 in / 1170 out tokens · 36450 ms · 2026-05-08T17:57:05.793468+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages

  1. [1]

    Journal of the American Statistical Association , volume =

    De Santis, Riccardo and Goeman, Jelle J and Hemerik, Jesse and Davenport, Samuel and Finos, Livio , title =. Journal of the American Statistical Association , volume =. 2025 , publisher =

  2. [2]

    2015 , publisher=

    Foundations of Linear and Generalized Linear Models , author=. 2015 , publisher=

  3. [3]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    Robust testing in generalized linear models by sign flipping score contributions , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2020 , publisher=

  4. [4]

    Test , volume=

    Exact testing with random permutations , author=. Test , volume=. 2018 , publisher=

  5. [5]

    Salvan, Alessandra and Sartori, Nicola and Pace, Luigi , year =. Modelli

  6. [6]

    1997 , publisher=

    Principles of statistical inference: from a Neo-Fisherian perspective , author=. 1997 , publisher=

  7. [7]

    1984 , number=

    Confidence intervals for discrete distributions , author=. 1984 , number=

  8. [8]

    Multivariate Permutation Tests: with Applications in Biostatistics , isbn =

    Pesarin, Fortunato , year =. Multivariate Permutation Tests: with Applications in Biostatistics , isbn =

  9. [9]

    and Kirk, Shanah and Lee, Y

    Erickson, Bradley J. and Kirk, Shanah and Lee, Y. and Bathe, Oliver and Kearns, Melissa and Gerdes, C. and Rieger-Christ, Kimberly and Lemmerman, John , year =. The

  10. [10]

    Współczesna Onkologia , author =

    Review. Współczesna Onkologia , author =. 2015 , pages =. doi:10.5114/wo.2014.47136 , urldate =

  11. [11]

    Biometrics , author =

    Confidence. Biometrics , author =. 1996 , pages =. doi:10.2307/2532852 , number =

  12. [12]

    Journal of the American Statistical Association , author =

    On Obtaining Permutation Distributions in Polynomial Time , volume =. Journal of the American Statistical Association , author =. 1983 , pages =. doi:10.1080/01621459.1983.10477990 , language =

  13. [13]

    Journal of the American Statistical Association , author =

    On. Journal of the American Statistical Association , author =. 1984 , pages =. doi:10.1080/01621459.1984.10477085 , language =

  14. [14]

    Journal of Computational and Graphical Statistics , author =

    Fast Conservative. Journal of Computational and Graphical Statistics , author =. 2025 , pages =. doi:10.1080/10618600.2025.2526416 , language =

  15. [15]

    , year =

    Van Der Vaart, Aad W. , year =. Asymptotic

  16. [16]

    Proceedings of the National Academy of Sciences , volume=

    Universal inference , author=. Proceedings of the National Academy of Sciences , volume=. 2020 , publisher=

  17. [17]

    Stat , volume=

    A note on universal inference , author=. Stat , volume=. 2022 , publisher=

  18. [18]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    The HulC: confidence regions from convex hulls , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2024 , publisher=

  19. [19]

    Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability , pages=

    Tests of separate families of hypotheses , author=. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability , pages=

  20. [20]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    Further results on tests of separate families of hypotheses , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 1962 , publisher=

  21. [21]

    , booktitle=

    Huber, Peter J. , booktitle=. 1967 , title =

  22. [22]

    Maximum Likelihood Estimation of Misspecified Models , volume =

    White, Halbert , journal =. Maximum Likelihood Estimation of Misspecified Models , volume =

  23. [23]

    Proceedings of the National Academy of Sciences , volume=

    On the role of parameterization in models with a misspecified nuisance component , author=. Proceedings of the National Academy of Sciences , volume=. 2024 , publisher=

  24. [24]

    Studies in Nonlinear Dynamics & Econometrics , author =

    Likelihood-ratio-based confidence intervals for multiple threshold parameters , volume =. Studies in Nonlinear Dynamics & Econometrics , author =. 2025 , pages =. doi:10.1515/snde-2023-0029 , language =

  25. [25]

    Ecology and Society , author =

    Green without envy: how social capital alleviates tensions from a. Ecology and Society , author =. 2018 , pages =. doi:10.5751/ES-10181-230410 , language =

  26. [26]

    Communications in Statistics - Theory and Methods , author =

    Bootstrapping some. Communications in Statistics - Theory and Methods , author =. 2023 , pages =. doi:10.1080/03610926.2021.1955389 , language =

  27. [27]

    Biometrics , author =

    Debiased lasso for generalized linear models with a diverging number of covariates , volume =. Biometrics , author =. 2023 , pages =. doi:10.1111/biom.13587 , language =

  28. [28]

    Stats , author =

    Improving confidence interval estimation in logistic regression with multicollinear predictors: a comparative study of shrinkage estimators and application to prostate cancer data , volume =. Stats , author =. 2026 , pages =. doi:10.3390/stats9010011 , language =

  29. [29]

    Psychometrika , author =

    Accurate confidence and bayesian interval estimation for non-centrality parameters and effect size indices , volume =. Psychometrika , author =. 2023 , pages =. doi:10.1007/s11336-022-09899-x , language =

  30. [30]

    BMC Genomics , author =

    Technical and biological variance structure in. BMC Genomics , author =. 2012 , pages =. doi:10.1186/1471-2164-13-304 , language =

  31. [31]

    and Sebastian, Jessalyn and Minin, Volodymyr M

    Němcová, Barbora and Goldstein, Isaac H. and Sebastian, Jessalyn and Minin, Volodymyr M. and Bracher, Johannes , copyright =. Unjustified. doi:10.1101/2025.07.31.25332479 , year =

  32. [32]

    two-stage summary statistics

    Robust inference for generalized linear mixed models: a “two-stage summary statistics” approach based on score sign flipping , author=. Psychometrika , volume=. 2025 , publisher=

  33. [33]

    Mathematical proceedings of the cambridge philosophical society , volume=

    Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation , author=. Mathematical proceedings of the cambridge philosophical society , volume=. 1948 , organization=

  34. [34]

    Communications in Statistics-Theory and Methods , volume=

    A comment on locally most powerful tests in the presence of nuisance parameters , author=. Communications in Statistics-Theory and Methods , volume=. 2002 , publisher=

  35. [35]

    Biometrics , author =

    Small-sample adjustments for wald-type tests using sandwich estimators , volume =. Biometrics , author =. 2001 , pages =. doi:10.1111/j.0006-341X.2001.01198.x , language =

  36. [36]

    Various Versatile Variances: An Object-Oriented Implementation of Clustered Covariances in

    Achim Zeileis and Susanne K\"oll and Nathaniel Graham , journal =. Various Versatile Variances: An Object-Oriented Implementation of Clustered Covariances in. 2020 , volume =

  37. [37]

    Journal of Statistical Software , year =

    Object-Oriented Computation of Sandwich Estimators , author =. Journal of Statistical Software , year =