Zero-Inflated Logistic Regression Models with Shared Design: Identifiability, Existence of Estimates, and a Relabeling Rule

Daisuke Yoneoka; Shinto Eguchi; Yui Tomo

arxiv: 2604.20322 · v1 · submitted 2026-04-22 · 📊 stat.ME

Zero-Inflated Logistic Regression Models with Shared Design: Identifiability, Existence of Estimates, and a Relabeling Rule

Yui Tomo , Shinto Eguchi , Daisuke Yoneoka This is my paper

Pith reviewed 2026-05-10 00:10 UTC · model grok-4.3

classification 📊 stat.ME

keywords zero-inflated logistic regressionshared design matrixidentifiabilitymaximum likelihood estimationrelabeling ruleexcess zerosmixture modelsbinary responses

0 comments

The pith

Zero-inflated logistic regression with shared design is identifiable up to exchange symmetry of its two component parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves that zero-inflated logistic regression models using the same design matrix for both the binary response regression and the latent zero-inflation indicator are not fully identifiable in the usual sense. Instead, the parameters become identifiable once one accounts for the fact that the two components can be swapped without changing the observed data distribution. This resolves a known obstacle in applied work where no distinguishing covariates are available to separate the components. The authors further show that ignoring zero-inflation entirely can reverse the sign of the pseudo-true regression coefficient and supply conditions guaranteeing existence of the maximum likelihood estimate. They also give a simple relabeling rule that selects one ordered parameter pair from the bimodal posterior obtained by MCMC sampling.

Core claim

Under the shared-design setting the zero-inflated logistic regression model is identifiable up to exchange symmetry of the parameters for the two components, and the expected log-likelihood has a unique maximizer on the resulting quotient space. Sufficient conditions are established for existence of the maximum likelihood estimate. The posterior bimodality is examined with a Pólya-Gamma Gibbs sampler augmented by replica exchange, and a relabeling rule is proposed to select a single ordered parameter pair.

What carries the argument

The quotient space formed by identifying pairs of regression parameters that differ only by exchange of the two mixture components.

If this is right

Ignoring the zero-inflation mechanism produces a sign reversal in the pseudo-true value of the regression coefficient.
The maximum likelihood estimate exists once the stated sufficient conditions on the design matrix and response probabilities hold.
The relabeling rule recovers a unique ordered parameter estimate from the bimodal posterior produced by the Pólya-Gamma sampler.
The procedure is shown to work in simulation studies and on self-reported diabetes data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

In routine analysis the relabeling step should be applied automatically so that reported coefficients are comparable across studies.
The symmetry argument may extend directly to other zero-inflated generalized linear models that share the same covariate matrix.
When even one covariate is allowed to differ between the two components, the model regains ordinary pointwise identifiability.
The sign-flip result implies that naive logistic regression on excess-zero data can systematically mis-state the direction of an effect.

Load-bearing premise

The latent mixture correctly captures the zero-inflation mechanism and the shared design matrix supplies no information that distinguishes the two components.

What would settle it

A concrete dataset in which the likelihood surface contains modes whose parameter values are not interchangeable by swapping the two components would falsify the claimed identifiability up to symmetry.

Figures

Figures reproduced from arXiv: 2604.20322 by Daisuke Yoneoka, Shinto Eguchi, Yui Tomo.

**Figure 1.** Figure 1: PCA plots of posterior samples after k-means clustering (k = 2). Each panel corresponds to a different covariate scenario: (a) Scenario 1: all 5 covariates were drawn from independent standard normal distributions; (b) Scenario 2: all 5 covariates were drawn from independent Bernoulli(0.5) distributions; (c) Scenario 3: the first two non-intercept covariates were drawn from independent standard normal dis… view at source ↗

**Figure 2.** Figure 2: Boxplots of parameter bias (estimate minus true value) for [PITH_FULL_IMAGE:figures/full_fig_p011_2.png] view at source ↗

**Figure 3.** Figure 3: Boxplots of parameter bias (estimate minus true value) for [PITH_FULL_IMAGE:figures/full_fig_p012_3.png] view at source ↗

**Figure 4.** Figure 4: Trace plots of each parameter (Scenario 1). [PITH_FULL_IMAGE:figures/full_fig_p028_4.png] view at source ↗

**Figure 5.** Figure 5: The histograms of the posterior distributions for each parameter (Scenario 1). [PITH_FULL_IMAGE:figures/full_fig_p029_5.png] view at source ↗

**Figure 6.** Figure 6: Trace plots of each parameter (Scenario 2). [PITH_FULL_IMAGE:figures/full_fig_p030_6.png] view at source ↗

**Figure 7.** Figure 7: The histograms of the posterior distributions for each parameter (Scenario 2). [PITH_FULL_IMAGE:figures/full_fig_p031_7.png] view at source ↗

**Figure 8.** Figure 8: Trace plots of each parameter (Scenario 3). [PITH_FULL_IMAGE:figures/full_fig_p032_8.png] view at source ↗

**Figure 9.** Figure 9: The histograms of the posterior distributions for each parameter (Scenario 3). [PITH_FULL_IMAGE:figures/full_fig_p033_9.png] view at source ↗

read the original abstract

The zero-inflated logistic regression model accommodates binary responses with excess zeros, which often arise from a latent mixture of susceptible and insusceptible subpopulations or asymmetric misclassification of the response. The model has two components: regression for the binary response and a latent binary indicator for the zero-inflation state. In applied settings, it is common to use the same design matrix for both components if there is no prior knowledge. However, this shared-design specification lacks guaranteed identifiability of the regression parameters, as established in prior works. This paper investigates the theoretical properties of the zero-inflated logistic regression model under the shared-design setting and computational methods for applications. First, to motivate the use of the zero-inflated model, we prove that ignoring the zero-inflation mechanism can lead to a sign flip in the pseudo-true coefficient value relative to the true value. We then establish sufficient conditions for the existence of the maximum likelihood estimate. As a main result, we establish that the model under the shared-design setting is identifiable up to exchange symmetry of the parameters for two components and that the expected log-likelihood has a unique maximizer on the resulting quotient space. The posterior bimodality is examined using a P\'olya-Gamma Gibbs sampler with replica exchange. Finally, we propose a simple relabeling rule to select a single ordered parameter pair, and evaluate its performance through simulation studies and an application to self-reported diabetes data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives identifiability up to label swap plus a concrete relabeling fix for shared-design zero-inflated logistic regression, along with existence conditions and a sign-flip warning.

read the letter

The central contribution is the proof that the shared-design zero-inflated logistic model is identifiable only up to exchange of the two component parameter vectors, together with a simple relabeling rule that resolves the resulting posterior bimodality. They also show that ordinary logistic regression can produce a sign reversal on the coefficients when zero-inflation is ignored, and they supply sufficient conditions on the design matrix for the MLE to exist. The expected log-likelihood is shown to have a unique maximizer on the quotient space after accounting for the symmetry. These results are checked with a Pólya-Gamma Gibbs sampler plus replica exchange and then tested in simulations and on self-reported diabetes data. The relabeling step itself is straightforward and appears to work reliably in the reported examples. The conditions on the design matrix are explicit enough to be usable, though they may rule out some common applied settings with collinear or low-variation predictors. The sign-reversal result is a useful reminder rather than a deep surprise. The overall argument stays at the population level and does not rely on circular or self-referential constructions. This work is aimed at statisticians who routinely fit zero-inflated models with shared covariates and who care about stable inference rather than just point estimates. Readers working on identifiability questions in latent-variable logistic models will find the quotient-space treatment and the relabeling procedure directly useful. The paper is focused and technically grounded enough to merit peer review; the theoretical claims address a documented gap and the practical fix is cheap to implement.

Referee Report

0 major / 4 minor

Summary. The paper studies zero-inflated logistic regression models with a shared design matrix for the response and zero-inflation components. It demonstrates that ignoring zero-inflation can lead to sign reversal in the pseudo-true parameters. Sufficient conditions are derived for the existence of the MLE. The main theoretical contribution is proving identifiability up to exchange symmetry of the two component parameters and the uniqueness of the maximizer of the expected log-likelihood on the quotient space. A relabeling rule is introduced to handle label switching, with validation through Pólya-Gamma Gibbs sampling with replica exchange, simulation studies, and an application to self-reported diabetes data.

Significance. This manuscript addresses a practically relevant issue in statistical modeling by providing theoretical guarantees for identifiability and estimation in zero-inflated logistic regression under the shared-design specification, which is frequently employed when covariate information does not distinguish the components. The proofs for sign reversal upon misspecification, MLE existence, and the unique maximizer on the quotient space, combined with the proposed relabeling rule, offer both foundational insights and practical tools. The analytical validation of the relabeling rule and its empirical performance in simulations and real data enhance the paper's utility for researchers working with mixture models and excess zero data. These contributions are likely to influence both theoretical developments and applied work in the field.

minor comments (4)

§2: The derivation of the sign reversal in the pseudo-true coefficient could benefit from an explicit statement of the conditions under which the flip occurs, to make the motivational result more precise.
§4: In the statement of the identifiability result, the definition of the quotient space under exchange symmetry should be accompanied by a brief remark on how the metric or distance is defined to ensure uniqueness.
Simulation studies: The performance metrics for the relabeling rule in the simulations (e.g., bias, coverage) should be tabulated for different sample sizes to allow clearer assessment of finite-sample behavior.
Figure 2: The plot illustrating posterior bimodality would be improved by adding annotations that label the two modes corresponding to the exchange-symmetric parameter pairs.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our manuscript, the accurate summary of its contributions on identifiability up to exchange symmetry, uniqueness of the maximizer on the quotient space, and the relabeling rule, and the recommendation for minor revision. No specific major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper derives identifiability up to exchange symmetry, existence of the MLE, and a relabeling rule as theoretical properties of the zero-inflated logistic likelihood under shared design. These rest on explicit sufficient conditions on the design matrix and parameter space, plus direct analysis of the expected log-likelihood on the quotient space. No step reduces a claimed result to a fitted parameter, self-referential definition, or load-bearing self-citation chain; the central results are self-contained population-level statements independent of any particular data fit or prior author result invoked as an unverified axiom.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review is based on the abstract alone; no explicit free parameters, invented entities, or non-standard axioms are described. The work relies on standard properties of logistic regression and maximum-likelihood theory.

axioms (1)

standard math Standard regularity conditions for logistic regression likelihood and maximum likelihood estimation
Invoked implicitly to establish existence of estimates and uniqueness on the quotient space.

pith-pipeline@v0.9.0 · 5574 in / 1240 out tokens · 42155 ms · 2026-05-10T00:10:53.815569+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

[1]

and Anderson, J

Albert, A. and Anderson, J. A. (1984). On the existence of maximum likelihood estimates in logistic regression models.Biometrika, 71(1):1–10

work page 1984
[2]

and Vassilvitskii, S

Arthur, D. and Vassilvitskii, S. (2007). K-means++: The advantages of careful seeding.Pro- ceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 8:1027–1035

work page 2007
[3]

and Kab´ an, A

Bootkrajang, J. and Kab´ an, A. (2012). Label-noise robust logistic regression and its applications. InJoint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 143–158. Springer

work page 2012
[4]

and Kab´ an, A

Bootkrajang, J. and Kab´ an, A. (2013). Classification of mislabelled microarrays using robust sparse logistic regression.Bioinformatics, 29(7):870–877. Centers for Disease Control and Prevention (CDC) (2017–2018). National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data. Hyattsville, MD: U.S. Department of Healt...

work page 2013
[5]

Diop, A., Diop, A., and Dupuy, J.-F. (2011). Maximum likelihood estimation in the logistic regression model with a cure fraction.Electronic Journal of Statistics, 5:460–483. Fr¨ uhwirth-Schnatter, S. (2006).Finite Mixture and Markov Switching Models. Springer, New York

work page 2011
[6]

and Eguchi, S

Fujisawa, H. and Eguchi, S. (2008). Robust parameter estimation with a small bias against heavy contamination.Journal of Multivariate Analysis, 99(9):2053–2081

work page 2008
[7]

Hall, D. B. (2000). Zero-inflated poisson and binomial regression with random effects: A case study.Biometrics, 56(4):1030–1039

work page 2000
[8]

Hung, H., Jou, Z.-Y., and Huang, S.-Y. (2018). Robust mislabel logistic regression without modeling mislabel probabilities.Biometrics, 74(1):145–154

work page 2018
[9]

Komori, O., Eguchi, S., Ikeda, S., Okamura, H., Ichinokawa, M., and Nakayama, S. (2016). An asymmetric logistic regression model for ecological data.Methods in Ecology and Evolution, 7(2):249–260

work page 2016
[10]

and Fidler, V

Nagelkerke, N. and Fidler, V. (2015). Estimating a logistic discrimination functions when one of the training samples is subject to misclassification: A maximum likelihood approach.PLoS One, 10(10):e0140718

work page 2015
[11]

G., Scott, J

Polson, N. G., Scott, J. G., and Windle, J. (2013). Bayesian inference for logistic mod- els using p´ olya–gamma latent variables.Journal of the American Statistical Association, 108(504):1339–1349

work page 2013
[12]

Silvapulle, M. J. (1981). On the existence of maximum likelihood estimators for the binomial response models.Journal of the Royal Statistical Society: Series B, 43(3):310–313

work page 1981
[13]

Swendsen, R. H. and Wang, J.-S. (1986). Replica Monte Carlo simulation of spin-glasses.Physical Review Letters, 57(21):2607

work page 1986
[14]

Teicher, H. (1963). Identifiability of finite mixtures.The Annals of Mathematical Statistics, 34(4):1265–1269

work page 1963
[15]

T., and Wang, X

Wainer, H., Bradlow, E. T., and Wang, X. (2007).Testlet Response Theory and Its Applications. Cambridge University Press. 34

work page 2007
[16]

Yakowitz, S. J. and Spragins, J. D. (1968). On the identifiability of finite mixtures.The Annals of Mathematical Statistics, 39(1):209–214. 35

work page 1968

[1] [1]

and Anderson, J

Albert, A. and Anderson, J. A. (1984). On the existence of maximum likelihood estimates in logistic regression models.Biometrika, 71(1):1–10

work page 1984

[2] [2]

and Vassilvitskii, S

Arthur, D. and Vassilvitskii, S. (2007). K-means++: The advantages of careful seeding.Pro- ceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 8:1027–1035

work page 2007

[3] [3]

and Kab´ an, A

Bootkrajang, J. and Kab´ an, A. (2012). Label-noise robust logistic regression and its applications. InJoint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 143–158. Springer

work page 2012

[4] [4]

and Kab´ an, A

Bootkrajang, J. and Kab´ an, A. (2013). Classification of mislabelled microarrays using robust sparse logistic regression.Bioinformatics, 29(7):870–877. Centers for Disease Control and Prevention (CDC) (2017–2018). National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data. Hyattsville, MD: U.S. Department of Healt...

work page 2013

[5] [5]

Diop, A., Diop, A., and Dupuy, J.-F. (2011). Maximum likelihood estimation in the logistic regression model with a cure fraction.Electronic Journal of Statistics, 5:460–483. Fr¨ uhwirth-Schnatter, S. (2006).Finite Mixture and Markov Switching Models. Springer, New York

work page 2011

[6] [6]

and Eguchi, S

Fujisawa, H. and Eguchi, S. (2008). Robust parameter estimation with a small bias against heavy contamination.Journal of Multivariate Analysis, 99(9):2053–2081

work page 2008

[7] [7]

Hall, D. B. (2000). Zero-inflated poisson and binomial regression with random effects: A case study.Biometrics, 56(4):1030–1039

work page 2000

[8] [8]

Hung, H., Jou, Z.-Y., and Huang, S.-Y. (2018). Robust mislabel logistic regression without modeling mislabel probabilities.Biometrics, 74(1):145–154

work page 2018

[9] [9]

Komori, O., Eguchi, S., Ikeda, S., Okamura, H., Ichinokawa, M., and Nakayama, S. (2016). An asymmetric logistic regression model for ecological data.Methods in Ecology and Evolution, 7(2):249–260

work page 2016

[10] [10]

and Fidler, V

Nagelkerke, N. and Fidler, V. (2015). Estimating a logistic discrimination functions when one of the training samples is subject to misclassification: A maximum likelihood approach.PLoS One, 10(10):e0140718

work page 2015

[11] [11]

G., Scott, J

Polson, N. G., Scott, J. G., and Windle, J. (2013). Bayesian inference for logistic mod- els using p´ olya–gamma latent variables.Journal of the American Statistical Association, 108(504):1339–1349

work page 2013

[12] [12]

Silvapulle, M. J. (1981). On the existence of maximum likelihood estimators for the binomial response models.Journal of the Royal Statistical Society: Series B, 43(3):310–313

work page 1981

[13] [13]

Swendsen, R. H. and Wang, J.-S. (1986). Replica Monte Carlo simulation of spin-glasses.Physical Review Letters, 57(21):2607

work page 1986

[14] [14]

Teicher, H. (1963). Identifiability of finite mixtures.The Annals of Mathematical Statistics, 34(4):1265–1269

work page 1963

[15] [15]

T., and Wang, X

Wainer, H., Bradlow, E. T., and Wang, X. (2007).Testlet Response Theory and Its Applications. Cambridge University Press. 34

work page 2007

[16] [16]

Yakowitz, S. J. and Spragins, J. D. (1968). On the identifiability of finite mixtures.The Annals of Mathematical Statistics, 39(1):209–214. 35

work page 1968