Selective Inference via Marginal Screening for High Dimensional Classification

Ichiro Takeuchi; Yuta Umezu

arxiv: 1906.11382 · v1 · pith:K5JHNTW7new · submitted 2019-06-26 · 📊 stat.ME

Selective Inference via Marginal Screening for High Dimensional Classification

Yuta Umezu , Ichiro Takeuchi This is my paper

Pith reviewed 2026-05-25 15:01 UTC · model grok-4.3

classification 📊 stat.ME

keywords selective inferencemarginal screeninglogistic regressionhigh-dimensional classificationpost-selection inferencetype I error controlvariable selection

0 comments

The pith

Deriving the asymptotic behavior of the post-selection logistic estimator after marginal screening enables control of selective type I error in high-dimensional binary classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops selective inference for logistic regression models used in binary classification. It considers variable selection performed by marginal screening and derives the high-dimensional statistical behavior of the resulting post-selection estimator. This derivation supports asymptotic control of the selective type I error rate, which is the type I error conditional on the selection event. A reader would care because ordinary hypothesis tests after selection produce inflated false-positive rates, and the method achieves valid inference without heavy extra computation. Simulation studies are used to examine the power of the resulting tests.

Core claim

By conditioning on the marginal screening procedure, the post-selection estimator in the logistic regression model admits an asymptotic characterization in the high-dimensional regime; this characterization is accurate enough to construct tests that asymptotically control the selective type I error for hypotheses on the selected variables.

What carries the argument

the asymptotic characterization of the post-selection logistic estimator under marginal screening

If this is right

Valid p-values become available for coefficients of variables chosen by marginal screening in logistic models.
Hypothesis tests after selection can be performed while controlling the conditional type I error rate asymptotically.
The procedure applies directly to binary classification without requiring data splitting.
Power comparisons with data splitting and other baselines become feasible under the derived asymptotics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same conditioning strategy might be adapted to other link functions or loss functions beyond logistic regression.
The approach could be combined with screening methods other than simple marginal correlation.
Finite-sample refinements or bootstrap versions might improve accuracy when the high-dimensional approximation is marginal.

Load-bearing premise

The high-dimensional regime and the marginal screening step admit an asymptotic approximation of the post-selection estimator that is sufficiently accurate to control type I error.

What would settle it

A simulation or calculation in which the selective type I error rate of the proposed test exceeds the nominal level under the high-dimensional logistic model with marginal screening.

Figures

Figures reproduced from arXiv: 1906.11382 by Ichiro Takeuchi, Yuta Umezu.

**Figure 2.** Figure 2: Method comparison using simulated data based on 1,000 Monte-Carlo [PITH_FULL_IMAGE:figures/full_fig_p019_2.png] view at source ↗

**Figure 3.** Figure 3: Method comparison using simulated data based on 1,000 Monte-Carlo [PITH_FULL_IMAGE:figures/full_fig_p020_3.png] view at source ↗

**Figure 4.** Figure 4: Method comparison using simulated data based on 1,000 Monte-Carlo [PITH_FULL_IMAGE:figures/full_fig_p021_4.png] view at source ↗

**Figure 5.** Figure 5: Comparison between adjusted selective p-values and nominal p-values. The vertical and horizontal axes represent adjusted p-values and indices of selected variables, respectively, and the black dotted line shows the significance level (α = 0.05). In each figure, black circles and red triangles respectively indicate adjusted nominal p-values and selective p-values. 7 Theoretical Analysis In this section, we … view at source ↗

**Figure 6.** Figure 6: Comparison between adjusted selective p-values and nominal p-values. The vertical and horizontal axes represent adjusted p-values and indices of selected variables, respectively, and the black dotted line shows the significance level (α = 0.05). In each figure, black circles and red triangles respectively indicate adjusted nominal p-values and selective p-values [PITH_FULL_IMAGE:figures/full_fig_p030_6.png] view at source ↗

**Figure 7.** Figure 7: Comparison between adjusted selective p-values and nominal p-values. The vertical and horizontal axes represent adjusted p-values and indices of selected variables, respectively, and the black dotted line shows the significance level (α = 0.05). In each figure, black circles and red triangles respectively indicate adjusted nominal p-values and selective p-values [PITH_FULL_IMAGE:figures/full_fig_p031_7.png] view at source ↗

read the original abstract

Post-selection inference is a statistical technique for determining salient variables after model or variable selection. Recently, selective inference, a kind of post-selection inference framework, has garnered the attention in the statistics and machine learning communities. By conditioning on a specific variable selection procedure, selective inference can properly control for so-called selective type I error, which is a type I error conditional on a variable selection procedure, without imposing excessive additional computational costs. While selective inference can provide a valid hypothesis testing procedure, the main focus has hitherto been on Gaussian linear regression models. In this paper, we develop a selective inference framework for binary classification problem. We consider a logistic regression model after variable selection based on marginal screening, and derive the high dimensional statistical behavior of the post-selection estimator. This enables us to asymptotically control for selective type I error for the purposes of hypothesis testing after variable selection. We conduct several simulation studies to confirm the statistical power of the test, and compare our proposed method with data splitting and other methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper derives an asymptotic characterization of post-selection logistic estimators after marginal screening to control selective type I error in classification.

read the letter

The main thing here is that the authors extend selective inference from Gaussian linear models to logistic regression after marginal screening, deriving the high-dimensional limiting behavior of the post-selection estimator so they can asymptotically control selective type I error for testing after selection. That is the concrete new piece, and it targets a practical workflow since marginal screening is common before fitting classifiers on high-dimensional binary data. They also run simulations to check power and compare against data splitting, which is a reasonable way to show the method is not just theoretical. The argument does not appear circular on its face, and the claim is set up to be checked against the simulations they describe. The soft spot is that the abstract gives no explicit conditions, rates, or derivation outline for the asymptotics, so it is hard to judge how restrictive the high-dimensional regime needs to be or how accurate the approximation is in finite samples where logistic models can behave unevenly. That is the part that would need the closest look in the full text. This is for readers working on post-selection inference or high-dimensional variable selection in classification settings. Someone already using selective inference methods would get direct value from the extension if the math holds. It deserves peer review because the extension is substantive, the target problem is real, and the simulations provide an independent check on the claim.

Referee Report

2 major / 1 minor

Summary. The paper develops a selective inference framework for binary classification under a logistic regression model after marginal screening in high dimensions. It claims to derive the asymptotic behavior of the post-selection estimator, which is then used to asymptotically control selective type I error for post-selection hypothesis testing. Simulation studies are presented to assess statistical power and compare against data splitting and other methods.

Significance. If the claimed asymptotic characterization holds under verifiable conditions, the work would extend selective inference beyond Gaussian linear models to classification settings, addressing a relevant gap. The approach avoids the computational burden of exact conditioning while targeting selective error control, and the simulations offer empirical checks on power.

major comments (2)

[Abstract] Abstract: the central claim rests on deriving the high-dimensional statistical behavior of the post-selection logistic estimator after marginal screening to achieve asymptotic selective type I error control, yet no explicit limiting distribution, regularity conditions on the screening threshold or signal strength, or error bounds are stated; without these the accuracy for type I error control cannot be assessed.
[Abstract] The weakest assumption (high-dimensional regime and marginal screening admitting an asymptotic characterization accurate enough for type I error control) is load-bearing but left implicit; a concrete statement of the regime (e.g., p/n rates, minimum signal strength) and the form of the limiting law is required to evaluate whether the control is valid or reduces to a data-dependent quantity.

minor comments (1)

[Abstract] Abstract: 'garnered the attention in the statistics' should read 'garnered attention in the statistics'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and the focus on clarity in the abstract. The two major comments both concern the need for more explicit statements of the asymptotic regime, limiting distribution, and conditions. We address them point by point below and will revise the abstract accordingly.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim rests on deriving the high-dimensional statistical behavior of the post-selection logistic estimator after marginal screening to achieve asymptotic selective type I error control, yet no explicit limiting distribution, regularity conditions on the screening threshold or signal strength, or error bounds are stated; without these the accuracy for type I error control cannot be assessed.

Authors: We agree that the abstract, as a concise overview, does not spell out the limiting distribution or the precise regularity conditions. These derivations appear in Sections 3–4 of the manuscript. To strengthen the abstract, we will add a sentence stating that the post-selection estimator is asymptotically normal with explicit mean and variance that depend on the selection event, under the conditions given in the main text. revision: yes
Referee: [Abstract] The weakest assumption (high-dimensional regime and marginal screening admitting an asymptotic characterization accurate enough for type I error control) is load-bearing but left implicit; a concrete statement of the regime (e.g., p/n rates, minimum signal strength) and the form of the limiting law is required to evaluate whether the control is valid or reduces to a data-dependent quantity.

Authors: The manuscript works under the high-dimensional regime in which n, p → ∞ with p/n → γ ∈ (0,1) and a minimum signal-strength condition that ensures the marginal screening step selects the relevant variables with probability approaching one. The limiting law is normal with parameters that are functions of the observed selection event. We will revise the abstract to include a brief statement of this regime and the form of the limiting distribution. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper's central claim is an asymptotic characterization of the post-selection logistic estimator after marginal screening that enables selective type I error control. The abstract and provided context describe deriving new high-dimensional limiting behavior under the screening procedure without any quoted equations or steps that reduce the result to a fitted parameter, self-citation chain, or input by construction. No self-definitional, fitted-input, or uniqueness-imported patterns are exhibited. The derivation is presented as introducing independent limiting results, making the analysis self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no explicit free parameters, axioms, or invented entities can be extracted.

pith-pipeline@v0.9.0 · 5694 in / 856 out tokens · 19022 ms · 2026-05-25T15:01:40.530992+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

5 extracted references · 5 canonical work pages · 2 internal anchors

[1]

A knockoff filter for high-dimensional selective inference

Barber, R. F. and Cand` es, E. J. (2016) “A knockoﬀ ﬁlter for high-dimensional selective inference,” arXiv preprint arXiv:1602.03574. Berk, R., Brown, L., Buja, A., Zhang, K., and Zhao, L. (2013) “Valid post- selection inference,” The Annals of Statistics , Vol. 41, pp. 802–837. Bickel, P. J., Ritov, Y., and Tsybakov, A. B. (2009) “Simultaneous analysis o...

work page internal anchor Pith review Pith/arXiv arXiv 2016
[2]

The little bootstrap and other methods for dimensionality selection in regression: X-ﬁxed prediction error,

Breiman, L. (1992) “The little bootstrap and other methods for dimensionality selection in regression: X-ﬁxed prediction error,” Journal of the American Statistical Association, Vol. 87, pp. 738–754. Cox, D. (1975) “A note on data-splitting for the evaluation of signiﬁcance levels,” Biometrika, Vol. 62, pp. 441–444. Dasgupta, S., Khare, K., and Ghosh, M. ...

work page 1992
[3]

Optimal Inference After Model Selection

Fithian, W., Sun, D., and Taylor, J. (2014) “Optimal inference after model selection,” arXiv preprint arXiv:1410.2597. Huang, J., Horowitz, J. L., and Ma, S. (2008) “Asymptotic properties of bridge estimators in sparse high-dimensional regression models,” The Annals of Statistics, Vol. 36, pp. 587–613. Huber, P. J. (1973) “Robust regression: asymptotics, ...

work page internal anchor Pith review Pith/arXiv arXiv 2014
[4]

p-values for high- dimensional regression,

Meinshausen, N., Meier, L., and B¨ uhlmann, P. (2009) “ p-values for high- dimensional regression,” Journal of the American Statistical Association , Selective Inference via Marginal Screening for High Dimensional Classiﬁcation 29 Vol. 104, pp. 1671–1681. Suzumura, S., Nakagawa, K., Umezu, Y., Tsuda, K., and Takeuchi, I. (2017) “Selective inference for sp...

work page 2009
[5]

Asymptotics of selective inference,

Tian, X. and Taylor, J. (2017) “Asymptotics of selective inference,” Scandi- navian Journal of Statistics , Vol. 44, pp. 480–499. Tibshirani, R. (1996) “Regression shrinkage and selection via the lasso,” Jour- nal of the Royal Statistical Society: Series B , Vol. 58, pp. 267–288. Wasserman, L. and Roeder, K. (2009) “High dimensional variable selection,” T...

work page 2017

[1] [1]

A knockoff filter for high-dimensional selective inference

Barber, R. F. and Cand` es, E. J. (2016) “A knockoﬀ ﬁlter for high-dimensional selective inference,” arXiv preprint arXiv:1602.03574. Berk, R., Brown, L., Buja, A., Zhang, K., and Zhao, L. (2013) “Valid post- selection inference,” The Annals of Statistics , Vol. 41, pp. 802–837. Bickel, P. J., Ritov, Y., and Tsybakov, A. B. (2009) “Simultaneous analysis o...

work page internal anchor Pith review Pith/arXiv arXiv 2016

[2] [2]

The little bootstrap and other methods for dimensionality selection in regression: X-ﬁxed prediction error,

Breiman, L. (1992) “The little bootstrap and other methods for dimensionality selection in regression: X-ﬁxed prediction error,” Journal of the American Statistical Association, Vol. 87, pp. 738–754. Cox, D. (1975) “A note on data-splitting for the evaluation of signiﬁcance levels,” Biometrika, Vol. 62, pp. 441–444. Dasgupta, S., Khare, K., and Ghosh, M. ...

work page 1992

[3] [3]

Optimal Inference After Model Selection

Fithian, W., Sun, D., and Taylor, J. (2014) “Optimal inference after model selection,” arXiv preprint arXiv:1410.2597. Huang, J., Horowitz, J. L., and Ma, S. (2008) “Asymptotic properties of bridge estimators in sparse high-dimensional regression models,” The Annals of Statistics, Vol. 36, pp. 587–613. Huber, P. J. (1973) “Robust regression: asymptotics, ...

work page internal anchor Pith review Pith/arXiv arXiv 2014

[4] [4]

p-values for high- dimensional regression,

Meinshausen, N., Meier, L., and B¨ uhlmann, P. (2009) “ p-values for high- dimensional regression,” Journal of the American Statistical Association , Selective Inference via Marginal Screening for High Dimensional Classiﬁcation 29 Vol. 104, pp. 1671–1681. Suzumura, S., Nakagawa, K., Umezu, Y., Tsuda, K., and Takeuchi, I. (2017) “Selective inference for sp...

work page 2009

[5] [5]

Asymptotics of selective inference,

Tian, X. and Taylor, J. (2017) “Asymptotics of selective inference,” Scandi- navian Journal of Statistics , Vol. 44, pp. 480–499. Tibshirani, R. (1996) “Regression shrinkage and selection via the lasso,” Jour- nal of the Royal Statistical Society: Series B , Vol. 58, pp. 267–288. Wasserman, L. and Roeder, K. (2009) “High dimensional variable selection,” T...

work page 2017