Valid F-screening in linear regression
Pith reviewed 2026-05-19 13:27 UTC · model grok-4.3
The pith
Selective p-values control the Type 1 error for regression coefficients conditional on rejection of the overall null hypothesis by the F-test.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We develop selective p-values for the coefficients in a least squares linear regression that control the selective Type 1 error, that is, the type 1 error conditional on having rejected the overall null hypothesis via the F-test. These p-values yield consistent tests and are computed using only the standard outputs of ordinary least squares regression. We also supply confidence intervals with nominal selective coverage and point estimates that account for the F-screening step, and we compare the resulting Fisher information to that obtained from sample splitting.
What carries the argument
Selective p-values constructed from the conditional distribution of the least-squares estimates given rejection of the overall null hypothesis.
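As a schematic reading of this guarantee, the requirement can be written in symbols. The rejection threshold $c$ of the overall F-test, the level $\alpha$, and the label $p_j^{\mathrm{sel}}$ for the selective p-value of the $j$th coefficient are notation introduced here for illustration, not taken from the paper:

$$\Pr_{H_0^{j}:\,\beta_j = 0}\left( p_j^{\mathrm{sel}} \le \alpha \;\middle|\; F \ge c \right) \le \alpha \quad \text{for all } \alpha \in [0,1].$$

The standard p-value satisfies the analogous bound only without the conditioning event $\{F \ge c\}$.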
If this is right
- Tests based on the selective p-values control error rates conditional on having rejected the overall null.
- Confidence intervals attain their nominal coverage level conditional on the F-screen.
- Point estimates can be adjusted to reflect the selection induced by the F-test.
- All quantities can be computed from published regression summary statistics alone.
- The Fisher information under this approach can be compared directly to that from sample splitting.
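For orientation, the sample-splitting alternative mentioned in the last bullet can be sketched as follows. This is a generic illustration, assuming a random half split, an intercept in the model, and Gaussian errors; the function names `ols_summary` and `split_then_infer` are hypothetical, and the paper's own sample-splitting setup may differ.

```python
import numpy as np
from scipy import stats

def ols_summary(X, y):
    """OLS with intercept: slope estimates, standard errors, overall F, residual df."""
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])          # add intercept column
    XtX_inv = np.linalg.inv(Xd.T @ Xd)
    beta = XtX_inv @ Xd.T @ y
    resid = y - Xd @ beta
    df_resid = n - p - 1
    rss = resid @ resid
    sigma2 = rss / df_resid                        # residual variance estimate
    tss = np.sum((y - y.mean()) ** 2)
    f_stat = ((tss - rss) / p) / sigma2            # overall F-statistic
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    return beta[1:], se[1:], f_stat, df_resid

def split_then_infer(X, y, alpha=0.05, frac=0.5, seed=0):
    """F-screen on one random part of the data, ordinary t-tests on the rest."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    idx = rng.permutation(n)
    n1 = int(frac * n)
    first, second = idx[:n1], idx[n1:]
    _, _, f_stat, df1 = ols_summary(X[first], y[first])
    if stats.f.sf(f_stat, p, df1) >= alpha:
        return None                                # screen not passed: report nothing
    beta, se, _, df2 = ols_summary(X[second], y[second])
    pvals = 2 * stats.t.sf(np.abs(beta / se), df2)
    return beta, se, pvals
```

Because the second part of the data is independent of the screening decision, the ordinary t-tests computed on it need no adjustment, at the cost of using only part of the sample for inference.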
Where Pith is reading between the lines
- Published regressions that were only interpreted after an F-test could be re-analyzed with these tools to restore valid inference.
- The same conditional-distribution idea could be applied to other global screening tests or to generalized linear models.
- For large numbers of predictors the closed-form conditional distributions remain tractable, but numerical integration may be needed in non-Gaussian settings.
Load-bearing premise
The linear model is correctly specified and the errors are Gaussian.
What would settle it
Simulate data from the Gaussian linear model under a null coefficient, apply the F-screen, and check whether the selective p-value for that coefficient is distributed as uniform on [0,1] conditional on screening.
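A minimal sketch of such a check is given below. The paper's selective p-value formula is not reproduced on this page, so the sketch uses the standard (naive) t-test p-value as a placeholder; swapping in a selective p-value at the marked line would turn it into the check described above. The sample size, number of predictors, replication count, and the name `f_screen_simulation` are all illustrative assumptions.

```python
import numpy as np
from scipy import stats

def f_screen_simulation(n=50, p=5, reps=20000, alpha=0.05, seed=0):
    """Simulate the global null; return naive p-values for beta_1 and the F-screen mask."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(n), rng.standard_normal((n, p))])  # fixed design
    Y = rng.standard_normal((n, reps))     # one column per dataset, all slopes zero
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ Y           # shape (p + 1, reps)
    df_resid = n - p - 1
    rss = np.sum((Y - X @ beta_hat) ** 2, axis=0)
    tss = np.sum((Y - Y.mean(axis=0)) ** 2, axis=0)
    sigma2_hat = rss / df_resid
    f_stat = ((tss - rss) / p) / sigma2_hat
    screened = stats.f.sf(f_stat, p, df_resid) < alpha   # datasets passing the F-screen
    t1 = beta_hat[1] / np.sqrt(sigma2_hat * XtX_inv[1, 1])
    naive_p = 2 * stats.t.sf(np.abs(t1), df_resid)       # placeholder: naive p-value
    return naive_p, screened

naive_p, screened = f_screen_simulation()
print("share passing the F-screen:", screened.mean())
print("naive rejection rate, all datasets:", (naive_p < 0.05).mean())
print("naive rejection rate, screened only:", (naive_p[screened] < 0.05).mean())
print("KS test vs uniform, screened only:", stats.kstest(naive_p[screened], "uniform"))
```

With the naive p-value, the rejection rate among screened datasets is expected to exceed the nominal level, which is the failure the abstract describes. A valid selective p-value should instead look uniform on the screened datasets.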
Original abstract
Suppose that a data analyst wishes to report the results of a least squares linear regression only if the overall null hypothesis, $H_0^{1:p}: \beta_1= \beta_2 = \ldots = \beta_p=0$, is rejected. This practice, which we refer to as F-screening (since the overall null hypothesis is typically tested using an $F$-statistic), is in fact common across a number of applied fields. Unfortunately, it poses a problem: standard guarantees for the inferential outputs of linear regression, such as Type 1 error control of hypothesis tests and nominal coverage of confidence intervals, hold unconditionally, but fail to hold conditional on rejection of the overall null hypothesis. In this paper, we develop an inferential toolbox for the coefficients in a least squares model that are valid conditional on rejection of the overall null hypothesis. We develop selective p-values that lead to tests that are consistent and control the selective Type 1 error, i.e., the Type 1 error conditional on having rejected the overall null hypothesis. Furthermore, they can be computed without access to the raw data, i.e., using only the standard outputs of a least squares linear regression, and therefore are suitable for use in a retrospective analysis of a published study. We also develop confidence intervals that attain nominal selective coverage, and point estimates that account for having rejected the overall null hypothesis. We derive an expression for the Fisher information about the coefficients resulting from the proposed approach, and compare this to the Fisher information that results from an alternative approach that relies on sample splitting. We investigate the proposed approach in simulation and via re-analysis of two datasets from the biomedical literature.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops selective p-values, confidence intervals, and point estimates for individual regression coefficients that control the selective Type I error (i.e., error conditional on rejection of the global null via the overall F-test). These quantities are derived to be computable from standard least-squares outputs (coefficient estimates, standard errors, and the F-statistic) without requiring the raw data, enabling retrospective analysis of published regressions. The authors also derive the Fisher information under the proposed conditioning and compare it to sample splitting, with supporting simulation studies and re-analyses of two biomedical datasets.
Significance. If the derivations hold, the work provides a practical toolbox for valid post-F-screening inference in linear models, addressing a widespread applied practice where models are reported only after overall significance. The summary-statistic-only computation is a notable strength for re-analysis settings, and the explicit Fisher-information comparison to sample splitting offers a clear efficiency benchmark. Simulation and real-data results help quantify the practical gains over naive approaches.
major comments (2)
- [§3.1, Eq. (7)–(9)] The selective p-value is obtained by integrating the tail of the usual t-statistic under the law of (β̂, σ̂) conditional on the event {F > c}. This construction uses the joint normality of β̂ and its independence from σ̂, which holds if and only if the errors are i.i.d. Gaussian. The manuscript should state explicitly whether exact selective Type I error control is claimed only under this assumption or whether asymptotic or robust versions are also derived, because the Gaussian requirement is load-bearing for the exact conditional distribution used throughout the paper.
- [§4.2, Algorithm 1] The claim that the truncation probabilities can be evaluated from standard regression outputs alone presupposes that the relevant quadratic forms and the selection region boundaries can be recovered from the reported β̂, SE(β̂), and F-statistic without the design matrix X. A concrete numerical example or pseudocode showing the exact recovery of the conditioning constants from these summaries would strengthen the retrospective-analysis claim.
minor comments (2)
- The abstract states that the procedures 'can be computed without access to the raw data'; a short parenthetical clarifying that the design matrix is not needed but that the reported (XᵀX)⁻¹ or equivalent information must be available would prevent misinterpretation.
- In the simulation section, the number of Monte Carlo replications and the exact grid of signal strengths used to assess consistency should be stated more prominently so that readers can judge the precision of the reported Type I error and power curves.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which have helped clarify important aspects of our derivations and implementation. We address each major comment below and indicate the planned revisions.
- Referee: [§3.1, Eq. (7)–(9)] The selective p-value is obtained by integrating the tail of the usual t-statistic under the law of (β̂, σ̂) conditional on the event {F > c}. This construction uses the joint normality of β̂ and its independence from σ̂, which holds if and only if the errors are i.i.d. Gaussian. The manuscript should state explicitly whether exact selective Type I error control is claimed only under this assumption or whether asymptotic or robust versions are also derived, because the Gaussian requirement is load-bearing for the exact conditional distribution used throughout the paper.
  Authors: We agree that the exact finite-sample selective Type I error control and nominal coverage rely on the i.i.d. Gaussian errors assumption, which delivers both the joint normality of the least-squares coefficient vector and its independence from the residual variance estimator. The manuscript works throughout under the classical linear model with these properties and does not derive asymptotic or robust analogues. In the revised version we will add an explicit statement of this modeling assumption in the introduction and at the start of Section 3.1, together with a brief remark that extensions to asymptotic regimes under weaker conditions remain an interesting direction for future research.
  Revision: yes
- Referee: [§4.2, Algorithm 1] The claim that the truncation probabilities can be evaluated from standard regression outputs alone presupposes that the relevant quadratic forms and the selection region boundaries can be recovered from the reported β̂, SE(β̂), and F-statistic without the design matrix X. A concrete numerical example or pseudocode showing the exact recovery of the conditioning constants from these summaries would strengthen the retrospective-analysis claim.
  Authors: We appreciate the request for greater transparency on the retrospective-analysis procedure. The observed F-statistic directly supplies the value of the quadratic form β̂'(X'X)β̂ that defines the selection event, while the reported standard errors supply the marginal scales needed to evaluate the conditional tail probabilities via one-dimensional numerical integration of the joint distribution of the relevant t-statistic and the F-statistic. To make this explicit, we will insert a short numerical example (using a small simulated regression whose summary statistics are fully reported) and accompanying pseudocode for Algorithm 1 in the revised Section 4.2, demonstrating step-by-step recovery of the truncation constants from β̂, SE(β̂), F, and the degrees of freedom alone.
  Revision: yes
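The bookkeeping that is certainly available from published summaries can be sketched as below. This is not the paper's Algorithm 1 and does not compute any selective quantity. It only recovers the screening threshold, the overall F-test p-value, R², and the naive t-test from reported numbers, assuming a model with an intercept and p predictors fitted to n observations. The function names and the example numbers are hypothetical.

```python
from scipy import stats

def screening_summary(f_stat, p, n, alpha=0.05):
    """Recover F-screen quantities from a reported F-statistic, p predictors, n rows."""
    df_resid = n - p - 1                              # residual degrees of freedom
    c = stats.f.ppf(1 - alpha, p, df_resid)           # rejection threshold of the F-test
    f_pval = stats.f.sf(f_stat, p, df_resid)          # overall F-test p-value
    r2 = p * f_stat / (p * f_stat + df_resid)         # R^2 implied by F and the dfs
    return {"threshold": c, "f_pvalue": f_pval, "r2": r2, "passed": f_stat >= c}

def naive_t_pvalue(beta_hat, se, df_resid):
    """Standard two-sided t-test p-value from a reported estimate and standard error."""
    t = beta_hat / se
    return t, 2 * stats.t.sf(abs(t), df_resid)

# Hypothetical published summary: F = 4.0 with p = 5 predictors and n = 50 rows.
summary = screening_summary(f_stat=4.0, p=5, n=50)
t_stat, p_naive = naive_t_pvalue(beta_hat=2.0, se=1.0, df_resid=50 - 5 - 1)
print(summary)
print(t_stat, p_naive)
```

Any selective adjustment would take these recovered quantities as its starting point. Whether more than β̂, SE(β̂), F, and the degrees of freedom is needed is the question the referee raises.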
Circularity Check
Selective p-value derivation is self-contained under standard linear model assumptions
full rationale
The paper derives selective p-values and confidence intervals by explicitly conditioning the usual t-statistics on the event {overall F > critical value}, using the known multivariate normal distribution of the least-squares estimator and its independence from the residual variance under i.i.d. Gaussian errors. This is a direct application of the conditional distribution implied by the model assumptions stated in the abstract and methods; it does not reduce any target quantity to a fitted parameter or prior self-citation by construction. The claim that only standard regression outputs are needed follows from the closed-form truncation probabilities under those assumptions rather than from re-labeling inputs. No load-bearing step matches any of the enumerated circularity patterns, and the central inferential guarantees remain independent of the specific fitted values being tested.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Errors are independent and normally distributed with constant variance in the linear model.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We develop selective p-values that lead to tests that are consistent and control the selective Type 1 error, i.e., the Type 1 error conditional on having rejected the overall null hypothesis... using only the standard outputs of a least squares linear regression."
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "Under the null hypothesis H_M^0, the F-statistic ... follows an F_{m,n-p-1} distribution."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.