Optimal Inference After Model Selection

William Fithian, Dennis Sun, Jonathan Taylor


classification: math.ST, stat.ME, stat.TH

keywords: inference, selection, selective, error, model, classical, derive, powerful


To perform inference after model selection, we propose controlling the selective type I error; i.e., the error rate of a test given that it was performed. By doing so, we recover long-run frequency properties among selected hypotheses analogous to those that apply in the classical (non-adaptive) context. Our proposal is closely related to data splitting and has a similar intuitive justification, but is more powerful. Exploiting the classical theory of Lehmann and Scheffé (1955), we derive most powerful unbiased selective tests and confidence intervals for inference in exponential family models after arbitrary selection procedures. For linear regression, we derive new selective z-tests that generalize recent proposals for inference after model selection and improve on their power, and new selective t-tests that do not require knowledge of the error variance.
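
The notion of selective type I error can be illustrated with a toy simulation. The sketch below is not the paper's general construction; it assumes a single observation Y ~ N(mu, 1) that is tested for H0: mu = 0 only when Y exceeds a threshold c. The threshold, sample size, and all function names are illustrative assumptions. It compares the rejection rate, among tests actually performed, of the classical one-sided p-value with that of a p-value computed from the normal distribution truncated to the selection event.

```python
import math
import random


def norm_sf(x):
    """Upper tail probability P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def naive_p(y):
    """Classical one-sided p-value for H0: mu = 0, ignoring selection."""
    return norm_sf(y)


def selective_p(y, c):
    """One-sided p-value for H0: mu = 0 conditional on the event Y > c.

    Under H0, Y given Y > c is a standard normal truncated to (c, inf),
    so the conditional tail probability is sf(y) / sf(c).
    """
    return norm_sf(y) / norm_sf(c)


def simulate(c=1.0, alpha=0.05, n=200_000, seed=0):
    """Estimate rejection rates among selected draws when H0 is true.

    c, alpha, n, and seed are hypothetical settings chosen for illustration.
    """
    rng = random.Random(seed)
    selected = naive_rej = sel_rej = 0
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)  # data generated under H0: mu = 0
        if y <= c:
            continue  # selection failed, so no test is performed
        selected += 1
        naive_rej += naive_p(y) <= alpha
        sel_rej += selective_p(y, c) <= alpha
    return naive_rej / selected, sel_rej / selected


naive_rate, selective_rate = simulate()
print(f"naive rejection rate among selected:     {naive_rate:.3f}")
print(f"selective rejection rate among selected: {selective_rate:.3f}")
```

In this setup the naive test's error rate among performed tests is theoretically alpha / P(Z > c), roughly 0.32 for c = 1 and alpha = 0.05, while the truncated-normal p-value keeps it at alpha. Data splitting would also control this rate, but only by discarding the data used for selection.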



Forward citations

Cited by 8 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Post-ADC Inference: Valid Inference After Active Data Collection
stat.ML 2026-05 unverdicted novelty 7.0

Post-ADC inference supplies valid p-values and confidence intervals for data-dependent targets after active data collection by extending selective inference to correct for both adaptive sampling bias and post-hoc targ...

In-Sample Evaluation of Subgroups Identified by Generic Machine Learning
stat.ME 2026-05 unverdicted novelty 7.0

A conditional adaptive perturbation approach enables valid in-sample inference for machine learning-identified subgroups with nonregular boundaries via triple robustness.

Integrating Diagnostic Checks into Estimation
econ.EM 2026-04 unverdicted novelty 7.0

Residualizing estimators against diagnostic check statistics eliminates selective reporting distortions, reduces variance when the model is correct, and minimizes worst-case bias under local misspecification.

Towards Reliable LLM Evaluation: Correcting the Winner's Curse in Adaptive Benchmarking
stat.ML 2026-05 unverdicted novelty 6.0

SIREN corrects winner's curse bias in adaptive LLM benchmarking via selection-aware repeated splits and bootstrap for valid procedure-level confidence intervals.

A Leakage Bound for Confidence Sets after Black-Box Selection
math.ST 2026-04 unverdicted novelty 6.0

Selected-target noncoverage after black-box selection is bounded by nominal fixed-target noncoverage plus average total variation distance between marginal and conditional laws of the inferential data.

Post-Screening Portfolio Selection
q-fin.PM 2026-04 unverdicted novelty 6.0

A Lasso-based screening step followed by low-dimensional mean-variance optimization on the selected assets improves high-dimensional portfolio construction, with a defactoring extension for strong factors.

Weighted Holm Procedures: Theory, Properties, and Recommendations
stat.ME 2026-04 conditional novelty 5.0

The weighted Holm procedure (WHP) based on ordered weighted p-values is uniformly more powerful than the weighted alternative Holm procedure (WAP) based on ordered raw p-values, with stronger optimality properties und...

Inference conditional on selection: a review
stat.ME 2026-04 unverdicted novelty 2.0

The review covers selective inference techniques that provide conditional guarantees for inference after data-dependent selection, demonstrated with examples from winner inference, regression trees, clustering, and si...