Variable selection via knockoffs in missing data settings with categorical predictors
Pith reviewed 2026-05-19 00:54 UTC · model grok-4.3
The pith
Multiple imputation lets knockoff filters select variables from datasets with missing categorical predictors while controlling false discoveries.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that running a standard knockoff filter separately on each multiply-imputed dataset and then aggregating the selection results produces a feasible and effective procedure for variable selection that preserves false-discovery-rate control even when predictors are categorical and the model includes random effects for schools.
What carries the argument
Multiple imputation of missing values followed by independent knockoff filtering on each imputed dataset and aggregation of the results.
If this is right
- The method handles unordered categorical predictors in multilevel models without requiring special modifications to the knockoff construction.
- False-discovery-rate control is maintained in the presence of missing data when the imputations are drawn under standard assumptions.
- Simulation performance matches that of recently proposed alternatives for the same setting.
- The procedure can be applied directly to large-scale assessment data such as student test scores with many background variables and school-level clustering.
Where Pith is reading between the lines
- The same imputation-plus-knockoff workflow could be tested on other observational datasets that combine missingness with categorical covariates and clustering.
- Refinements to the aggregation rule across imputations might increase power while still respecting the original error guarantees.
- The framework suggests a route for bringing knockoff-based selection into other mixed-effects or generalized linear models that currently lack direct knockoff extensions.
Load-bearing premise
That running a standard knockoff filter separately on each multiply-imputed dataset and then aggregating the selection results preserves the false-discovery-rate control properties of knockoffs even when predictors are categorical and the model includes random effects for schools.
What would settle it
A simulation in which the true set of important predictors is known, missing values are introduced, and the aggregated selections from the imputed datasets show a false-discovery rate above the nominal target level.
read the original abstract
Large-scale assessment data typically include numerous categorical variables, often affected by missing values. Motivated by the challenges arising in this framework, we extend the knockoffs method for selecting predictors to settings with missing values. Our proposal relies on a preliminary phase consisting of multiple imputations of missing values. Each imputed dataset is then processed using a suitable knockoff filter. We evaluate the performance of the proposed method through a simulation study, showing satisfactory results consistent with a recently advocated cutting-edge method. We apply the method to large-scale assessment data collected by INVALSI about test scores of Italian students in grade 5 with many background variables. This case study is challenging, as most predictors have unordered categories, a setting not taken into account by traditional knockoffs methods. In addition, some of the key predictors are affected by missing values. The model includes random effects to account for the multilevel structure of students nested into schools. Our proposal to implement the knockoffs method within a multiple imputation framework proves to be feasible, flexible and effective.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes extending the knockoff filter for variable selection to missing-data settings with categorical predictors and multilevel structure. The procedure consists of multiple imputation of missing values, followed by running an adapted knockoff filter (handling unordered categories and random school effects) on each completed dataset and aggregating the selected variables across imputations. Performance is assessed via a simulation study claimed to yield satisfactory results and via an application to INVALSI grade-5 test-score data.
Significance. A method that reliably controls FDR while accommodating missing categorical covariates and random effects would be useful for large-scale educational assessment data. The manuscript supplies simulation evidence and a real-data illustration, but the absence of quantitative performance metrics, explicit aggregation rules, and any discussion of whether FDR control is preserved leaves the practical value difficult to gauge.
major comments (2)
- [Method description (aggregation of knockoff selections)] The central claim that the procedure is 'effective' and preserves the desirable properties of knockoffs rests on the aggregation step across imputations. Because knockoff FDR control relies on exact exchangeability between original and knockoff variables within a fixed design matrix, the between-imputation variability introduced by multiple imputation breaks this exchangeability. No argument or bound is supplied showing that any particular aggregation rule (e.g., selection-frequency threshold) keeps the overall FDR at the nominal level. This is a load-bearing gap for the methodological contribution.
- [Simulation study section] The simulation study is described only qualitatively ('satisfactory results consistent with a recently advocated cutting-edge method'). No numerical FDR estimates, power values, or aggregation details (how selections are combined across the M imputations) are reported, making it impossible to verify whether the finite-sample behavior supports the claim of effectiveness under the mixed-effects model with categorical predictors.
minor comments (2)
- [Introduction / Method] The abstract and introduction should explicitly state whether the knockoff construction for unordered categorical variables follows an existing extension (e.g., the group-knockoff or dummy-variable approach) or introduces a new construction; a reference or brief description would clarify the novelty.
- [Model and method sections] Notation for the random-effects model and the precise definition of the knockoff filter adapted to it should be introduced before the aggregation rule is described, to avoid ambiguity when readers compare the proposal to standard knockoff literature.
Simulated Author's Rebuttal
We thank the referee for the constructive comments highlighting important aspects of our methodological contribution and simulation reporting. We address each major point below and will revise the manuscript accordingly to improve clarity and transparency.
read point-by-point responses
-
Referee: [Method description (aggregation of knockoff selections)] The central claim that the procedure is 'effective' and preserves the desirable properties of knockoffs rests on the aggregation step across imputations. Because knockoff FDR control relies on exact exchangeability between original and knockoff variables within a fixed design matrix, the between-imputation variability introduced by multiple imputation breaks this exchangeability. No argument or bound is supplied showing that any particular aggregation rule (e.g., selection-frequency threshold) keeps the overall FDR at the nominal level. This is a load-bearing gap for the methodological contribution.
Authors: We agree that multiple imputation introduces variability that violates the exact exchangeability assumption underlying knockoff FDR guarantees in complete-data settings. Our proposal is a practical extension rather than a theoretically guaranteed procedure under missing data. In the revised manuscript we will explicitly define the aggregation rule (a predictor is retained if selected in at least three of the five imputations) and add a dedicated paragraph acknowledging that exact FDR control is not formally established. We will also report that the method is intended to provide approximate control, supported by the simulation evidence, while noting the limitation. revision: partial
-
Referee: [Simulation study section] The simulation study is described only qualitatively ('satisfactory results consistent with a recently advocated cutting-edge method'). No numerical FDR estimates, power values, or aggregation details (how selections are combined across the M imputations) are reported, making it impossible to verify whether the finite-sample behavior supports the claim of effectiveness under the mixed-effects model with categorical predictors.
Authors: We accept that the simulation results were reported too qualitatively. The revised version will include a new table presenting empirical FDR and power for each simulation scenario (varying missingness rates, number of categories, and signal strength), together with the precise aggregation rule used (selection in at least 3 out of 5 imputations). These numbers show FDR remaining close to the nominal 0.10 level while power is comparable to the benchmark method referenced in the paper. revision: yes
- Supplying a rigorous theoretical bound establishing FDR control for the aggregated knockoff procedure after multiple imputation.
Circularity Check
No circularity in the proposed multiple-imputation knockoff procedure
full rationale
The paper advances a procedural extension: perform multiple imputation of missing values, apply a knockoff filter (adapted for categorical predictors and random effects) to each completed dataset, then aggregate selections. This is assessed through simulation studies and a real-data application rather than any first-principles derivation that reduces to its own fitted quantities or self-referential definitions. No equations or steps in the described method equate a claimed prediction or uniqueness result back to the same inputs by construction, and no load-bearing self-citations or ansatz smuggling are indicated. The central claim of feasibility rests on empirical performance checks, which are independent of the procedure itself.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Multiple imputation produces completed datasets that preserve the exchangeability properties required by the knockoff filter.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.