When Knockoffs fail: diagnosing and fixing non-exchangeability of Knockoffs

Alexandre Blain; Angel Reyero Lobo; Bertrand Thirion; Julia Linhart; Pierre Neuvial

arxiv: 2407.06892 · v2 · pith:NKGBIJ7Hnew · submitted 2024-07-09 · 📊 stat.ME

classification 📊 stat.ME

keywords: data, assumption, control, inference, knockoff, knockoffs, statistical, diagnostic

Knockoffs are a popular statistical framework that addresses the challenging problem of conditional variable selection in high-dimensional settings with statistical control. Such statistical control is essential for the reliability of inference. However, knockoff guarantees rely on an exchangeability assumption that is difficult to test in practice, and there is little discussion in the literature on how to deal with unfulfilled hypotheses. This assumption is related to the ability to generate data similar to the observed data. To maintain reliable inference, we introduce a diagnostic tool based on Classifier Two-Sample Tests. Using simulations and real data, we show that violations of this assumption occur in common settings for classical knockoff generators, especially when the data have a strong dependence structure. As a consequence, knockoff-based inference suffers from a massive inflation of false positives. We show that the diagnostic tool correctly detects such behavior. We show that an alternative knockoff construction, based on constructing a predictor of each variable based on all others, solves the issue. We also propose a computationally-efficient variant of this algorithm and show empirically that this approach restores error control on simulated data and semi-simulated experiments based on neuroimaging data.
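The diagnostic idea in the abstract can be illustrated with a minimal Classifier Two-Sample Test sketch. This is not the authors' implementation: the function names `c2st` and `swap_diagnostic`, the least-squares classifier on quadratic features, and the toy Gaussian data are all assumptions chosen for illustration. The sketch checks only the full swap of variables and knockoffs, whereas exchangeability requires invariance under swapping any subset of variables. If a classifier can tell [X, X~] apart from [X~, X] on held-out data, the knockoffs are not exchangeable with the original variables.

```python
import math
import numpy as np


def c2st(sample_a, sample_b, seed=0):
    """Classifier two-sample test on two equally sized samples.

    Returns the held-out accuracy and a one-sided p-value for
    H0: accuracy = 0.5 (normal approximation to the binomial).
    """
    rng = np.random.default_rng(seed)
    z = np.vstack([sample_a, sample_b])
    y = np.concatenate([-np.ones(len(sample_a)), np.ones(len(sample_b))])
    # Intercept, linear, and squared features, so that the classifier can
    # pick up differences in means and in marginal variances.
    phi = np.hstack([np.ones((len(z), 1)), z, z ** 2])
    perm = rng.permutation(len(z))
    half = len(z) // 2
    train, test = perm[:half], perm[half:]
    # Least-squares classifier: regress the +/-1 labels on the features.
    w, *_ = np.linalg.lstsq(phi[train], y[train], rcond=None)
    pred = np.sign(phi[test] @ w)
    acc = float(np.mean(pred == y[test]))
    z_score = (acc - 0.5) / math.sqrt(0.25 / len(test))
    p_value = 0.5 * math.erfc(z_score / math.sqrt(2))
    return acc, p_value


def swap_diagnostic(x, x_tilde, seed=0):
    """Compare [X, X~] against [X~, X] using disjoint halves of the rows."""
    n = len(x) // 2
    original = np.hstack([x[:n], x_tilde[:n]])
    swapped = np.hstack([x_tilde[n:2 * n], x[n:2 * n]])
    return c2st(original, swapped, seed)


# Toy data (assumption): independent standard Gaussian features, for which
# an independent copy is a valid knockoff.
rng = np.random.default_rng(0)
x = rng.standard_normal((4000, 5))
valid_knockoffs = rng.standard_normal((4000, 5))
# Deliberately broken knockoffs: the variance is inflated by a factor of 4.
invalid_knockoffs = 2.0 * rng.standard_normal((4000, 5))

acc_valid, p_valid = swap_diagnostic(x, valid_knockoffs)
acc_invalid, p_invalid = swap_diagnostic(x, invalid_knockoffs)
print(f"valid knockoffs:   accuracy={acc_valid:.3f}, p-value={p_valid:.3g}")
print(f"invalid knockoffs: accuracy={acc_invalid:.3f}, p-value={p_invalid:.3g}")
```

Held-out accuracy close to 0.5 is consistent with exchangeability, while accuracy clearly above 0.5 flags a violation. The two samples are built from disjoint rows so that they are independent, and a richer classifier would be needed to detect violations hidden in the dependence structure rather than in the marginals.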

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Model Selection for SLOPE Models: A Bayesian Perspective
stat.ME 2026-06 unverdicted novelty 7.0

Introduces Bayesian spike-and-slab embeddings for group SLOPE models and a two-step orthogonal transformation to achieve FDR control and higher power than cross-validation in general settings.

Multiple testing
stat.ME 2026-06 unverdicted

An expository introduction to multiple hypothesis testing procedures and error control criteria with R package references.