The revisited knockoffs method for variable selection in L1-penalised regressions

Anne Gégout-Petit; Aurélie Gueudin-Muller; Clémence Karmann

arxiv: 1907.03153 · v1 · pith:MLKL4ZWYnew · submitted 2019-07-06 · 📊 stat.ME · stat.AP

Pith reviewed 2026-05-25 01:34 UTC · model grok-4.3

keywords: variable selection · L1-penalized regression · knockoffs · penalty parameter · high-dimensional data · covariate ranking

The pith

A revisited knockoffs method determines the penalty parameter for variable selection in L1-penalized regressions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a new method using knockoffs to choose the penalty in L1 regressions for selecting relevant covariates. This approach works for different types of response variables and when the number of observations is less than the number of covariates. It also provides an ordering of covariate importance. A sympathetic reader would care because it offers a general way to handle variable selection in high-dimensional settings without relying on specific model assumptions beyond the knockoffs framework.

Core claim

We develop a new method based on the knockoffs idea to handle the choice of the penalty parameter in L1-penalised regression models. This revisited knockoffs method is general and suitable for a wide range of regressions with various types of response variables. It works when the number of observations is smaller than the number of covariates and gives an order of importance of the covariates.

What carries the argument

The revisited knockoffs method, which adapts the knockoffs framework to select the penalty parameter and rank covariates in L1-penalized regressions.

If this is right

It enables variable selection in regressions with more covariates than observations.
It applies to various response variable types beyond standard linear models.
It provides a ranking of covariate importance rather than just selection.
Its selections can be benchmarked experimentally against those of other variable selection methods, as the paper does.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method might extend to other penalized regression types if the knockoffs adaptation generalizes.
It could reduce reliance on cross-validation for choosing the penalty in high-dimensional settings.
It connects to broader uses of knockoffs for controlling false discoveries in variable selection.

Load-bearing premise

The knockoffs framework can be adapted to L1-penalized regressions without needing extra assumptions on the data distribution or model specifics.

What would settle it

An experiment showing that the method fails to correctly identify relevant variables or select the penalty in a controlled simulation with known ground truth when n is less than p.
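
One way such a controlled experiment could be scored is sketched below. This is an illustration only: the helper name `selection_metrics` and the example index sets are hypothetical and do not come from the paper. It compares a selected set of covariate indices with the known support and reports the false discovery proportion and the detection rate.

```python
def selection_metrics(selected, true_support):
    """Score a variable selection against a known ground truth.

    Returns (fdp, detection_rate):
      fdp            = share of selected covariates that are not in the model
      detection_rate = share of true covariates that were selected
    """
    selected, true_support = set(selected), set(true_support)
    false_pos = len(selected - true_support)
    true_pos = len(selected & true_support)
    fdp = false_pos / max(1, len(selected))            # 0 when nothing is selected
    detection_rate = true_pos / max(1, len(true_support))
    return fdp, detection_rate


# Hypothetical outcome: covariates 0..4 are in the model, the method picked 0, 1, 2 and 7.
fdp, detection_rate = selection_metrics(selected=[0, 1, 2, 7], true_support=range(5))
print(fdp, detection_rate)
```

Averaging these two numbers over many simulated data sets with n < p would show whether the method keeps false selections low while still recovering the true covariates.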

Figures

Figures reproduced from arXiv: 1907.03153 by Anne Gégout-Petit, Aurélie Gueudin-Muller, Clémence Karmann.

**Figure 1.** Example of positive statistics Wi sorted in ascending order. Linear Gaussian regression model with n = 500 observations of p = 20 covariates. Only covariates X1, X2, X3, X4 and X5 belong to the model (regression coefficients are set to β = (1, 1, 1, 1, 1, 0, …, 0)). Wi, i = 1, …, 20, which implies that X3 is the covariate the most likely to belong to the model. We can clearly observe a breakdo…

**Figure 2.** Detection rates of each covariate for the four meth…

**Figure 3.** Detection rates of each covariate for the three met…

**Figure 4.** Detection rates of each covariate for the three met…

**Figure 5.** Comparing the…

**Figure 5.** Boxplots of detection rates of each covariate acco…

**Figure 6.** Boxplots of detection rates of each covariate acco…

**Figure 7.** Boxplots of detection rates of each covariate acco…

**Figure 8.** Detection rates of each covariate for the three met…

Original abstract

We consider the problem of variable selection in regression models. In particular, we are interested in selecting explanatory covariates linked with the response variable and we want to determine which covariates are relevant, that is which covariates are involved in the model. In this framework, we deal with L1-penalised regression models. To handle the choice of the penalty parameter to perform variable selection, we develop a new method based on the knockoffs idea. This revisited knockoffs method is general, suitable for a wide range of regressions with various types of response variables. Besides, it also works when the number of observations is smaller than the number of covariates and gives an order of importance of the covariates. Finally, we provide many experimental results to corroborate our method and compare it with other variable selection methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper adapts knockoffs to pick λ and rank variables in L1 regressions for general response types, including the p > n regime, but the exchangeability argument for non-Gaussian cases looks incomplete.

The core idea is to revisit knockoffs so they help choose the penalty parameter in L1-penalized models and also produce an importance ordering of covariates. This is applied to a range of response types and is claimed to handle the p greater than n regime. The experiments are presented as support for the approach over other selection methods. That is the main contribution on offer.

The construction appears to generate knockoff covariates and then examine the regularization path to decide on λ and the ranking. This is a reasonable direction if the FDR-type control carries over. The paper does attempt to move beyond the usual Gaussian linear case that dominates early knockoff work.

The soft spot is exactly the one flagged in the stress-test note. Standard knockoff guarantees rest on constructing an augmented matrix whose joint distribution is exchangeable when a variable is null. For logistic, Poisson, or other GLMs, and especially when the design is rank-deficient, it is not obvious how this is done without extra assumptions. The abstract and the stress-test description give no explicit construction or proof that the joint law is preserved under the null for arbitrary responses. If the method implicitly falls back to Gaussian knockoffs or requires full rank, the generality claim weakens. The experiments may show good behavior on the tested examples, but that does not substitute for the missing exchangeability argument. Readers working on high-dimensional selection who already know the knockoff literature will see the gap immediately.

The paper is worth sending to referees because the problem is practical and the direction is sensible, even though the theoretical support needs tightening. A serious review would likely ask for the precise knockoff construction and the conditions under which the control holds.
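
The path-based step described in the letter (generate knockoff covariates, then read the regularization path) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' code: it assumes a Gaussian linear response, uses a plain coordinate-descent lasso, and builds the knockoff matrix as a row-permuted copy of X, which is only a stand-in because the construction used in the paper is not given on this page. The names `entry_lambdas`, `Z_orig` and `Z_knock` are hypothetical.

```python
import numpy as np


def entry_lambdas(X, y, n_lambdas=40, n_sweeps=50):
    """For each column, the largest lambda on a decreasing grid at which its
    lasso coefficient is non-zero (0.0 if it never enters the model)."""
    n, p = X.shape
    X = (X - X.mean(axis=0)) / X.std(axis=0)      # standardise so x_j'x_j / n = 1
    y = y - y.mean()
    lam_max = np.max(np.abs(X.T @ y)) / n         # above this the fit is all zeros
    grid = lam_max * np.logspace(0, -2, n_lambdas)
    beta, resid, entry = np.zeros(p), y.copy(), np.zeros(p)
    for lam in grid:                              # warm-started coordinate descent
        for _ in range(n_sweeps):
            for j in range(p):
                rho = X[:, j] @ resid / n + beta[j]
                new = np.sign(rho) * max(abs(rho) - lam, 0.0)   # soft-thresholding
                resid -= X[:, j] * (new - beta[j])
                beta[j] = new
        newly = (beta != 0) & (entry == 0)
        entry[newly] = lam                        # record the first lambda of entry
    return entry


# Toy data: 3 active covariates out of 10 (assumed setting, not from the paper).
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.r_[np.full(3, 3.0), np.zeros(p - 3)]
y = X @ beta_true + rng.standard_normal(n)

X_tilde = X[rng.permutation(n)]                   # stand-in knockoffs: shuffled rows
Z = entry_lambdas(np.hstack([X, X_tilde]), y)     # one path over [X, X_tilde]
Z_orig, Z_knock = Z[:p], Z[p:]
print(np.round(Z_orig, 2), np.round(Z_knock, 2))
```

Shuffling rows keeps the joint law of the knockoff columns and breaks their link with the response, but it does not give the pairwise exchangeability with X that the letter asks about, which is why it should be read as a placeholder.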

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a 'revisited knockoffs' procedure to select the L1 penalty parameter λ for variable selection in penalized regression. It claims the method applies to a wide range of response distributions, remains valid when n < p, produces an ordering of covariate importance, and is supported by experimental comparisons against other selection methods.

Significance. A procedure that extends knockoff-based selection to arbitrary GLMs and the high-dimensional regime without Gaussian assumptions would address a practical gap; however, the abstract provides no indication that such an extension is achieved, limiting assessment of potential impact.

major comments (2)

[Abstract] Abstract: the assertion that the method is 'suitable for a wide range of regressions with various types of response variables' and 'works when the number of observations is smaller than the number of covariates' is load-bearing for the central claim, yet no construction of knockoff variables X̃ is supplied that preserves the joint exchangeability (X, X̃) under the null for non-Gaussian responses or rank-deficient designs.
[Abstract] Abstract and method description: the procedure is said to 'give an order of importance of the covariates' via the λ path, but no derivation shows how the knockoff statistics are extracted or why the resulting ordering inherits FDR control (or an analogous guarantee) once the response distribution is arbitrary.
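
For reference, the exchangeability requirement that both major comments rely on can be written out. The conditions below are the standard ones from the knockoff literature (the fixed-design version of Barber and Candès, 2015, and the random-design "model-X" version of Candès, Fan, Janson and Lv, 2018); they are given as background, and this page does not state which of them, if either, the manuscript adopts.

```latex
% Fixed-design knockoffs, with \Sigma = X^\top X and a non-negative vector s:
\tilde{X}^\top \tilde{X} = \Sigma, \qquad
X^\top \tilde{X} = \Sigma - \operatorname{diag}(s), \qquad s \in \mathbb{R}_{\ge 0}^{p}.

% Model-X knockoffs: for every subset S \subseteq \{1, \dots, p\},
(X, \tilde{X})_{\operatorname{swap}(S)} \overset{d}{=} (X, \tilde{X}),
\qquad \tilde{X} \perp\!\!\!\perp Y \mid X.
```

Here swap(S) exchanges column j of X with column j of X̃ for every j in S. The original fixed-design construction assumes n ≥ 2p (with an extension to n ≥ p), which is why the n < p claim draws the referee's attention.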

minor comments (1)

[Abstract] The abstract states that 'many experimental results' corroborate the method; a brief indication of the simulation settings, response types, and performance metrics would help readers evaluate the scope of the reported corroboration.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We respond to each major comment below and will revise the manuscript accordingly to improve clarity and completeness.

Referee: [Abstract] Abstract: the assertion that the method is 'suitable for a wide range of regressions with various types of response variables' and 'works when the number of observations is smaller than the number of covariates' is load-bearing for the central claim, yet no construction of knockoff variables X̃ is supplied that preserves the joint exchangeability (X, X̃) under the null for non-Gaussian responses or rank-deficient designs.

Authors: Knockoff construction operates solely on the covariate matrix X and is independent of the response distribution Y; exchangeability of (X, X̃) therefore holds regardless of whether the response is Gaussian or belongs to another GLM family. For the n < p regime we rely on existing high-dimensional knockoff constructions (e.g., those based on approximate exchangeability or SDP relaxations) that accommodate rank-deficient designs. We will revise the abstract and add an explicit paragraph in Section 2 describing the precise construction employed.

revision: yes

Referee: [Abstract] Abstract and method description: the procedure is said to 'give an order of importance of the covariates' via the λ path, but no derivation shows how the knockoff statistics are extracted or why the resulting ordering inherits FDR control (or an analogous guarantee) once the response distribution is arbitrary.

Authors: The importance ordering is induced by the sequence of λ values at which each original variable enters the L1 path; knockoff statistics are formed by comparing entry λ’s of originals versus knockoffs, and the threshold is chosen to guarantee FDR control. Because the exchangeability property is a property of the design only, the control argument carries over to arbitrary response distributions. We will insert a short derivation subsection (new Section 3.2) that extracts the statistics explicitly and states the FDR guarantee under the stated assumptions.

revision: yes
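
The comparison of entry values described in this response can be illustrated with the statistic and threshold of the standard knockoff filter (Barber and Candès, 2015). The sketch assumes those standard formulas; whether the revisited method uses the same threshold is not confirmed by this page. The function names and the entry values in the example are made up for illustration.

```python
def knockoff_statistics(z_orig, z_knock):
    """W_j = max(Z_j, Z~_j) * sign(Z_j - Z~_j): large and positive when the
    original covariate enters the path well before its knockoff."""
    stats = []
    for z, z_t in zip(z_orig, z_knock):
        sign = (z > z_t) - (z < z_t)              # +1, 0 or -1
        stats.append(max(z, z_t) * sign)
    return stats


def knockoff_plus_threshold(w, q):
    """Smallest t among the non-zero |W_j| such that
    (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q; infinity if none exists."""
    for t in sorted({abs(v) for v in w if v != 0}):
        neg = sum(v <= -t for v in w)
        pos = sum(v >= t for v in w)
        if (1 + neg) / max(1, pos) <= q:
            return t
    return float("inf")


def select_and_rank(w, q):
    """Indices with W_j >= threshold, ordered from most to least important."""
    t = knockoff_plus_threshold(w, q)
    selected = [j for j, v in enumerate(w) if v >= t]
    return sorted(selected, key=lambda j: -w[j])


# Hypothetical entry values of lambda for 10 originals and their 10 knockoffs.
z_orig = [3.0, 2.5, 2.8, 2.2, 2.0, 0.4, 0.1, 0.3, 0.0, 0.2]
z_knock = [0.2, 0.1, 0.3, 0.0, 0.1, 0.1, 0.5, 0.0, 0.2, 0.2]

w = knockoff_statistics(z_orig, z_knock)
print(knockoff_plus_threshold(w, q=0.25))   # 2.0
print(select_and_rank(w, q=0.25))           # [0, 2, 1, 3, 4]
```

Sorting the selected indices by W gives the importance ordering mentioned in the response, while the threshold decides where the list is cut.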

Circularity Check

0 steps flagged

No circularity: method adapts knockoffs without reducing claims to fitted inputs or self-citations by construction

full rationale

The paper introduces a revisited knockoffs procedure for choosing the L1 penalty in regressions, claiming generality across response types and n < p regimes. No quoted equations or sections exhibit self-definitional loops (e.g., defining a quantity in terms of itself), fitted parameters renamed as predictions, or load-bearing self-citations that substitute for independent justification. The central adaptation is presented as an extension supported by experiments rather than forced by prior author results or ansatz smuggling. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

With only the abstract available, no specific free parameters, axioms, or invented entities can be identified from the provided text.

