On the Peaking Phenomenon of the Lasso in Model Selection
I briefly report on some unexpected results that I obtained when optimizing the model parameters of the Lasso. In simulations with varying observations-to-variables ratio n/p, I typically observe a strong peak in the test error curve at the transition point n/p = 1. This peaking phenomenon is well-documented in scenarios that involve the inversion of the sample covariance matrix, and as I illustrate in this note, it is also the source of the peak for the Lasso. The key problem is the parametrization of the Lasso penalty (as, e.g., in the current R package lars), and I present a solution in terms of a normalized Lasso parameter.
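The peak at n/p = 1 can be reproduced in a few lines for the covariance-inversion case the abstract refers to. The following is a minimal sketch, not the author's simulation: it uses minimum-norm least squares (via the pseudo-inverse) rather than the Lasso, and the isotropic Gaussian design, the true coefficient vector, the noise level, and the function name `peaking_curve` are all assumptions made for illustration.

```python
import numpy as np


def peaking_curve(p=50, ns=(25, 50, 100), reps=20, sigma=1.0, seed=0):
    """Average test error of minimum-norm least squares for several n, fixed p.

    Assumed setup (not from the paper): rows of X are i.i.d. N(0, I_p),
    y = X @ beta + noise, with ||beta||^2 = 1 and noise variance sigma^2.
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(p) / np.sqrt(p)  # hypothetical true coefficients
    errors = {}
    for n in ns:
        errs = []
        for _ in range(reps):
            X = rng.standard_normal((n, p))
            y = X @ beta + sigma * rng.standard_normal(n)
            # Pseudo-inverse solution: ordinary least squares when n >= p,
            # minimum-norm interpolating solution when n < p.
            beta_hat = np.linalg.pinv(X) @ y
            # For an isotropic Gaussian test point, the expected squared
            # prediction error is ||beta_hat - beta||^2 + sigma^2.
            errs.append(np.sum((beta_hat - beta) ** 2) + sigma ** 2)
        errors[n] = float(np.mean(errs))
    return errors


curve = peaking_curve()
for n, err in curve.items():
    print(f"n/p = {n / 50:.1f}: mean test error = {err:.2f}")
```

Under these assumptions the error at n = p should be far larger than on either side of it, because the smallest singular value of a square Gaussian design is close to zero and its inverse amplifies the noise.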
Forward citations
Cited by 1 Pith paper
- Feature Learning in Linear-Width Two-Layer Networks: Two vs. One Step of Gradient Descent
In the linear-width regime, the second GD step yields a spiked random matrix whose number of outliers is floor(alpha2 / (1/2 - alpha1)), and batch reuse enables learning directions with information exponent greater th...