
arxiv: 1205.5050 · v3 · pith:YU5QPQ2Mnew · submitted 2012-05-22 · stat.ME · math.ST · stat.ML · stat.TH

A lasso for hierarchical interactions

classification: stat.ME, math.ST, stat.ML, stat.TH
keywords: hierarchy, constraint, estimate, important, interaction, lasso, number, sparsity

We add a set of convex constraints to the lasso to produce sparse interaction models that honor the hierarchy restriction that an interaction only be included in a model if one or both variables are marginally important. We give a precise characterization of the effect of this hierarchy constraint, prove that hierarchy holds with probability one and derive an unbiased estimate for the degrees of freedom of our estimator. A bound on this estimate reveals the amount of fitting "saved" by the hierarchy constraint. We distinguish between parameter sparsity - the number of nonzero coefficients - and practical sparsity - the number of raw variables one must measure to make a new prediction. Hierarchy focuses on the latter, which is more closely tied to important data collection concerns such as cost, time and effort. We develop an algorithm, available in the R package hierNet, and perform an empirical study of our method.
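The distinction between parameter sparsity and practical sparsity can be illustrated with a small sketch. This is not the hierNet algorithm and does not use its API; the function names (`parameter_sparsity`, `practical_sparsity`, `obeys_hierarchy`) and the toy coefficients are hypothetical. It assumes a fitted model is summarized by a main-effect vector `beta` and a symmetric interaction matrix `theta` with zero diagonal, and it checks the "one or both" form of hierarchy stated in the abstract.

```python
import numpy as np

def parameter_sparsity(beta, theta):
    """Count nonzero coefficients: main effects plus each interaction pair once."""
    beta = np.asarray(beta, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # theta is assumed symmetric with zero diagonal, so the strict upper
    # triangle holds each interaction exactly once.
    return int(np.count_nonzero(beta) + np.count_nonzero(np.triu(theta, k=1)))

def practical_sparsity(beta, theta):
    """Count raw variables that must be measured to make a prediction."""
    beta = np.asarray(beta, dtype=float)
    nonzero = np.asarray(theta, dtype=float) != 0
    # A variable is needed if it has a main effect or appears in any interaction.
    used = (beta != 0) | nonzero.any(axis=0) | nonzero.any(axis=1)
    return int(used.sum())

def obeys_hierarchy(beta, theta):
    """True if every nonzero interaction has at least one nonzero main effect."""
    main = np.asarray(beta, dtype=float) != 0
    nonzero = np.asarray(theta, dtype=float) != 0
    allowed = main[:, None] | main[None, :]
    return bool(np.all(allowed[nonzero]))

def single_interaction(p, j, k, value):
    """Hypothetical helper: symmetric p x p matrix with one interaction (j, k)."""
    theta = np.zeros((p, p))
    theta[j, k] = theta[k, j] = value
    return theta

# Toy coefficients (illustrative only): variables 0 and 3 have main effects.
beta = np.array([1.5, 0.0, 0.0, -0.7])
theta_hier = single_interaction(4, 0, 1, 0.4)     # interacts with a main-effect variable
theta_nonhier = single_interaction(4, 1, 2, 0.4)  # neither variable has a main effect

hier_summary = (parameter_sparsity(beta, theta_hier),
                practical_sparsity(beta, theta_hier),
                obeys_hierarchy(beta, theta_hier))
nonhier_summary = (parameter_sparsity(beta, theta_nonhier),
                   practical_sparsity(beta, theta_nonhier),
                   obeys_hierarchy(beta, theta_nonhier))
print(hier_summary)     # (3, 3, True)
print(nonhier_summary)  # (3, 4, False)
```

Both toy models have three nonzero coefficients, but the one that violates hierarchy requires measuring a fourth raw variable. This is the sense in which hierarchy targets practical sparsity rather than parameter sparsity.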
