arxiv: 1901.04055 · v1 · pith:D7HNUGRI · submitted 2019-01-13 · 💻 cs.LG · stat.ML

Gradient Boosted Feature Selection

classification 💻 cs.LG stat.ML
keywords feature, selection, algorithm, boosted, gradient, data, features, four

Abstract

A feature selection algorithm should ideally satisfy four conditions: reliably extract relevant features; be able to identify non-linear feature interactions; scale linearly with the number of features and dimensions; allow the incorporation of known sparsity structure. In this work we propose a novel feature selection algorithm, Gradient Boosted Feature Selection (GBFS), which satisfies all four of these requirements. The algorithm is flexible, scalable, and surprisingly straightforward to implement, as it is based on a modification of Gradient Boosted Trees. We evaluate GBFS on several real-world data sets and show that it matches or outperforms other state-of-the-art feature selection algorithms. Yet it scales to larger data set sizes and naturally allows for domain-specific side information.
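The abstract says only that GBFS is a modification of Gradient Boosted Trees and does not spell out the modification. As a rough illustration of how boosting can double as feature selection, the sketch below fits depth-1 regression stumps to squared-loss residuals and charges a one-time cost the first time a feature is split on. The penalty rule, the function names `best_stump` and `boosted_feature_selection`, and the parameters `penalty` and `learning_rate` are all assumptions made for this example, not details taken from the abstract.

```python
import numpy as np


def best_stump(X, residual, used, penalty):
    """Find the depth-1 split with the lowest penalized squared error.

    Assumption for illustration: a feature not yet in `used` pays a
    fixed `penalty` on top of its squared error.
    """
    best = None
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        if values.size < 2:
            continue  # constant feature, nothing to split on
        # Candidate thresholds: midpoints between consecutive distinct values.
        thresholds = (values[:-1] + values[1:]) / 2.0
        for t in thresholds:
            left = X[:, j] <= t
            left_value = residual[left].mean()
            right_value = residual[~left].mean()
            sse = ((residual[left] - left_value) ** 2).sum()
            sse += ((residual[~left] - right_value) ** 2).sum()
            score = sse + (0.0 if j in used else penalty)
            if best is None or score < best[0]:
                best = (score, j, t, left_value, right_value)
    return best


def boosted_feature_selection(X, y, n_rounds=20, learning_rate=0.5, penalty=20.0):
    """Boost stumps on squared loss and return (selected features, predictions)."""
    pred = np.full(len(y), y.mean(), dtype=float)
    used = set()
    for _ in range(n_rounds):
        residual = y - pred  # negative gradient of the squared loss
        stump = best_stump(X, residual, used, penalty)
        if stump is None:
            break
        _, j, t, left_value, right_value = stump
        used.add(j)
        pred += learning_rate * np.where(X[:, j] <= t, left_value, right_value)
    return sorted(used), pred
```

Called on a feature matrix `X` and a target vector `y`, the function returns the sorted indices of the features the ensemble actually split on, together with the fitted predictions. A larger `penalty` makes the ensemble more reluctant to bring in a new feature, so it acts as the sparsity knob in this sketch.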
