Multi-Task Feature Learning Via Efficient l2,1-Norm Minimization

Jieping Ye; Jun Liu; Shuiwang Ji

arXiv: 1205.2631 · v1 · pith:BWUUFR2Tnew · submitted 2012-05-09 · 💻 cs.LG · cs.CV · stat.ML

Jun Liu, Shuiwang Ji, Jieping Ye

classification: cs.LG, cs.CV, stat.ML

keywords: feature, norm, euclidean, optimization, projection, computed, convex, joint

The problem of joint feature selection across a group of related tasks has applications in many areas, including biomedical informatics and computer vision. We consider the l2,1-norm regularized regression model for joint feature selection from multiple tasks, which can be derived in a probabilistic framework by assuming a suitable prior from the exponential family. One appealing feature of l2,1-norm regularization is that it encourages multiple predictors to share similar sparsity patterns. However, the resulting optimization problem is challenging to solve because the l2,1-norm regularizer is non-smooth. In this paper, we propose to accelerate the computation by reformulating the problem as two equivalent smooth convex optimization problems, which are then solved via Nesterov's method, an optimal first-order black-box method for smooth convex optimization. A key building block in solving the reformulations is the Euclidean projection. We show that the Euclidean projection for the first reformulation can be computed analytically, while the Euclidean projection for the second can be computed in linear time. Empirical evaluations on several data sets verify the efficiency of the proposed algorithms.
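To make the shared-sparsity claim concrete, here is a minimal sketch, assuming a weight matrix `W` whose rows are features and whose columns are tasks (a layout the abstract does not specify). The l2,1-norm is then the sum of the Euclidean norms of the rows. The helper names `l21_norm` and `row_shrink` are hypothetical. `row_shrink` is the standard row-wise soft-thresholding (proximal) operator for this norm, shown only to illustrate how whole rows are zeroed across all tasks at once; it is not one of the paper's two reformulations or their Euclidean projections.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (assumed: features x tasks)."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def row_shrink(W, tau):
    """Row-wise soft-thresholding: shrink each row's norm by tau, zeroing short rows.

    This is the standard proximal operator of tau * l2,1-norm, used here as an
    illustration only (an assumption, not the method described in the abstract).
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * W

W = np.array([[3.0, 4.0],   # strong feature, used by both tasks
              [0.3, 0.4],   # weak feature
              [0.0, 0.0]])  # unused feature

print(l21_norm(W))         # row norms are 5, 0.5 and 0
print(row_shrink(W, 1.0))  # the weak row is removed for every task jointly
```

Because the threshold acts on the norm of an entire row, a feature is either kept for all tasks or dropped for all tasks, which is the joint selection behavior the abstract describes.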

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Multi-label feature selection based on binary hashing learning and dynamic graph constraints
cs.LG 2025-03 unverdicted novelty 7.0

BHDG is a new multi-label feature selection approach that employs binary hashing for pseudo-labels and dynamic graphs, showing superior performance on benchmarks.