pith. sign in

arxiv: 0905.2078 · v1 · submitted 2009-05-13 · 🧮 math.ST · stat.TH

Sparse recovery in convex hulls via entropy penalization

classification 🧮 math.ST stat.TH
keywords lambdavarepsiloneqnarraymathbbbulletdistributionempiricalsparsity
0
0 comments X
read the original abstract

Let $(X,Y)$ be a random couple in $S\times T$ with unknown distribution $P$ and $(X_1,Y_1),...,(X_n,Y_n)$ be i.i.d. copies of $(X,Y).$ Denote $P_n$ the empirical distribution of $(X_1,Y_1),...,(X_n,Y_n).$ Let $h_1,...,h_N:S\mapsto [-1,1]$ be a dictionary that consists of $N$ functions. For $\lambda \in {\mathbb{R}}^N,$ denote $f_{\lambda}:=\sum_{j=1}^N\lambda_jh_j.$ Let $\ell:T\times {\mathbb{R}}\mapsto {\mathbb{R}}$ be a given loss function and suppose it is convex with respect to the second variable. Let $(\ell \bullet f)(x,y):=\ell(y;f(x)).$ Finally, let $\Lambda \subset {\mathbb{R}}^N$ be the simplex of all probability distributions on $\{1,...,N\}.$ Consider the following penalized empirical risk minimization problem \begin{eqnarray*}\hat{\lambda}^{\varepsilon}:={\mathop {argmin}_{\lambda\in \Lambda}}\Biggl[P_n(\ell \bullet f_{\lambda})+\varepsilon \sum_{j=1}^N\lambda_j\log \lambda_j\Biggr]\end{eqnarray*} along with its distribution dependent version \begin{eqnarray*}\lambda^{\varepsilon}:={\mathop {argmin}_{\lambda\in \Lambda}}\Biggl[P(\ell \bullet f_{\lambda})+\varepsilon \sum_{j=1}^N\lambda_j\log \lambda_j\Biggr],\end{eqnarray*} where $\varepsilon\geq 0$ is a regularization parameter. It is proved that the ``approximate sparsity'' of $\lambda^{\varepsilon}$ implies the ``approximate sparsity'' of $\hat{\lambda}^{\varepsilon}$ and the impact of ``sparsity'' on bounding the excess risk of the empirical solution is explored. Similar results are also discussed in the case of entropy penalized density estimation.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.