Model Building for Semiparametric Mixtures

Bruce G. Lindsay; Francesco Bartolucci; Ramani S. Pilla


An important yet difficult problem in fitting multivariate mixture models is determining the mixture complexity. We develop theory and a unified framework for finding the nonparametric maximum likelihood estimator of a multivariate mixing distribution and consequently estimating the mixture complexity. Multivariate mixtures provide a flexible approach to fitting high-dimensional data while offering data reduction through the number, location and shape of the component densities. The central principle of our method is to cast the mixture maximization problem in the concave optimization framework with finitely many linear inequality constraints and turn it into an unconstrained problem using a "penalty function". We establish the existence of parameter estimators and prove the convergence properties of the proposed algorithms. The role of a "sieve parameter" in reducing the dimensionality of mixture models is demonstrated. We derive analytical machinery for building a collection of semiparametric mixture models, including the multivariate case, via the sieve parameter. The performance of the methods is shown with applications to several data sets, including the cdc15 cell-cycle yeast microarray data.
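To make the idea of a nonparametric maximum likelihood estimator of a mixing distribution concrete, the following is a minimal sketch, not the penalty-function algorithm of the paper. It assumes a univariate normal location mixture with unit variance, restricts the mixing distribution to a fixed grid of candidate support points, and updates the weights with the standard EM fixed-point iteration. The function name `fit_grid_npmle`, the grid, and the simulated data are all hypothetical choices made for illustration. The returned gradient values are the usual directional derivatives of the log-likelihood toward each grid point; at a grid-restricted maximizer they should be at most about zero.

```python
import numpy as np

def fit_grid_npmle(x, grid, n_iter=500):
    """Estimate mixing weights on a fixed grid by EM (illustrative sketch).

    Assumes component densities N(theta, 1); this is an assumption of the
    example, not something taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # L[i, j] = density of observation x_i under component located at grid_j
    L = np.exp(-0.5 * (x[:, None] - grid[None, :]) ** 2) / np.sqrt(2 * np.pi)
    w = np.full(grid.size, 1.0 / grid.size)  # start from uniform weights
    for _ in range(n_iter):
        mix = L @ w                              # mixture density at each x_i
        w = w * (L / mix[:, None]).mean(axis=0)  # EM fixed-point update
    mix = L @ w
    # Directional derivative toward each grid point: mean_i L_ij / mix_i - 1
    gradient = (L / mix[:, None]).mean(axis=0) - 1.0
    return w, gradient

# Simulated two-component data (hypothetical example data)
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3, 1, 150), rng.normal(3, 1, 150)])
grid = np.linspace(-6, 6, 25)
weights, gradient = fit_grid_npmle(x, grid)
```

In this sketch, the grid points that retain non-negligible weight give a rough indication of the mixture complexity; how finely the grid is chosen affects that count, so it should be read as a heuristic rather than an estimate with guarantees.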

