
arxiv: 1401.2324 · v1 · pith:35ZETWF4new · submitted 2014-01-10 · 📊 stat.AP

Bayesian shrinkage methods for partially observed data with many predictors

classification 📊 stat.AP
keywords data, methods, Bayesian, covariates, large, measurements, missing

Motivated by the increasing use of and rapid changes in array technologies, we consider the prediction problem of fitting a linear regression relating a continuous outcome $Y$ to a large number of covariates $\mathbf{X}$, for example, measurements from current, state-of-the-art technology. For most of the samples, only the outcome $Y$ and surrogate covariates, $\mathbf{W}$, are available. These surrogates may be data from prior studies using older technologies. Owing to the dimension of the problem and the large fraction of missing information, a critical issue is appropriate shrinkage of model parameters for an optimal bias-variance trade-off. We discuss a variety of fully Bayesian and Empirical Bayes algorithms which account for uncertainty in the missing data and adaptively shrink parameter estimates for superior prediction. These methods are evaluated via a comprehensive simulation study. In addition, we apply our methods to a lung cancer data set, predicting survival time ($Y$) using qRT-PCR ($\mathbf{X}$) and microarray ($\mathbf{W}$) measurements.
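
To make the data structure in the abstract concrete, the following is a minimal sketch of the setup only, not the paper's Bayesian or Empirical Bayes algorithms. It assumes simulated data, a simple noisy-surrogate relationship between $\mathbf{W}$ and $\mathbf{X}$, hypothetical sample sizes, and a fixed ridge penalty `lam` standing in for adaptive shrinkage. It fills in the unobserved $\mathbf{X}$ with a single ridge-based prediction from $\mathbf{W}$, which, unlike the methods described above, ignores uncertainty in the missing data.

```python
import numpy as np

def ridge_fit(A, b, lam):
    """Ridge estimate (A'A + lam*I)^{-1} A'b; b may be a vector or a matrix."""
    p = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(p), A.T @ b)

rng = np.random.default_rng(0)
n, n_obs, p = 200, 40, 10  # hypothetical sizes: X is observed for 40 of 200 samples

# Simulated data (an assumption for illustration, not the lung cancer data set).
X = rng.normal(size=(n, p))                  # covariates from the newer technology
W = X + rng.normal(scale=0.5, size=(n, p))   # noisy surrogates from the older technology
beta_true = rng.normal(size=p)
Y = X @ beta_true + rng.normal(size=n)       # continuous outcome

obs = np.zeros(n, dtype=bool)
obs[:n_obs] = True                           # samples with X actually measured

# Step 1: on the complete cases, learn a shrunken linear map from W to X.
G = ridge_fit(W[obs], X[obs], lam=1.0)       # shape (p, p)

# Step 2: replace the unobserved rows of X with their predictions from W.
X_filled = X.copy()
X_filled[~obs] = W[~obs] @ G

# Step 3: shrunken regression of Y on the partially imputed covariates.
beta_hat = ridge_fit(X_filled, Y, lam=1.0)
```

In this sketch a larger `lam` pulls `beta_hat` toward zero, trading variance for bias; the fully Bayesian and Empirical Bayes approaches in the abstract instead choose the amount of shrinkage adaptively and propagate the imputation uncertainty.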

