Statistically efficient thinning of a Markov chain sampler

Art B. Owen

arxiv: 1510.07727 · v7 · pith:2EMZZR6Bnew · submitted 2015-10-27 · stat.CO · cs.LG · stat.ML



keywords: theta, efficiency, thinning, autocorrelations, chain, efficient, improve, markov


It is common to subsample Markov chain output to reduce the storage burden. Geyer (1992) shows that discarding $k-1$ out of every $k$ observations will not improve statistical efficiency, as quantified through variance in a given computational budget. That observation is often taken to mean that thinning MCMC output cannot improve statistical efficiency. Here we suppose that it costs one unit of time to advance a Markov chain and then $\theta>0$ units of time to compute a sampled quantity of interest. For a thinned process, that cost $\theta$ is incurred less often, so the chain can be advanced through more stages in the same budget. Here we provide examples to show that thinning will improve statistical efficiency if $\theta$ is large and the sample autocorrelations decay slowly enough. If the lag $\ell\ge1$ autocorrelations of a scalar measurement satisfy $\rho_\ell\ge\rho_{\ell+1}\ge0$, then there is always a $\theta<\infty$ at which thinning becomes more efficient for averages of that scalar. Many sample autocorrelation functions resemble those of first order autoregressive, AR(1), processes with $\rho_\ell =\rho^{|\ell|}$ for some $-1<\rho<1$. For an AR(1) process it is possible to compute the most efficient subsampling frequency $k$. The optimal $k$ grows rapidly as $\rho$ increases towards $1$. The resulting efficiency gain depends primarily on $\theta$, not $\rho$. Taking $k=1$ (no thinning) is optimal when $\rho\le0$. For $\rho>0$, $k=1$ is optimal if and only if $\theta \le (1-\rho)^2/(2\rho)$. The efficiency gain from thinning never exceeds $1+\theta$. This paper also gives efficiency bounds for autocorrelations bounded between those of two AR(1) processes.
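The AR(1) claims in the abstract can be checked numerically with a short sketch. It is not taken from the paper; it assumes the cost model stated above (one unit per chain step, $\theta$ units per evaluation, so each retained sample costs $k+\theta$) and the standard large-sample result that the variance of an average of an AR(1) sequence with lag-one autocorrelation $\rho^k$ is proportional to $(1+\rho^k)/(1-\rho^k)$. The function names `ar1_efficiency` and `optimal_k`, and the search cap `k_max`, are hypothetical choices made for this illustration.

```python
def ar1_efficiency(k: int, rho: float, theta: float) -> float:
    """Efficiency of keeping every k-th sample, relative to k = 1.

    Assumes a fixed budget, cost k + theta per retained sample, and
    AR(1) autocorrelations rho**|lag| (illustrative assumptions).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie in (-1, 1)")
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    rho_k = rho ** k  # lag-one autocorrelation of the thinned chain
    # Variance per unit budget is proportional to cost * (1 + r) / (1 - r).
    var_unthinned = (1.0 + theta) * (1.0 + rho) / (1.0 - rho)
    var_thinned = (k + theta) * (1.0 + rho_k) / (1.0 - rho_k)
    return var_unthinned / var_thinned


def optimal_k(rho: float, theta: float, k_max: int = 10_000) -> int:
    """Brute-force search for the most efficient k in 1..k_max.

    k_max is an arbitrary cap for this sketch; ties go to the smallest k.
    """
    return max(range(1, k_max + 1),
               key=lambda k: ar1_efficiency(k, rho, theta))


if __name__ == "__main__":
    for rho, theta in [(0.5, 0.2), (0.5, 0.3), (0.9, 1.0), (0.99, 5.0)]:
        k = optimal_k(rho, theta)
        gain = ar1_efficiency(k, rho, theta)
        threshold = (1.0 - rho) ** 2 / (2.0 * rho)
        print(f"rho={rho} theta={theta} threshold={threshold:.4f} "
              f"k*={k} gain={gain:.4f} bound={1.0 + theta}")
```

Under these assumptions, comparing $k=2$ with $k=1$ gives a gain above one exactly when $\theta > (1-\rho)^2/(2\rho)$, which agrees with the threshold in the abstract. The brute-force search is chosen for transparency rather than speed.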


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Weighted quantization using MMD: From mean field to mean shift via gradient flows
stat.ML 2025-02 unverdicted novelty 6.0

Derives MSIP algorithm from MMD gradient flows for weighted quantization, extending mean shift and relating to preconditioned gradient descent and Lloyd's clustering.