The square root rule for adaptive importance sampling

Art B. Owen; Yi Zhou


In adaptive importance sampling, and other contexts, we have $K>1$ unbiased and uncorrelated estimates $\hat\mu_k$ of a common quantity $\mu$. The optimal unbiased linear combination weights them inversely to their variances, but those weights are unknown and hard to estimate. A simple deterministic square root rule based on a working model that $\mathrm{Var}(\hat\mu_k)\propto k^{-1/2}$ gives an unbiased estimate of $\mu$ that is nearly optimal under a wide range of alternative variance patterns. We show that if $\mathrm{Var}(\hat\mu_k)\propto k^{-y}$ for an unknown rate parameter $y\in [0,1]$ then the square root rule yields the optimal variance rate with a constant that is too large by a factor of at most $9/8$ for any $0\le y\le 1$ and any number $K$ of estimates. Numerical work shows that the rule is similarly robust to some other patterns with mildly decreasing variance as $k$ increases.
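
The $9/8$ bound can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the square root rule means combination weights proportional to $k^{1/2}$ (the inverse of the working-model variance), and the function name `sqrt_rule_inefficiency` is hypothetical. It returns the ratio of the variance of the square root rule to the variance of the optimal inverse-variance combination when $\mathrm{Var}(\hat\mu_k)\propto k^{-y}$.

```python
import math


def sqrt_rule_inefficiency(K: int, y: float) -> float:
    """Variance of the square root rule divided by the optimal variance.

    Assumes K uncorrelated unbiased estimates with Var(mu_hat_k) = k**(-y)
    (the proportionality constant cancels in the ratio) and square root
    rule weights proportional to sqrt(k). Hypothetical helper for
    illustration only.
    """
    ks = range(1, K + 1)
    variances = [k ** (-y) for k in ks]

    # Square root rule: weights proportional to k**(1/2), normalized to sum to 1.
    raw = [math.sqrt(k) for k in ks]
    total = sum(raw)
    weights = [r / total for r in raw]

    # Variance of a weighted combination of uncorrelated estimates.
    var_sqrt_rule = sum(w * w * v for w, v in zip(weights, variances))

    # Optimal inverse-variance weighting attains 1 / sum(1 / variance_k).
    var_optimal = 1.0 / sum(1.0 / v for v in variances)

    return var_sqrt_rule / var_optimal


if __name__ == "__main__":
    for K in (2, 10, 100, 10_000):
        for y in (0.0, 0.25, 0.5, 0.75, 1.0):
            ratio = sqrt_rule_inefficiency(K, y)
            print(f"K={K:6d}  y={y:4.2f}  ratio={ratio:.6f}")
```

Under these assumptions the ratio equals 1 at $y=1/2$, where the working model is correct, and it is largest at the endpoints $y=0$ and $y=1$, where it approaches $9/8$ as $K$ grows.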

