The square root rule for adaptive importance sampling
In adaptive importance sampling, and other contexts, we have $K>1$ unbiased and uncorrelated estimates $\hat\mu_k$ of a common quantity $\mu$. The optimal unbiased linear combination weights them inversely to their variances, but those weights are unknown and hard to estimate. A simple deterministic square root rule based on a working model that $\mathrm{Var}(\hat\mu_k)\propto k^{-1/2}$ gives an unbiased estimate of $\mu$ that is nearly optimal under a wide range of alternative variance patterns. We show that if $\mathrm{Var}(\hat\mu_k)\propto k^{-y}$ for an unknown rate parameter $y\in [0,1]$, then the square root rule yields the optimal variance rate with a constant that is too large by a factor of at most $9/8$, for any $0\le y\le 1$ and any number $K$ of estimates. Numerical work shows that the rule is similarly robust to some other patterns with mildly decreasing variance as $k$ increases.
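As a rough illustration of the rule described in the abstract, the sketch below weights the $k$-th estimate proportionally to $k^{1/2}$, which is what inverse-variance weighting gives under the working model $\mathrm{Var}(\hat\mu_k)\propto k^{-1/2}$. It then compares the resulting variance with the optimal one when the true pattern is $\mathrm{Var}(\hat\mu_k)\propto k^{-y}$. The function names (`sqrt_rule_weights`, `combine`, `variance_ratio`) are hypothetical and chosen for this example; they do not come from the paper.

```python
import math


def sqrt_rule_weights(K):
    """Normalized weights proportional to sqrt(k) for k = 1..K (hypothetical helper)."""
    raw = [math.sqrt(k) for k in range(1, K + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def combine(estimates):
    """Combine uncorrelated unbiased estimates using the square root rule.

    The weights sum to one, so the combination stays unbiased.
    """
    weights = sqrt_rule_weights(len(estimates))
    return sum(w * m for w, m in zip(weights, estimates))


def variance_ratio(K, y):
    """Variance under the square root rule divided by the optimal variance,
    assuming Var(mu_hat_k) = k**(-y) (the proportionality constant cancels).
    """
    weights = sqrt_rule_weights(K)
    # Variance of a weighted sum of uncorrelated estimates.
    var_sqrt = sum(w * w * k ** (-y) for w, k in zip(weights, range(1, K + 1)))
    # Optimal inverse-variance weighting has variance 1 / sum_k (1 / Var_k).
    var_opt = 1.0 / sum(k ** y for k in range(1, K + 1))
    return var_sqrt / var_opt


if __name__ == "__main__":
    for y in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(y, variance_ratio(1000, y))
```

Under these assumptions the ratio equals one when $y=1/2$ (the working model is then correct) and should stay below $9/8$ at the endpoints $y=0$ and $y=1$, in line with the bound stated in the abstract.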