Unbiased Estimation of the Reciprocal Mean for Non-negative Random Variables

Dirk P. Kroese; Sandeep Juneja; Sarat Moka

arXiv: 1907.01843 · v1 · pith:PTIOGKODnew · submitted 2019-07-03 · 🧮 math.ST · math.PR · stat.TH

Pith reviewed 2026-05-25 09:34 UTC · model grok-4.3

classification 🧮 math.ST · math.PR · stat.TH

keywords unbiased estimation · reciprocal mean · Monte Carlo · ratio estimation · non-negative random variables · asymptotic equivalence · confidence intervals

The pith

An unbiased Monte Carlo estimator for the reciprocal mean is asymptotically equivalent to the biased maximum likelihood ratio estimator.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs and analyzes an unbiased estimator for β = 1/E[Z] when Z is a non-negative random variable. The estimator takes the product form β̂(w) = w/f_w(N) ∏_{i=1}^N (1 − w Z_i), where N is drawn from a chosen probability mass function f_w and w is a tuning parameter kept below 2β. The paper shows that the product of expected computation time and variance shrinks as w is reduced, even though the expected number of terms grows. It further establishes that the estimator is asymptotically equivalent to the ordinary biased ratio estimator based on sample means, which in turn permits construction of practical confidence intervals.

Core claim

The estimator β̂(w) remains unbiased for β = 1/E[Z] for any admissible w and any valid f_w. Its expected time-variance product decreases monotonically as w decreases. In addition, β̂(w) is asymptotically equivalent to the maximum-likelihood ratio estimator formed from the sample means of the Z_i, and this equivalence supports explicit confidence intervals.

What carries the argument

The product-form estimator β̂(w) = w/f_w(N) ∏_{i=1}^N (1 − w Z_i), which uses a random stopping time N to cancel the bias that would otherwise appear in a simple ratio of averages.
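
A short worked derivation may make the bias cancellation concrete. It is a sketch under assumptions not spelled out above: the Z_i are i.i.d. copies of Z, N is drawn independently of them, and the support of f_w is indexed from 0 so that N = 0 gives an empty product (with the support written as the positive integers, the same argument goes through with the product taken over N − 1 factors). Exchanging the sum and the expectation also needs an integrability condition that is not checked here.

```latex
% Condition on N = n and use independence of the Z_i:
\mathbb{E}\bigl[\widehat\beta(w) \mid N = n\bigr]
  = \frac{w}{f_w(n)} \prod_{i=1}^{n} \mathbb{E}[1 - w Z_i]
  = \frac{w}{f_w(n)}\,(1 - w\,\mathbb{E}Z)^{n}.

% Average over N: the pmf f_w(n) cancels, leaving a geometric series.
\mathbb{E}\,\widehat\beta(w)
  = \sum_{n \ge 0} f_w(n)\,\frac{w}{f_w(n)}\,(1 - w\,\mathbb{E}Z)^{n}
  = \frac{w}{1 - (1 - w\,\mathbb{E}Z)}
  = \frac{1}{\mathbb{E}Z} = \beta .

% The series converges iff |1 - w E[Z]| < 1, that is, iff 0 < w < 2 / E[Z] = 2 beta.
```

Read this way, the requirement w < 2β is the convergence condition of the geometric series.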

If this is right

For any fixed w the optimal choice of the distribution f_w is already known and can be used directly.
Reducing w improves the time-variance tradeoff even though more terms are generated on average.
Asymptotic equivalence to the maximum-likelihood estimator supplies a normal limit and therefore usable confidence intervals.
The same construction applies to any ratio of expectations that can be recast as a reciprocal mean.
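
The confidence-interval item can be illustrated with the textbook delta-method interval for the biased ratio estimator 1/Z̄. This is a sketch of that standard construction, not necessarily the interval derived in the paper; the function name `ratio_estimator_ci` and the default critical value 1.96 are choices made here.

```python
import math
import statistics


def ratio_estimator_ci(z_samples, z_crit=1.96):
    """Normal-approximation interval for beta = 1 / E[Z] (hypothetical helper).

    Uses the delta method: Var(1 / Zbar) is approximately sigma^2 / (n * mu^4),
    with mu and sigma replaced by the sample mean and sample standard deviation.
    """
    n = len(z_samples)
    mean_z = statistics.fmean(z_samples)
    sd_z = statistics.stdev(z_samples)
    beta_tilde = 1.0 / mean_z  # biased ratio estimator
    half_width = z_crit * sd_z / (mean_z ** 2 * math.sqrt(n))
    return beta_tilde, beta_tilde - half_width, beta_tilde + half_width


if __name__ == "__main__":
    estimate, lower, upper = ratio_estimator_ci([0.8, 1.3, 0.4, 2.1, 0.9, 1.6])
    print(f"estimate={estimate:.4f}  interval=({lower:.4f}, {upper:.4f})")
```

The interval is asymptotic, so its coverage is only approximate for small samples.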

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

In applications one could begin with a conservatively small w and increase it only until the observed time-variance product stops improving.
The product structure may extend to unbiased estimation of other nonlinear functionals of expectations by analogous exponential tilting or martingale corrections.
Numerical checks on exponential or uniform Z would directly verify the claimed monotonic improvement in the time-variance product.
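
The numerical check in the last item can be sketched in closed form for one special case. The setup is an assumption made here, not the paper's: Z is exponential with mean `mean_z`, N is independent of the Z_i and geometric on {0, 1, 2, ...} with pmf f(n) = (1 − q) q^n, and q is set to the square root of E[(1 − wZ)²], one choice that keeps the variance finite. Under these assumptions E[N] = q/(1 − q) and E[β̂²] = w² q / ((1 − q)(q − m2)), where m2 = E[(1 − wZ)²].

```python
import math


def time_variance_product(w, mean_z=1.0):
    """E[N] * Var(beta_hat(w)) for exponential Z and geometric N (assumed setup)."""
    beta = 1.0 / mean_z
    # In this example the variance is finite only for 0 < w < beta,
    # which is stricter than the unbiasedness condition w < 2 * beta.
    if not 0.0 < w < beta:
        raise ValueError("this sketch requires 0 < w < 1 / mean_z")
    # m2 = E[(1 - w Z)^2], using E[Z] = mean_z and E[Z^2] = 2 * mean_z^2.
    m2 = 1.0 - 2.0 * w * mean_z + 2.0 * (w * mean_z) ** 2
    q = math.sqrt(m2)  # assumed geometric parameter, satisfies m2 < q < 1
    expected_n = q / (1.0 - q)
    second_moment = w ** 2 * q / ((1.0 - q) * (q - m2))
    variance = second_moment - beta ** 2
    return expected_n * variance


if __name__ == "__main__":
    for w in (0.5, 0.3, 0.1, 0.05):
        print(f"w={w:<5} product={time_variance_product(w):.4f}")
```

In this example the product falls as w shrinks, in line with the claimed monotonic improvement; other distributions for Z or other choices of f_w would need their own check.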

Load-bearing premise

The tuning parameter w must remain strictly less than twice the unknown target value β.

What would settle it

Generate Z from a distribution with known β, compute the sample average of the estimator over many independent replications, and check whether this average equals β within Monte Carlo error when w is below 2β but deviates systematically once w reaches or exceeds 2β.
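
The replication experiment described above can be sketched as follows. The details are assumptions made for illustration: N is independent of the Z_i and geometric on {0, 1, 2, ...} with pmf f(n) = (1 − q) q^n, Z is exponential with mean 1 (so β = 1), and the helper names `sample_beta_hat` and `replicate_mean` are hypothetical.

```python
import math
import random


def sample_beta_hat(w, q, draw_z, rng):
    """One replication of w / f(N) * prod_{i=1}^{N} (1 - w * Z_i)."""
    # Draw N with P(N = n) = (1 - q) * q**n (assumed pmf, support starts at 0).
    n = 0
    while rng.random() < q:
        n += 1
    product = 1.0
    for _ in range(n):
        product *= 1.0 - w * draw_z(rng)
    pmf = (1.0 - q) * q ** n
    return w * product / pmf


def replicate_mean(w, q, draw_z, reps, seed=0):
    """Sample mean and standard error of the estimator over independent replications."""
    rng = random.Random(seed)
    values = [sample_beta_hat(w, q, draw_z, rng) for _ in range(reps)]
    mean = sum(values) / reps
    var = sum((v - mean) ** 2 for v in values) / (reps - 1)
    return mean, math.sqrt(var / reps)


if __name__ == "__main__":
    w = 0.2
    q = math.sqrt(1.0 - 2.0 * w + 2.0 * w ** 2)  # sqrt of E[(1 - w Z)^2] for Exp(1)
    mean, se = replicate_mean(w, q, lambda rng: rng.expovariate(1.0), reps=100_000)
    print(f"mean={mean:.4f}  standard error={se:.4f}  target beta=1")
```

For w at or above 2β the underlying geometric series has no finite sum, so the replication average should not be expected to settle near β. A finite variance additionally needs E[(1 − wZ)²] < q in this setup, which is why q is tied to w in the usage example.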

Original abstract

Many simulation problems require the estimation of a ratio of two expectations. In recent years Monte Carlo estimators have been proposed that can estimate such ratios without bias. We investigate the theoretical properties of such estimators for the estimation of $\beta = 1/\mathbb{E}\, Z$, where $Z \geq 0$. The estimator, $\widehat \beta(w)$, is of the form $w/f_w(N) \prod_{i=1}^N (1 - w\, Z_i)$, where $w < 2\beta$ and $N$ is any random variable with probability mass function $f_w$ on the positive integers. For a fixed $w$, the optimal choice for $f_w$ is well understood, but less so the choice of $w$. We study the properties of $\widehat \beta(w)$ as a function of $w$ and show that its expected time variance product decreases as $w$ decreases, even though the cost of constructing the estimator increases with $w$. We also show that the estimator is asymptotically equivalent to the maximum likelihood (biased) ratio estimator and establish practical confidence intervals.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper studies how the free parameter w affects an existing family of unbiased estimators for 1/E[Z], showing a time-variance tradeoff and asymptotic equivalence to the biased MLE.

The main point is that this work takes a known form of unbiased Monte Carlo estimator for the reciprocal mean β = 1/E[Z] and analyzes its behavior as a function of the tuning parameter w. The authors show that the expected time-variance product decreases as w gets smaller (even as construction cost rises), state the validity condition w < 2β explicitly, prove asymptotic equivalence to the usual biased ratio estimator, and outline practical confidence intervals.

The analysis of the w-dependence and the efficiency tradeoff is the clearest addition to prior unbiased ratio estimators. The derivations appear to rest on standard probability arguments and are internally consistent with the estimator definition given in the abstract. The condition on w is not hidden, and the paper positions itself as examining that dependence rather than claiming a parameter-free method.

The main limitation is that the contribution is incremental: the optimal f_w is described as already well understood, so the new material is mostly the w-specific results rather than a new estimator or broad methodological advance. Without numerical examples or implementation checks visible in the provided text, the practical payoff of the confidence intervals remains untested in the abstract.

This is for researchers working on Monte Carlo simulation and unbiased estimation of ratios in computational statistics. Someone already using or extending unbiased ratio estimators would find the w-tradeoff analysis worth reading. The theoretical claims are specific enough and the setup standard enough that it deserves peer review.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a family of unbiased Monte Carlo estimators for the reciprocal mean β = 1/E[Z] (Z ≥ 0) of the form β̂(w) = w/f_w(N) ∏_{i=1}^N (1 − w Z_i), where N has pmf f_w on the positive integers and the construction is valid for w < 2β. It analyzes the dependence of the estimator on the free parameter w, establishes that the expected time-variance product decreases as w decreases (despite increasing computational cost), proves asymptotic equivalence to the ordinary (biased) maximum-likelihood ratio estimator, and derives practical confidence intervals.

Significance. If the central claims hold, the work supplies a theoretically grounded approach to unbiased ratio estimation in simulation, together with explicit guidance on the time-variance tradeoff induced by w and a link to the standard biased estimator. The explicit validity condition w < 2β and the construction of confidence intervals are practical strengths that could be adopted in Monte Carlo applications requiring unbiasedness.

major comments (2)

[Abstract] The abstract asserts that the expected time-variance product decreases as w decreases, yet the precise definitions of 'time' (e.g., expected number of Z samples or wall-clock cost) and the variance term are not stated; without these, the claimed monotonicity cannot be verified from the given material.
[Abstract] The validity condition w < 2β is load-bearing for unbiasedness, but the manuscript must show how an implementer can select or adapt w when β is unknown; the current statement leaves open whether the procedure remains useful in the regime where only an upper bound on β is available.

minor comments (2)

[Abstract] Notation: the dependence of both the pmf f_w and the random variable N on w should be made explicit in the estimator definition to avoid ambiguity when w varies.
[Abstract] The phrase 'practical confidence intervals' is used without indicating whether they are asymptotic, exact, or bootstrap-based; a brief clarification would improve readability.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading of the manuscript and for the constructive comments. We address each major comment below and indicate the revisions we will make.

Referee: [Abstract] The abstract asserts that the expected time-variance product decreases as w decreases, yet the precise definitions of 'time' (e.g., expected number of Z samples or wall-clock cost) and the variance term are not stated; without these, the claimed monotonicity cannot be verified from the given material.

Authors: We agree that the abstract is brief and does not spell out the definitions. In the body of the paper, 'time' is the expected number of samples E[N] (equivalently the expected number of Z draws) and the variance term is Var(β̂(w)). The claimed monotonicity is proved in Theorem 3.2 for the product E[N]·Var(β̂(w)) as a function of w. We will revise the abstract to state these definitions explicitly and to point to the theorem. revision: yes

Referee: [Abstract] The validity condition w < 2β is load-bearing for unbiasedness, but the manuscript must show how an implementer can select or adapt w when β is unknown; the current statement leaves open whether the procedure remains useful in the regime where only an upper bound on β is available.

Authors: The manuscript states the condition w < 2β but does not provide explicit guidance on selecting w from data. When a lower bound on β is known, any w below twice that bound is safe. When only an upper bound on β is available, a conservative (sufficiently small) w must be chosen to guarantee the inequality; this preserves unbiasedness at the expense of higher expected runtime. We will add a short practical subsection on w-selection, including the conservative strategy for the upper-bound case, together with a brief numerical illustration. revision: partial

Circularity Check

0 steps flagged

No significant circularity

The paper derives properties of the unbiased estimator β̂(w) = w/f_w(N) ∏_{i=1}^N (1 - w Z_i) directly from its explicit functional form and the given condition w < 2β, using standard results on expectations, variances, and asymptotic equivalence to the MLE ratio estimator. No step reduces a claimed result to a fitted parameter renamed as a prediction, a self-definitional loop, or a load-bearing self-citation whose validity depends on the present work. The analysis of the w-dependence (variance-time product decreasing as w decreases) follows from the estimator definition without tautological substitution. The derivation remains self-contained against external probabilistic benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Based solely on abstract; the construction relies on existence of E[Z] and standard properties of products and expectations for non-negative random variables.

free parameters (1)

w
Tunable parameter satisfying w < 2β; controls bias removal and efficiency but must be chosen by the user.

axioms (1)

domain assumption: Z is a non-negative random variable with finite positive expectation
Required for β to be well-defined and for the product-form estimator to be unbiased.

pith-pipeline@v0.9.0 · 5731 in / 1266 out tokens · 34443 ms · 2026-05-25T09:34:17.071743+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

The estimator, β̂(w), is of the form w/f_w(N) ∏_{i=1}^N (1 − w Z_i), where w < 2β and N is any random variable with probability mass function f_w on the positive integers.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.