Minimax unbiased estimation for finite populations with bounded outcomes
Pith reviewed 2026-05-21 02:45 UTC · model grok-4.3
The pith
When each unit's outcome is confined to a known interval, the minimax unbiased estimator for the population total is a midpoint-adjusted Horvitz-Thompson estimator paired with independent sampling whose probabilities are proportional to the interval lengths.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For any sampling design with positive inclusion probabilities, a sharp lower bound exists on the worst-case squared error over all possible outcomes in the product of the intervals [a_i, b_i]. Equality holds if and only if the inclusion indicators are pairwise independent, and the estimator that attains the bound is the midpoint-differenced Horvitz-Thompson estimator. Solving the joint optimization problem shows that the minimax design samples each unit independently with probability min(1, c times the interval length) for a constant c chosen to meet the size constraint.
What carries the argument
The midpoint-differenced Horvitz-Thompson estimator together with a sampling design that makes inclusion indicators pairwise independent.
If this is right
- Any pairwise-independent design achieves the lower bound for its given inclusion probabilities.
- The estimator is admissible among unbiased affine-equivariant estimators.
- The construction extends Gabler's linear minimax result to the full class of design-unbiased estimators.
- The optimal inclusion probabilities are set to min(1, c(b_i - a_i)) with c chosen to satisfy the expected sample size.
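The last item leaves c implicit. One way to compute it, sketched here under the assumptions that every interval has positive length and that 0 < n <= N, is the usual capping iteration: set the probability of the longest remaining interval to 1 whenever the proportional rule would push it above 1, then rescale the rest. The function name `length_proportional_probs` is hypothetical, and the paper may determine c differently.

```python
def length_proportional_probs(a, b, n):
    """Return (c, pi) with pi_i = min(1, c * (b_i - a_i)) and sum(pi) = n.

    Assumes b_i > a_i for every unit and 0 < n <= N (illustrative restrictions).
    """
    d = [bi - ai for ai, bi in zip(a, b)]
    N = len(d)
    if N == 0 or min(d) <= 0 or not 0 < n <= N:
        raise ValueError("need positive interval lengths and 0 < n <= N")

    # Visit units from longest to shortest interval; the first k are capped at 1.
    order = sorted(range(N), key=lambda i: d[i], reverse=True)
    remaining = sum(d)  # total length of the uncapped units
    c = n / remaining
    for k in range(N):
        c = (n - k) / remaining
        # Stop once the longest uncapped unit has probability at most 1.
        if c * d[order[k]] <= 1 + 1e-12:
            break
        remaining -= d[order[k]]

    pi = [min(1.0, c * di) for di in d]
    return c, pi
```

For interval lengths 1, 2, 3, 10 and n = 2, the first pass gives c = 2/16, which would put the longest unit at 1.25, so that unit is capped and the second pass yields c = 1/6 with probabilities 1/6, 1/3, 1/2, 1.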
Where Pith is reading between the lines
- Practitioners can apply the length-proportional probabilities to obtain explicit worst-case guarantees in surveys where item values have known bounds.
- Poisson sampling with these probabilities draws units independently and therefore satisfies the required pairwise independence exactly, at the cost of a random sample size.
- The same rectangular-parameter-space argument may extend to minimax estimation of other functionals such as subpopulation totals.
Load-bearing premise
The possible values of the outcomes fill the entire rectangular region given by the product of the individual intervals, and only design-unbiased estimators are considered.
What would settle it
An unbiased estimator or design that achieves a strictly smaller worst-case squared error than the midpoint-differenced Horvitz-Thompson estimator under the proposed inclusion probabilities would falsify the claim that the bound is sharp and attained only by this construction.
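To make the benchmark concrete: under independent inclusion the midpoint-differenced estimator has variance sum_i (1 - pi_i)/pi_i * (y_i - m_i)^2, which is largest when every y_i sits at an endpoint of its interval. The sketch below evaluates that worst case for two designs on a toy population. The helper name and the numbers are illustrative assumptions. Identifying this value with the paper's bound relies on the attainment claim above rather than on the paper's own formula.

```python
def worst_case_variance_independent(a, b, pi):
    """Worst-case variance of the midpoint-differenced estimator when
    units are included independently with probabilities pi.

    Each |y_i - m_i| is at most (b_i - a_i) / 2, attained at either endpoint.
    """
    return sum(
        (1 - p) / p * (bi - ai) ** 2 / 4
        for ai, bi, p in zip(a, b, pi)
    )


# Toy population with interval lengths 1, 2, 3, 10 and expected sample size 2.
a = [0.0, 0.0, 0.0, 0.0]
b = [1.0, 2.0, 3.0, 10.0]

length_proportional = [1 / 6, 1 / 3, 1 / 2, 1.0]  # min(1, c * length) with c = 1/6
equal_probability = [0.5, 0.5, 0.5, 0.5]          # same expected sample size

risk_lp = worst_case_variance_independent(a, b, length_proportional)
risk_eq = worst_case_variance_independent(a, b, equal_probability)
```

On this toy population the length-proportional design gives 5.5 and the equal-probability design gives 28.5, so a competing strategy would need a worst-case squared error below 5.5 to count as a counterexample here.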
Original abstract
We study design-unbiased estimation of the finite-population total $\sum_{i=1}^N y_i$ when each outcome satisfies known bounds $y_i\in[a_i,b_i]$. For any sampling design with inclusion probabilities $\pi_i>0$, we prove a sharp lower bound on the worst-case squared error over the rectangular parameter space. This bound is attained if and only if the unit inclusion indicators are pairwise independent, in which case the minimax estimator is the midpoint-differenced Horvitz-Thompson estimator $\sum_{i=1}^N m_i+\sum_{i\in S}(y_i-m_i)/\pi_i$, with $m_i=(a_i+b_i)/{2}$. We then solve the joint design-and-estimation problem under the constraint $\sum_i \pi_i\le n$. We find that a minimax strategy samples units independently with probabilities $\pi_i^\ast=\min(1,c (b_i-a_i))$ where $c>0$ is chosen so that $\sum_i \pi_i^\ast=n$, and uses the midpoint-differenced estimator. This extends Gabler (1990)'s linear minimax result to the full class of design-unbiased estimators. We also show that the estimator is admissible among unbiased estimators and affine equivariant.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proves a sharp lower bound on the worst-case mean squared error of any design-unbiased estimator for the finite-population total when each unit outcome y_i lies in a known interval [a_i, b_i]. The bound is attained if and only if the inclusion indicators are pairwise independent; in that case the midpoint-differenced Horvitz-Thompson estimator achieves the bound. The authors then solve the joint design-estimation problem under the constraint that the expected sample size equals n, obtaining independent Bernoulli sampling with inclusion probabilities π_i^* = min(1, c(b_i - a_i)) for a suitable c, together with the same midpoint-differenced estimator. They further establish admissibility of this estimator within the class of unbiased and affine-equivariant estimators. The work extends Gabler (1990) from the linear subclass to all design-unbiased estimators.
Significance. If the derivations hold, the paper supplies a complete, explicit minimax solution for sampling and estimation under rectangular boundedness constraints. The separation of per-unit risk contributions via the product parameter space, the necessity of pairwise independence for attaining the bound, and the closed-form optimal design constitute a substantive advance in finite-population minimax theory. The admissibility result adds practical weight to the recommendation.
major comments (1)
- [Theorem 3.2] The necessity direction of the pairwise-independence characterization (that the lower bound is attained only when inclusions are pairwise independent) is load-bearing for the claim that the independent Bernoulli design is uniquely minimax. A concrete verification that the cross-term expectations vanish if and only if Cov(I_i, I_j) = 0 for i ≠ j would strengthen the argument.
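A small numerical check along the lines the referee asks for can be sketched as follows. It takes an arbitrary design given as a table of sample probabilities, evaluates the variance of the midpoint-differenced estimator at every vertex of the rectangle, and compares the maximum with the value the same marginals would give under independence. It examines only this one estimator, not the full class covered by the theorem, and the function names and the simple-random-sampling example are assumptions made for illustration.

```python
from itertools import product


def worst_case_variance(design, a, b):
    """design maps each sample (tuple of unit indices) to its probability.

    Returns (max variance over rectangle vertices, inclusion probabilities).
    """
    N = len(a)
    m = [(ai + bi) / 2 for ai, bi in zip(a, b)]
    pi = [sum(p for s, p in design.items() if i in s) for i in range(N)]
    worst = 0.0
    for y in product(*zip(a, b)):  # every vertex: each y_i is a_i or b_i
        est = {s: sum(m) + sum((y[i] - m[i]) / pi[i] for i in s) for s in design}
        mean = sum(design[s] * est[s] for s in design)
        var = sum(design[s] * (est[s] - mean) ** 2 for s in design)
        worst = max(worst, var)
    return worst, pi


def independent_benchmark(a, b, pi):
    """Worst-case variance if the same marginals came from independent inclusion."""
    return sum((1 - p) / p * (bi - ai) ** 2 / 4 for ai, bi, p in zip(a, b, pi))


# Simple random sampling of 2 units out of 3: inclusions are negatively correlated.
srs = {(0, 1): 1 / 3, (0, 2): 1 / 3, (1, 2): 1 / 3}
a = [-1.0, -1.0, -1.0]
b = [1.0, 1.0, 1.0]

worst, pi = worst_case_variance(srs, a, b)
benchmark = independent_benchmark(a, b, pi)
```

Here each pi_i is 2/3 and Cov(I_i, I_j) = -1/9. The worst case over vertices is 2, attained for example at y = (1, 1, -1), while the independent benchmark is 1.5, so the nonzero covariance costs 0.5 in this example.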
minor comments (3)
- [Section 2] The definition of the midpoint m_i = (a_i + b_i)/2 is used repeatedly but first appears only after the statement of the main lower-bound theorem; introducing it in the notation section would improve readability.
- [Section 4] The constant c in the optimal inclusion probabilities π_i^* is defined implicitly by the equation ∑ min(1, c(b_i - a_i)) = n. An explicit algorithm or closed-form expression for c (or a reference to one) would help readers implement the design.
- [References] The citation to Gabler (1990) is given only by year; the full bibliographic details should be supplied in the references.
Simulated Author's Rebuttal
We thank the referee for the careful reading and the recommendation of minor revision. We address the single major comment below.
Point-by-point responses
- Referee: [Theorem 3.2] The necessity direction of the pairwise-independence characterization (that the lower bound is attained only when inclusions are pairwise independent) is load-bearing for the claim that the independent Bernoulli design is uniquely minimax. A concrete verification that the cross-term expectations vanish if and only if Cov(I_i, I_j) = 0 for i ≠ j would strengthen the argument.
  Authors: We appreciate the suggestion to make the necessity direction more explicit. In the proof of Theorem 3.2 the worst-case MSE expands into a sum of per-unit variance terms plus cross terms of the form E[(I_i - π_i)(I_j - π_j)](y_i - m_i)(y_j - m_j)/(π_i π_j), where E[(I_i - π_i)(I_j - π_j)] = Cov(I_i, I_j). Because the parameter space is a product of intervals, the sign of each (y_k - m_k) can be chosen independently; consequently, whenever Cov(I_i, I_j) ≠ 0, some choice of y makes the cross term strictly positive. The cross term vanishes for every y precisely when the covariance is zero. We will insert a short remark immediately after the statement of Theorem 3.2 that isolates this direct calculation and thereby renders the if-and-only-if claim fully concrete.
  Revision: yes
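For orientation, the expansion the authors describe can be written out for the midpoint-differenced estimator itself. This is the standard variance formula for a Horvitz-Thompson-type sum under an arbitrary design, not a reproduction of the paper's proof, which covers all design-unbiased estimators. The symbols \hat T, \Delta_{ij}, and s_k are notation introduced here.

```latex
\hat T = \sum_{i=1}^N m_i + \sum_{i=1}^N I_i \,\frac{y_i - m_i}{\pi_i},
\qquad \Delta_{ij} = \operatorname{Cov}(I_i, I_j),

\operatorname{Var}(\hat T)
  = \sum_{i=1}^N \frac{1-\pi_i}{\pi_i}\,(y_i - m_i)^2
  + \sum_{i \neq j} \frac{\Delta_{ij}}{\pi_i \pi_j}\,(y_i - m_i)(y_j - m_j).

% At a vertex, y_k - m_k = s_k (b_k - a_k)/2 with s_k \in \{-1, +1\}:
\operatorname{Var}(\hat T)
  = \sum_{i=1}^N \frac{1-\pi_i}{\pi_i}\,\frac{(b_i - a_i)^2}{4}
  + \sum_{i \neq j} \frac{\Delta_{ij}}{\pi_i \pi_j}\,
    \frac{(b_i - a_i)(b_j - a_j)}{4}\, s_i s_j .
```

Averaged over independent uniform signs s_k, the second sum is zero, so its maximum over sign patterns is nonnegative. If some \Delta_{ij} is nonzero and the corresponding intervals have positive length, the sum is not identically zero in s and therefore takes a strictly positive value at some vertex.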
Circularity Check
No significant objection identified
Full rationale
The paper establishes its central minimax result via direct mathematical arguments: a sharp lower bound on worst-case squared error is derived over the rectangular product space of intervals [a_i, b_i], shown to be attained precisely when inclusion indicators are pairwise independent, and the midpoint-differenced Horvitz-Thompson estimator is identified as the unique attaining estimator for any fixed marginals π_i. The subsequent design optimization then selects the specific π_i^* = min(1, c(b_i - a_i)) that minimizes this bound subject to ∑π_i = n. These steps rely on explicit use of the product structure to separate per-unit risk contributions and extend the earlier linear-minimax result of Gabler (1990) without any self-referential definitions, fitted inputs renamed as predictions, or load-bearing self-citations. The derivation chain is therefore self-contained against the stated assumptions and external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Outcomes y_i lie in known fixed intervals [a_i, b_i] forming a rectangular parameter space.
- domain assumption Sampling designs have positive inclusion probabilities π_i > 0.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean: absolute_floor_iff_bare_distinguishability
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: rectangular parameter space Θ = ∏[a_i, b_i]
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Aggarwal, O. P. Ann. Math. Statist.
- [2] Bickel, P. J. and Lehmann, E. L. Ann. Statist.
- [3] Cassel, C. M. and S. Some results on generalized difference estimation and generalized regression estimation for finite populations. 1976.
- [4]
- [5] Deville, J.-C. and S. Calibration estimators in survey sampling. 1992.
- [6]
- [7] Godambe, V. P. J. Roy. Statist. Soc. B.
- [8] Godambe, V. P. and Joshi, V. M. Ann. Math. Statist.
- [9] Horvitz, D. G. and Thompson, D. J. J. Amer. Statist. Assoc.
- [10] The best strategy for estimating the mean of a finite population. The Annals of Statistics, 1979.
- [11] Minimax estimation in simple random sampling. In Statistics and probability: essays in honor of C.R. Rao. North-Holland Publishing Company.
- [12] A conditional minimax approach in survey sampling. Metrika, 1988.
- [13] Asymptotic analysis of minimax strategies in survey sampling. The Annals of Statistics, 1989.