arxiv: 2603.14561 · v4 · submitted 2026-03-15 · 📊 stat.ME · math.ST · stat.TH

Refined Inference for Asymptotically Linear Estimators with Non-Negligible Second-Order Remainders

Lin Li, Pengcheng Wu

Pith reviewed 2026-05-15 11:15 UTC · model grok-4.3

classification 📊 stat.ME · math.ST · stat.TH

keywords asymptotically linear estimators · von Mises expansion · second-order remainder · sandwich variance · leave-one-out jackknife · pairs bootstrap · clustered data · near-boundary regime

The pith

When the second-order remainder adds non-negligible variance to asymptotically linear estimators, the sandwich variance underestimates total sampling variability but the leave-one-out jackknife and pairs bootstrap recover it.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Semiparametric estimators that admit a von Mises expansion usually reduce inference to the influence-function variance. This reduction holds only when the second-order remainder is negligible in variance, which is stricter than the usual product-rate condition for asymptotic linearity. In the near-boundary regime where the remainder contributes meaningful variance, the standard sandwich estimator underestimates the total variance and Wald intervals undercover. The paper derives a finite-sample variance decomposition that isolates the remainder contribution and shows that the leave-one-out jackknife (via self-normalization) and pairs cluster bootstrap (via Mallows-2 consistency) can estimate the full variance. This matters for estimators used in causal inference and stepped-wedge trials that often operate near such boundaries.
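
To make the distinction concrete, the expansion can be written in generic notation. This is a sketch rather than the paper's own statement: the symbols $\hat{\psi}$, $\psi_0$, $\phi$, and $O_i$ are assumed here for illustration, while the bilinear form of the remainder follows the passage quoted from the paper further down this page.

```latex
\hat{\psi} - \psi_0 = \frac{1}{n}\sum_{i=1}^{n} \phi(O_i) + R_{\rm rem},
\qquad
R_{\rm rem} = \int (\hat{\eta}_1 - \eta_{01})(\hat{\eta}_2 - \eta_{02})\,dP_0 + o_P(n^{-1/2}).

% Cauchy--Schwarz gives the usual product-rate bound on the bilinear term:
\left| \int (\hat{\eta}_1 - \eta_{01})(\hat{\eta}_2 - \eta_{02})\,dP_0 \right|
\le \|\hat{\eta}_1 - \eta_{01}\|_{L_2(P_0)} \, \|\hat{\eta}_2 - \eta_{02}\|_{L_2(P_0)}.
```

If the product of the two nuisance rates is $o_P(n^{-1/2})$, the bound makes $R_{\rm rem}$ negligible in probability, which is what asymptotic linearity needs. That statement controls the size of the remainder, not its variance, which is why variance negligibility is the stricter requirement.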

Core claim

The paper claims that a finite-sample variance decomposition separates the influence-function variance from the variance contributed by the second-order remainder in the von Mises expansion. In the near-boundary regime where the remainder variance is non-negligible, this decomposition explains why the sandwich estimator fails to capture total variance. The leave-one-out jackknife achieves consistency for the total variance through self-normalization, while the pairs cluster bootstrap does so under a Mallows-2 consistency condition. For clustered data an analytic expression quantifies how intra-cluster correlation amplifies the gap between sandwich and total variance.
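
The idea behind such a decomposition can be illustrated with an elementary identity. The block below is a generic sketch under an i.i.d. assumption, not the paper's theorem: it assumes the estimator is written as $\hat{\psi} - \psi_0 = \frac{1}{n}\sum_i \phi(O_i) + R_{\rm rem}$ with influence function $\phi$, and the paper's own decomposition may group or bound the terms differently, particularly for clustered data.

```latex
\mathrm{Var}(\hat{\psi})
= \underbrace{\frac{1}{n}\,\mathrm{Var}\{\phi(O)\}}_{\text{influence-function part}}
+ \underbrace{\mathrm{Var}(R_{\rm rem})
  + 2\,\mathrm{Cov}\!\Big(\tfrac{1}{n}\textstyle\sum_{i=1}^{n}\phi(O_i),\; R_{\rm rem}\Big)}_{\text{remainder part}}.
```

A sandwich estimator targets only the first term, so whenever the remainder part is positive and of comparable order, it falls short of the total.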

What carries the argument

Finite-sample variance decomposition separating influence-function and remainder components, with self-normalization for jackknife consistency and Mallows-2 condition for bootstrap consistency.
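
A minimal sketch of the leave-one-out jackknife variance may help fix ideas. The function names `jackknife_variance` and `influence_variance` are hypothetical, and the sample mean is used only because its influence-function variance is known in closed form; this is not the paper's estimator. For clustered data, the first axis of `data` would index whole clusters rather than single observations.

```python
import numpy as np

def jackknife_variance(data, estimator):
    """Leave-one-out jackknife variance of estimator(data).

    The first axis of `data` indexes independent units
    (observations, or whole clusters for clustered data).
    """
    data = np.asarray(data)
    n = data.shape[0]
    # Recompute the estimator with each unit deleted in turn.
    loo = np.array([estimator(np.delete(data, i, axis=0)) for i in range(n)])
    # Standard jackknife scaling: (n - 1) / n times the sum of squared deviations.
    return (n - 1) / n * np.sum((loo - loo.mean()) ** 2)

def influence_variance(data):
    """Plug-in influence-function variance for the sample mean: s^2 / n."""
    data = np.asarray(data, dtype=float)
    return data.var(ddof=1) / data.shape[0]

rng = np.random.default_rng(0)
x = rng.normal(size=50)
v_jk = jackknife_variance(x, np.mean)
v_if = influence_variance(x)
```

For an exactly linear estimator such as the mean, the two quantities coincide up to rounding. The jackknife differs from the influence-function variance only when the estimator has curvature, which is the situation the remainder term describes.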

Load-bearing premise

The second-order remainder contributes non-negligible variance to the estimator's sampling variability, together with the regularity conditions required for jackknife self-normalization and bootstrap Mallows-2 consistency.

What would settle it

A simulation in which the jackknife or bootstrap variance estimator shows no coverage improvement over the sandwich estimator even after the remainder variance is made deliberately non-negligible would falsify the practical claim.
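
A toy version of such a check can be sketched as follows. Everything here is an assumption made for illustration: the estimator `toy_estimate` adds a bilinear term (a product of two sample means with true mean zero) to a sample mean, and the parameter `scale` controls how much variance that term contributes. It is not the paper's design, which uses a surrogate-assisted targeted learning estimator in stepped-wedge trials.

```python
import numpy as np

def toy_estimate(d):
    # Columns of d are (x, y, z). The target is E[x]; mean(y) * mean(z) acts
    # as a bilinear second-order remainder because E[y] = E[z] = 0.
    return d[:, 0].mean() + d[:, 1].mean() * d[:, 2].mean()

def jackknife_var(d, est):
    n = d.shape[0]
    loo = np.array([est(np.delete(d, i, axis=0)) for i in range(n)])
    return (n - 1) / n * np.sum((loo - loo.mean()) ** 2)

def coverage_experiment(n=20, scale=3.0, reps=500, seed=0):
    """Empirical coverage of Wald intervals for E[x] = 0 under two variance estimates."""
    rng = np.random.default_rng(seed)
    z975 = 1.959963984540054  # standard normal 97.5% quantile
    hits = {"influence": 0, "jackknife": 0}
    ratios = []
    for _ in range(reps):
        d = rng.normal(size=(n, 3)) * np.array([1.0, scale, scale])
        theta = toy_estimate(d)
        v_if = d[:, 0].var(ddof=1) / n  # first-order (influence-function) part only
        v_jk = jackknife_var(d, toy_estimate)
        hits["influence"] += int(abs(theta) <= z975 * np.sqrt(v_if))
        hits["jackknife"] += int(abs(theta) <= z975 * np.sqrt(v_jk))
        ratios.append(v_jk / v_if)
    cover = {k: v / reps for k, v in hits.items()}
    return cover, float(np.mean(ratios))
```

Calling `coverage_experiment(n=20, scale=3.0)` returns the two coverage rates and the average variance ratio. Setting `scale=0.0` removes the bilinear term, so the ratio should sit at 1 and the two intervals should behave alike, which gives a built-in sanity check.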

Original abstract

Semiparametric estimators admitting a von Mises expansion often reduce inference to the influence-function variance. This reduction is justified when the second-order remainder is negligible in variance, a condition that is stronger than the usual product-rate requirement guaranteeing classical asymptotic linearity. When the remainder contributes non-negligible variance, the standard sandwich can underestimate the total sampling variance and Wald intervals can undercover; we call this the \emph{near-boundary regime}. We derive a finite-sample variance decomposition separating influence-function and remainder components, give a practical characterization of when sandwich variance can fail, and show that the leave-one-out jackknife and pairs cluster bootstrap can estimate the total variance under explicit regularity conditions. For the jackknife, consistency follows from a self-normalization argument; for the bootstrap, we work under a Mallows-2 consistency condition. An analytic expression for the amplification of the sandwich gap by intra-cluster correlation is derived for clustered data. A simulation study using a surrogate-assisted targeted learning estimator in stepped-wedge cluster-randomized trials illustrates the regime: the variance ratio $\hat{V}_{\rm JK}/\hat{V}_{\rm Sand}$ is 1.14--1.38 and persistent across cluster counts, and the refined procedures substantially improve coverage.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

This paper gives a practical way to fix variance underestimation for semiparametric estimators when second-order remainders add non-negligible variance, using jackknife or bootstrap with explicit conditions.

The main thing to know is that this paper handles cases where the second-order remainder in von Mises expansions for asymptotically linear estimators contributes real variance, so the usual sandwich underestimates total sampling variance and intervals undercover. They call this the near-boundary regime and focus on clustered settings like stepped-wedge trials. They derive a finite-sample variance decomposition that separates the influence-function term from the remainder term, characterize when the sandwich fails, and give an analytic factor showing how intra-cluster correlation amplifies the gap. Consistency for the leave-one-out jackknife follows from self-normalization, and for the pairs cluster bootstrap from a Mallows-2 condition. The simulation with a surrogate-assisted targeted learning estimator reports variance ratios of 1.14-1.38 and better coverage.

The work does well by stating the regularity conditions explicitly and showing concrete numbers in the simulation, which makes the refinement usable rather than purely theoretical. The decomposition and amplification expression are clear additions relative to standard expansions. Soft spots are limited: the conditions for consistency are stated but left for users to verify in new applications, and the simulation covers only one estimator type, though it illustrates the regime effectively. No internal contradictions appear between the decomposition and the consistency arguments.

This is for statisticians working on causal inference or clinical trials with semiparametric methods in clustered data who have seen coverage shortfalls with sandwich variance. Readers already using targeted learning or similar will get direct value from the jackknife and bootstrap options. It deserves serious peer review because the problem is practical, the proposal is focused, and the supporting evidence is there.

Referee Report

1 major / 1 minor

Summary. The paper claims that for semiparametric estimators admitting a von Mises expansion, the standard sandwich variance can underestimate total sampling variance when the second-order remainder contributes non-negligibly (the near-boundary regime), leading to undercovering Wald intervals. It derives a finite-sample variance decomposition isolating influence-function and remainder components, provides a practical characterization of sandwich failure, and shows that the leave-one-out jackknife (via self-normalization) and pairs cluster bootstrap (under Mallows-2 consistency) consistently estimate the total variance under explicit regularity conditions. An analytic expression for amplification of the sandwich gap by intra-cluster correlation is derived for clustered data. A simulation using a surrogate-assisted targeted learning estimator in stepped-wedge cluster-randomized trials reports variance ratios of 1.14-1.38 and improved coverage.

Significance. If the decomposition and consistency results hold, the work provides a targeted refinement to asymptotic inference for estimators where classical product-rate conditions are satisfied but the remainder affects variance. The explicit conditions (self-normalization for jackknife; Mallows-2 for bootstrap) and the closed-form expression for clustered data are strengths, as is the simulation evidence of measurable variance inflation and coverage gains in a relevant application. This addresses a practical gap in semiparametric inference without requiring stronger assumptions than standard asymptotic linearity.
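
For readers unfamiliar with the pairs cluster bootstrap mentioned above, a minimal sketch follows. The function name `pairs_cluster_bootstrap_variance` and its parameters are hypothetical; the sketch resamples whole clusters with replacement and keeps every row of each drawn cluster, and it says nothing about the Mallows-2 condition under which the paper establishes consistency.

```python
import numpy as np

def pairs_cluster_bootstrap_variance(values, cluster_ids, estimator, n_boot=500, seed=0):
    """Bootstrap variance of estimator(values), resampling whole clusters with replacement."""
    values = np.asarray(values)
    cluster_ids = np.asarray(cluster_ids)
    rng = np.random.default_rng(seed)
    labels = np.unique(cluster_ids)
    # Pre-compute the row indices that belong to each cluster.
    rows = [np.flatnonzero(cluster_ids == c) for c in labels]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Draw as many clusters as were observed, with replacement,
        # and keep all rows of every drawn cluster together.
        picked = rng.integers(0, len(rows), size=len(rows))
        idx = np.concatenate([rows[j] for j in picked])
        stats[b] = estimator(values[idx])
    return stats.var(ddof=1)
```

A call such as `pairs_cluster_bootstrap_variance(y, cluster, np.mean)` treats the cluster, not the row, as the independent unit, so within-cluster correlation is carried into every bootstrap replicate.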

major comments (1)

Simulation study: the abstract reports variance ratios of 1.14-1.38 and states that refined procedures substantially improve coverage, but the central claim of practical utility would be strengthened by reporting the actual coverage probabilities (e.g., for nominal 95% intervals) and the number of Monte Carlo replications used; without these, the magnitude of improvement in the near-boundary regime remains qualitative.
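
The reason the replication count matters can be sketched with the binomial Monte Carlo standard error of an empirical coverage rate. The helper names below are hypothetical and the calculation is the textbook one, not something taken from the paper.

```python
import math

def coverage_with_mc_error(covered_flags):
    """Empirical coverage and its binomial Monte Carlo standard error."""
    r = len(covered_flags)
    p = sum(bool(f) for f in covered_flags) / r
    return p, math.sqrt(p * (1 - p) / r)

def replications_needed(nominal=0.95, target_se=0.005):
    """Replications so the Monte Carlo SE at the nominal level is at most target_se."""
    return math.ceil(nominal * (1 - nominal) / target_se ** 2)
```

At a nominal level of 0.95, a Monte Carlo standard error of half a percentage point requires roughly 1,900 replications, so a coverage difference of one or two points is only interpretable once the replication count is stated.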

minor comments (1)

The introduction of the 'near-boundary regime' is clear, but a short paragraph contrasting it with standard higher-order asymptotic expansions (e.g., Edgeworth or von Mises remainder bounds) would help readers situate the contribution relative to existing literature on remainder terms.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the positive evaluation and the constructive suggestion regarding the simulation study. We address the comment below and will revise the manuscript accordingly.

Referee: Simulation study: the abstract reports variance ratios of 1.14-1.38 and states that refined procedures substantially improve coverage, but the central claim of practical utility would be strengthened by reporting the actual coverage probabilities (e.g., for nominal 95% intervals) and the number of Monte Carlo replications used; without these, the magnitude of improvement in the near-boundary regime remains qualitative.

Authors: We agree that reporting the empirical coverage probabilities for nominal 95% intervals and the number of Monte Carlo replications would strengthen the presentation of the simulation results. In the revised manuscript we will expand the simulation section to include a table (or inline summary) of the observed coverage rates for the sandwich, jackknife, and bootstrap procedures, and we will explicitly state the number of replications used.
revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

The derivation begins from the standard von Mises expansion for asymptotically linear estimators and produces an explicit finite-sample variance decomposition that isolates the influence-function term from the second-order remainder term. Consistency of the leave-one-out jackknife is obtained via a self-normalization argument and consistency of the pairs cluster bootstrap is obtained under a stated Mallows-2 condition; both sets of regularity conditions are written out explicitly rather than being fitted or defined in terms of the target variance. No step renames a fitted quantity as a prediction, imports a uniqueness theorem from the authors' prior work, or smuggles an ansatz through self-citation. The simulation merely illustrates the regime already characterized by the decomposition, so the central claims remain independent of the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The claims rest on the existence of a von Mises expansion with a non-negligible second-order remainder, plus standard regularity conditions for jackknife and bootstrap consistency; no new free parameters or invented entities are introduced.

axioms (2)

domain assumption: The estimator admits a von Mises expansion whose second-order remainder is non-negligible in variance.
Invoked in the opening paragraph to define the near-boundary regime.
domain assumption: Regularity conditions for jackknife self-normalization and Mallows-2 bootstrap consistency hold.
Required for the consistency statements in the abstract.

pith-pipeline@v0.9.0 · 5523 in / 1491 out tokens · 68481 ms · 2026-05-15T11:15:43.839135+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
finite-sample variance decomposition separating influence-function and remainder components... near-boundary regime... $R_{\rm rem} = \int (\hat{\eta}_1 - \eta_{01})(\hat{\eta}_2 - \eta_{02})\,dP_0 + o_P(n^{-1/2})$

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear
standard bilinear structure... product-rate boundary

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.