Subsample-Based Estimation under Dynamic Contamination

Rickard Sandberg; Yukai Yang

arXiv: 2604.17676 · v3 · submitted 2026-04-20 · stat.ME · econ.EM · math.ST · stat.TH

Pith reviewed 2026-05-12 02:03 UTC · model grok-4.3

classification stat.ME · econ.EM · math.ST · stat.TH

keywords subsample estimation · dynamic contamination · time series · residual propagation · patch removal · consistency · robust estimation

The pith

Simply removing known contaminated points leaves subsample estimators inconsistent in dynamic time series models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that in dynamic time series models, even exact knowledge of contamination locations and their removal fails to recover the clean-data estimation objective. Contamination spreads through the residual filter, so the subsample version of the criterion remains distorted and the estimator stays biased for the uncontaminated parameter. The authors introduce a patch removal operator that transforms index sets to match this propagation. Under general high-level conditions the operator leaves the estimator asymptotically identical to the original under clean data while restoring consistency when contamination is present. The result covers a wide class of residual-based estimators and requires no model for how the contamination arises.

Core claim

Subsample-based estimators are generically inconsistent for the clean-data parameter whenever contamination propagates through transformations that enter the estimation criterion, with dynamic time series models as the leading case. This structural incompatibility between pointwise subsampling and residual propagation is addressed by a patch removal operator that adjusts index sets compatibly with the propagation, ensuring the transformed estimator is asymptotically unchanged under the uncontaminated model and consistent under contamination.
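
The claim can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: a Gaussian AR(1) with ϕ = 0.6, additive outliers of fixed size ζ = 5 that hit each observation independently with probability α = 0.05, a no-intercept least-squares criterion, and a patch rule that also discards the residual one step after each contaminated point. The helper names `simulate_ar1` and `ols_ar1` are hypothetical.

```python
import random

def simulate_ar1(n, phi, rng):
    """Clean AR(1) path x_t = phi * x_{t-1} + e_t with standard normal innovations."""
    x = [0.0] * n
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.gauss(0.0, 1.0)
    return x

def ols_ar1(y, keep):
    """Least-squares slope of y_t on y_{t-1}, using only residual indices t in `keep`."""
    num = sum(y[t] * y[t - 1] for t in keep)
    den = sum(y[t - 1] ** 2 for t in keep)
    return num / den

rng = random.Random(0)
n, phi, alpha, zeta = 50_000, 0.6, 0.05, 5.0

x = simulate_ar1(n, phi, rng)
contaminated = {t for t in range(n) if rng.random() < alpha}
# Additive outliers: the observed value is shifted, the latent dynamics are not.
y = [x[t] + (zeta if t in contaminated else 0.0) for t in range(n)]

all_idx = range(1, n)
# Pointwise oracle removal: drop residuals whose own time index is contaminated.
pointwise = [t for t in all_idx if t not in contaminated]
# Patch removal (assumed form): also drop residuals whose lagged regressor is contaminated.
patch = [t for t in pointwise if (t - 1) not in contaminated]

phi_full = ols_ar1(y, all_idx)
phi_pointwise = ols_ar1(y, pointwise)
phi_patch = ols_ar1(y, patch)
print(f"full: {phi_full:.3f}  pointwise: {phi_pointwise:.3f}  patch: {phi_patch:.3f}")
```

Under these assumptions the pointwise estimate should settle well below 0.6, because the contaminated lagged regressor survives pointwise removal, while the patch estimate should land close to 0.6.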

What carries the argument

The patch removal operator, a propagation-compatible transformation of index sets that removes both contaminated observations and the downstream effects they induce in the residual structure.
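
As a rough sketch of what such an index-set transformation might look like, the function below removes each contaminated index together with a fixed number of downstream indices. The function name `patch_remove`, the `horizon` parameter, and the rule that contamination at time s reaches residuals s through s + horizon are assumptions made for illustration; the paper's operator may be defined differently.

```python
def patch_remove(index_set, contaminated, horizon):
    """Drop every index t whose residual can be touched by a contaminated point.

    Assumption (not from the paper): a contaminated observation at time s
    affects the residuals at times s, s + 1, ..., s + horizon, as it would
    for a finite-order autoregressive filter of order `horizon`.
    """
    affected = set()
    for s in contaminated:
        affected.update(range(s, s + horizon + 1))
    return [t for t in index_set if t not in affected]

kept = patch_remove(range(10), contaminated={3, 7}, horizon=2)
print(kept)  # [0, 1, 2, 6]
```

With `horizon=0` the function reduces to ordinary pointwise removal, so the same code covers both the failing subsample and the adjusted one.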

If this is right

Consistency under contamination is restored for any residual-based estimator satisfying the high-level conditions.
The transformed estimator coincides with the usual one asymptotically when no contamination occurs.
No parametric model of the contamination process is needed for the consistency result.
The same incompatibility arises in any setting where contamination enters the criterion through a propagating transformation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same patch adjustment idea may apply to other models with recursive residuals, such as ARMA or state-space representations.
Robust time-series procedures that rely on simple deletion of outliers could be improved by incorporating propagation effects.
Extensions to nonlinear or multivariate dynamics would be natural next steps for the approach.

Load-bearing premise

The estimator computed on patch-removed index sets must be asymptotically equivalent to the original estimator on uncontaminated data.

What would settle it

If the patch removal estimator has a different limiting distribution from the standard estimator when applied to clean data, the claimed asymptotic equivalence would be false.

Original abstract

This paper studies a structural failure of subsample-based estimation in dynamic time series models. Even under oracle knowledge of contamination locations, removing contaminated observations does not restore the uncontaminated objective. In such settings, contamination propagates through the residual filter and distorts the estimation criterion. As a result, subsample-based estimators are generically inconsistent for the clean-data parameter. We characterise this failure as a structural incompatibility between pointwise subsampling and residual propagation. More generally, the failure arises whenever contamination propagates through transformations that enter the estimation criterion, with dynamic time series models as a leading example. To address it, we propose a propagation-compatible transformation of index sets via a patch removal operator. Under general high-level conditions, this transformation leaves the estimator asymptotically unchanged under the uncontaminated model while restoring consistency under contamination. The results apply to a broad class of residual-based estimators and do not rely on modelling the contamination process.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper correctly flags that subsampling breaks in dynamic models because contamination leaks through residuals, but the patch removal fix depends on high-level conditions whose generic validity is not yet shown.

The core observation is that in dynamic time series, even oracle removal of contaminated points fails to recover the clean-data estimator. Residuals carry the distortion forward into the criterion, so ordinary subsampling is generically inconsistent for the uncontaminated parameter. The authors characterize this as a structural mismatch between pointwise index selection and the propagation that enters most residual-based estimators, and they introduce a patch removal operator to adjust the index sets accordingly. Under their stated high-level conditions the operator leaves the uncontaminated asymptotics unchanged while restoring consistency when contamination is present.

This line of thinking is new relative to the usual robust time-series literature, which tends to assume either i.i.d. contamination or a fully specified model for it. The paper therefore addresses a practical gap that shows up in ARMA, state-space, and other recursive estimators common in econometrics. Credit is due for keeping the argument free of circularity and for not requiring the user to model the contamination process itself.

The main softness is that the equivalence conditions are invoked at a high level without explicit verification for standard estimators or dependence structures. Removing patches alters the effective filtration and the number of usable lags; it is not obvious that the original asymptotic expansion survives unchanged unless the conditions explicitly bound the induced dependence. The abstract supplies no derivations, no Monte Carlo checks, and no worked examples on AR(1) or GARCH-type models, so it remains unclear how often the conditions hold in practice or how sensitive the fix is to the precise form of the residual filter.

The paper is aimed at statisticians and econometricians who already use subsample or block-bootstrap methods on time series and who encounter occasional contamination. A reader working on robust estimation would find the diagnosis useful and the proposed operator worth testing, even if the current write-up is still at the level of a promising sketch. It deserves peer review because the problem is real, the proposed remedy is concrete enough to evaluate, and referees can insist on the missing checks without the paper being fundamentally misguided.

Referee Report

2 major / 2 minor

Summary. The paper claims that subsample-based estimators in dynamic time series models are generically inconsistent for the clean-data parameter, even with oracle knowledge of contamination locations, because contamination propagates through the residual filter and distorts the estimation criterion. It characterizes this as a structural incompatibility between pointwise subsampling and residual propagation, and proposes a patch removal operator that transforms index sets to restore consistency under contamination while leaving the estimator asymptotically unchanged under the uncontaminated model, under general high-level conditions. The results are said to apply to a broad class of residual-based estimators without modeling the contamination process.

Significance. If the high-level conditions hold and the patch removal operator can be shown to preserve asymptotics in standard dynamic models, this would identify a previously under-appreciated failure mode of subsampling under dependence and provide a practical fix for robust estimation in contaminated time series. The approach avoids parametric contamination modeling and targets a wide class of M-estimators, which could be valuable for applications in econometrics and signal processing where residual propagation is common.

major comments (2)

[Abstract and the section stating the high-level conditions for the patch removal operator] The central claim that subsample-based estimators are generically inconsistent rests on the assertion that contamination propagates through the residual filter in a way that distorts the criterion even after removal of contaminated points. However, the manuscript invokes 'general high-level conditions' for the patch removal operator to be asymptotically equivalent to the original estimator under the clean model without providing explicit verification or sufficient conditions for common residual-based estimators (e.g., M-estimators in ARMA or state-space models). This equivalence is load-bearing for both the inconsistency result and the proposed fix, yet the dependence structure induced by patch removal on the effective filtration is not shown to be controlled in dependent settings.
[Introduction and the section on the structural failure] The characterization of the failure as 'structural incompatibility between pointwise subsampling and residual propagation' is presented at a high level. A concrete counter-example or derivation showing how the criterion is distorted for a standard dynamic model (e.g., AR(1) with contaminated innovations) would strengthen the generic inconsistency claim; without it, the result risks depending on the unverified bridge assumptions noted in the skeptic's analysis.
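
The kind of derivation the second major comment asks for might run as follows. This is a sketch under assumptions chosen for illustration and not taken from the paper: contamination is of additive-outlier type in the observations (rather than in the innovations), the clean series is a mean-zero stationary AR(1), the indicators δ_t are i.i.d. Bernoulli(α), the magnitudes ζ_t are i.i.d. with finite second moment, both are independent of each other and of the clean series, and a law of large numbers applies to the retained sums. The symbols S and γ_0 are introduced here.

```latex
% Assumptions (illustrative):
%   x_t = \phi x_{t-1} + e_t,\ |\phi| < 1,\ E[x_t] = 0,\ \gamma_0 = \mathrm{Var}(x_t)
%   y_t = x_t + \delta_t \zeta_t  (additive outliers)
%   \delta_t \sim \text{i.i.d. Bernoulli}(\alpha),\ \zeta_t \text{ i.i.d.},\ E[\zeta_t^2] < \infty
\begin{align*}
S &= \{t : \delta_t = 0\}, \qquad
\hat\phi_S = \frac{\sum_{t \in S} y_t\, y_{t-1}}{\sum_{t \in S} y_{t-1}^2}, \\
t \in S &\;\Rightarrow\; y_t = x_t, \quad y_{t-1} = x_{t-1} + \delta_{t-1}\zeta_{t-1}, \\
E[x_t\, y_{t-1}] &= E[x_t x_{t-1}] + E[x_t]\, E[\delta_{t-1}\zeta_{t-1}] = \phi\gamma_0, \\
E[y_{t-1}^2] &= \gamma_0 + 2\,E[x_{t-1}]\, E[\delta_{t-1}\zeta_{t-1}] + E[\delta_{t-1}^2]\, E[\zeta_{t-1}^2]
              = \gamma_0 + \alpha\, E[\zeta_t^2], \\
\operatorname{plim}\, \hat\phi_S &= \frac{\phi\,\gamma_0}{\gamma_0 + \alpha\, E[\zeta_t^2]}
  \;\neq\; \phi \quad \text{whenever } \phi \neq 0 \text{ and } \alpha\, E[\zeta_t^2] > 0.
\end{align*}
```

In this setting the oracle subsample removes every contaminated y_t from the left-hand side of the regression but leaves contaminated values of y_{t-1} on the right-hand side, which produces the attenuation factor in the last line.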

minor comments (2)

Notation for the patch removal operator and the transformed index sets should be introduced with a clear definition and an illustrative example early in the paper to aid readability.
[Abstract] The abstract states that results 'do not rely on modelling the contamination process,' but the manuscript should explicitly contrast this with existing robust methods that do model contamination to clarify the contribution.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments on our manuscript. The feedback identifies key areas where additional explicit illustrations and verifications will strengthen the presentation of both the inconsistency result and the proposed patch removal operator. We respond to each major comment below and will revise the manuscript accordingly.

Referee: [Abstract and the section stating the high-level conditions for the patch removal operator] The central claim that subsample-based estimators are generically inconsistent rests on the assertion that contamination propagates through the residual filter in a way that distorts the criterion even after removal of contaminated points. However, the manuscript invokes 'general high-level conditions' for the patch removal operator to be asymptotically equivalent to the original estimator under the clean model without providing explicit verification or sufficient conditions for common residual-based estimators (e.g., M-estimators in ARMA or state-space models). This equivalence is load-bearing for both the inconsistency result and the proposed fix, yet the dependence structure induced by patch removal on the effective filtration is not shown to be controlled in dependent settings.

Authors: We agree that the high-level conditions, while intended to be broadly applicable, would benefit from explicit verification for standard models to make the results more immediately usable. In the revised manuscript we will add an appendix that verifies the conditions for M-estimators in ARMA and linear state-space models. This verification will include showing that the patch removal operator preserves the requisite dependence properties (e.g., mixing rates or martingale difference structure) on the effective filtration under the clean model, thereby confirming asymptotic equivalence. We will also update the abstract to note that the conditions are checkable for the leading classes of residual-based estimators. revision: yes
Referee: [Introduction and the section on the structural failure] The characterization of the failure as 'structural incompatibility between pointwise subsampling and residual propagation' is presented at a high level. A concrete counter-example or derivation showing how the criterion is distorted for a standard dynamic model (e.g., AR(1) with contaminated innovations) would strengthen the generic inconsistency claim; without it, the result risks depending on the unverified bridge assumptions noted in the skeptic's analysis.

Authors: We acknowledge that the structural incompatibility is currently characterized at a general level. To address this directly, the revision will expand the introduction and the section on the structural failure to include a self-contained derivation and counter-example for the AR(1) model with contaminated innovations. The example will explicitly trace how residual propagation distorts the estimation criterion even under oracle removal of contaminated observations, thereby illustrating the generic inconsistency without relying solely on the high-level bridge assumptions. revision: yes

Circularity Check

0 steps flagged

No circularity: claims rest on independent high-level conditions for patch removal equivalence

The paper's derivation chain invokes general high-level conditions ensuring the patch removal operator is asymptotically equivalent to the original estimator under the uncontaminated model while restoring consistency under contamination. No quoted equations, self-definitions, fitted parameters renamed as predictions, or self-citation chains reduce the central result to its inputs by construction. The argument is framed as applying to a broad class of residual-based estimators without modeling contamination, leaving the derivation self-contained and independent of the target inconsistency claim.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claims rest on unspecified high-level conditions for asymptotic equivalence and on the newly introduced patch removal operator; no free parameters or additional invented entities beyond the operator are mentioned.

axioms (1)

domain assumption General high-level conditions under which the patch removal operator leaves the estimator asymptotically unchanged without contamination
Invoked to guarantee the fix works while preserving the uncontaminated case.

invented entities (1)

patch removal operator (no independent evidence)
purpose: Propagation-compatible transformation of index sets that accounts for residual filter effects
Newly proposed construct to resolve the identified incompatibility between subsampling and dynamic propagation.

pith-pipeline@v0.9.0 · 5449 in / 1263 out tokens · 53719 ms · 2026-05-12T02:03:13.507401+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

Relation between the paper passage and the cited Recognition theorem.

Under AO contamination, the residual satisfies ẽ_t(ϕ) = e_t(ϕ) + π(L)δ_t ζ_t.
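
To make the quoted passage concrete, the expansion below specializes it to a first-order filter. The choice π(L) = 1 − ϕL is an assumption for illustration (an AR(1) residual filter); the passage itself does not fix the form of π(L), and e_t(ϕ) is read here as the residual computed from clean data.

```latex
% Assumption: AR(1) residual filter, \pi(L) = 1 - \phi L.
\begin{align*}
\tilde e_t(\phi) &= e_t(\phi) + \pi(L)\,\delta_t\zeta_t \\
                 &= e_t(\phi) + \delta_t\zeta_t - \phi\,\delta_{t-1}\zeta_{t-1}.
\end{align*}
% A single contaminated time s (\delta_s = 1) perturbs both \tilde e_s and \tilde e_{s+1}.
```

Under this assumption, deleting only index s leaves the term −ϕ ζ_s in the residual at s + 1, which is the propagation that pointwise removal misses. For a filter of order p the same argument would extend the affected range to s through s + p.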

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages

[1]

Abraham, B. and G. E. P. Box (1979). Bayesian analysis of some outlier problems in time series. Biometrika 66(2), 229–236.
Amemiya, T. (1985). Asymptotic properties of extremum estimators. In Advanced Econometrics, pp. 105–158. Cambridge, MA: Harvard University Press.
Andrews, D. W. K. (1993). Tests for parameter instability and structural change with unkn...

[2]

B Tables. Table 3: Total bias and RMSE for the VAR model. “Clean” denotes uncontaminated data. For the VAR model, IO yields results identical to the clean case.

Panel A: Total bias

                 T=500                               T=1000
ζ   α(%)   Clean/IO          AO                Clean/IO          AO
           κ=0     κ=1     κ=0     κ=1       κ=0     κ=1     κ=0     κ=1
5   1      0.0049  0.0049  0.1239  0.0049    0.0027  0.0028  0.1214  0.0028
    5      0.0062  0.0060  0.4112  0.0060    0.0031  0.0035  0...
