Design Stability in Adaptive Experiments: Implications for Treatment Effect Estimation
Pith reviewed 2026-05-18 05:10 UTC · model grok-4.3
The pith
Design stability ensures central limit theorems for IPW and AIPW estimators of average treatment effects in adaptive experiments.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under the condition of design stability, both the IPW estimator and the AIPW estimator for the average treatment effect are asymptotically normal in sequentially adaptive experiments, and the paper supplies explicit expressions for their asymptotic variances.
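The two estimators named in the core claim can be sketched in a few lines. This is a minimal illustration that assumes the textbook Horvitz-Thompson form of IPW and the usual augmented form of AIPW; the paper's own definitions (its equations (6) and (8)) are not reproduced on this page and may differ. The function names and the arguments `k`, `y`, `p`, `m1`, and `m0` are hypothetical.

```python
from typing import Sequence


def ipw_estimate(k: Sequence[int], y: Sequence[float], p: Sequence[float]) -> float:
    """Textbook IPW estimate of the ATE (assumed form, not taken from the paper).

    k[i] is 1 if unit i was treated and 0 otherwise, y[i] is the observed
    outcome, and p[i] is the probability with which unit i was assigned to
    treatment (known to the experimenter in an adaptive design).
    """
    total = 0.0
    for ki, yi, pi in zip(k, y, p):
        total += ki * yi / pi - (1 - ki) * yi / (1.0 - pi)
    return total / len(y)


def aipw_estimate(
    k: Sequence[int],
    y: Sequence[float],
    p: Sequence[float],
    m1: Sequence[float],
    m0: Sequence[float],
) -> float:
    """Textbook AIPW estimate (assumed form).

    m1[i] and m0[i] are outcome predictions for unit i under treatment and
    control. In a sequential experiment they would typically be built only
    from units observed before i; that choice is left to the caller here.
    """
    total = 0.0
    for ki, yi, pi, a, b in zip(k, y, p, m1, m0):
        total += (a - b) + ki * (yi - a) / pi - (1 - ki) * (yi - b) / (1.0 - pi)
    return total / len(y)
```

With `m1` and `m0` set to zero the AIPW sketch reduces to the IPW sketch, which is a quick sanity check on any implementation.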
What carries the argument
Design stability: as the number of units grows, either the assignment probabilities converge or sample averages of the inverse propensity scores and inverse complement propensity scores converge in probability to fixed non-random limits.
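The second branch of the condition can be checked numerically by tracking the running averages of 1/p_i and 1/(1-p_i). The sketch below does this for a made-up deterministic sequence that alternates between 1/3 and 2/3; the sequence and the name `running_inverse_averages` are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence, Tuple


def running_inverse_averages(p: Sequence[float]) -> List[Tuple[float, float]]:
    """Return, for each n, the averages of 1/p_i and 1/(1 - p_i) over i <= n."""
    out = []
    sum_inv = 0.0
    sum_inv_complement = 0.0
    for n, pi in enumerate(p, start=1):
        sum_inv += 1.0 / pi
        sum_inv_complement += 1.0 / (1.0 - pi)
        out.append((sum_inv / n, sum_inv_complement / n))
    return out


# Hypothetical assignment probabilities: 1/3, 2/3, 1/3, 2/3, ...
probs = [1 / 3 if i % 2 == 0 else 2 / 3 for i in range(1000)]
final_inv, final_inv_complement = running_inverse_averages(probs)[-1]
print(final_inv, final_inv_complement)
```

Here the probabilities themselves never converge, yet both running averages settle at (3 + 1.5) / 2 = 2.25, so the sequence would satisfy the averaged branch of the condition but not the first one.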
If this is right
- Consistent estimators of the asymptotic variances allow construction of valid confidence intervals for the average treatment effect.
- Both the plain IPW and the augmented IPW estimators admit central limit theorems under the same stability condition.
- The results apply directly to Wei's adaptive coin design and Efron's biased coin design.
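The two designs named in the last bullet can be simulated as follows. This is a rough sketch: Efron's rule is written with the commonly used bias 2/3, and for Wei's design the allocation function 0.5 - 0.25 * (imbalance / n) is one hypothetical choice of a decreasing, symmetric function that keeps probabilities inside [0.25, 0.75]. Neither parameter choice is confirmed by this page, and all function names are illustrative.

```python
import random
from typing import Callable, List, Tuple


def efron_probability(n_treated: int, n_control: int, q: float = 2 / 3) -> float:
    """Efron-style rule: favour whichever arm is currently behind."""
    if n_treated < n_control:
        return q
    if n_treated > n_control:
        return 1.0 - q
    return 0.5


def wei_probability(n_treated: int, n_control: int) -> float:
    """Wei-style rule: probability is a decreasing function of relative imbalance.

    The linear form below is an assumption made for this sketch.
    """
    n = n_treated + n_control
    if n == 0:
        return 0.5
    imbalance = (n_treated - n_control) / n  # lies in [-1, 1]
    return 0.5 - 0.25 * imbalance


def simulate_design(
    prob_rule: Callable[[int, int], float], n_units: int, seed: int = 0
) -> Tuple[List[int], List[float]]:
    """Assign units one at a time; return assignments and the probabilities used."""
    rng = random.Random(seed)
    n_treated = n_control = 0
    assignments, probs = [], []
    for _ in range(n_units):
        p = prob_rule(n_treated, n_control)
        k = 1 if rng.random() < p else 0
        n_treated += k
        n_control += 1 - k
        assignments.append(k)
        probs.append(p)
    return assignments, probs


_, wei_probs = simulate_design(wei_probability, 5000, seed=1)
_, efron_probs = simulate_design(efron_probability, 5000, seed=1)
efron_mean_inverse = sum(1.0 / p for p in efron_probs) / len(efron_probs)
print(wei_probs[-1], efron_mean_inverse)
```

In the Wei-style run the probability drifts toward 1/2 as the arms balance out, while in the Efron-style run the probability keeps jumping among three values and only its averaged inverse can settle down.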
Where Pith is reading between the lines
- Many practical sequential experiments may satisfy design stability, which would let researchers retain reliable large-sample inference while still using adaptive assignment.
- The same stability lens could be applied to other causal estimators or to settings with time-varying treatments.
Load-bearing premise
As the number of experimental units increases, either the treatment assignment probabilities converge, or the sample averages of the inverse propensity scores and of the inverse complement propensity scores converge in probability to fixed, non-random constants.
What would settle it
Run an adaptive experiment in which the sample averages of the inverse propensity scores fail to converge to any fixed limit and check whether the IPW estimator remains asymptotically normal with the claimed variance.
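One hypothetical way to build such an experiment is to let a single coin flip at the start fix the assignment probability at either 0.2 or 0.8 for the whole run, so the average of 1/p_i has a random limit rather than a fixed one. The sketch below sets this up with a made-up outcome model (unit-variance Gaussian outcomes and a true effect of 1); none of these choices comes from the paper, and the code makes no claim about what the resulting distribution looks like.

```python
import random
from typing import List, Tuple


def unstable_replication(n_units: int, rng: random.Random) -> Tuple[float, float]:
    """Run one experiment; return (average of 1/p_i, IPW estimate)."""
    # The probability is drawn once and never averaged out across units.
    p = 0.2 if rng.random() < 0.5 else 0.8
    inv_sum = 0.0
    ipw_sum = 0.0
    for _ in range(n_units):
        y1 = 1.0 + rng.gauss(0.0, 1.0)  # hypothetical outcome under treatment
        y0 = rng.gauss(0.0, 1.0)        # hypothetical outcome under control
        k = 1 if rng.random() < p else 0
        y = y1 if k else y0
        inv_sum += 1.0 / p
        ipw_sum += k * y / p - (1 - k) * y / (1.0 - p)
    return inv_sum / n_units, ipw_sum / n_units


rng = random.Random(0)
results: List[Tuple[float, float]] = [unstable_replication(500, rng) for _ in range(200)]
limits_seen = sorted({round(avg_inv, 6) for avg_inv, _ in results})
ipw_estimates = [estimate for _, estimate in results]
print(limits_seen)
```

Across replications the average of 1/p_i equals 5 in some runs and 1.25 in others, so no single non-random limit exists. Plotting `ipw_estimates`, or comparing their spread within each group, is then one way to probe whether a single normal approximation still fits.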
Original abstract
We study the problem of estimating the average treatment effect (ATE) under sequentially adaptive treatment assignment mechanisms. In contrast to classical completely randomized designs, we consider a setting in which the probability of assigning treatment to each experimental unit may depend on prior assignments and observed outcomes. Within the potential outcomes framework, we propose and analyze two natural estimators for the ATE: the inverse propensity weighted (IPW) estimator and an augmented IPW (AIPW) estimator. The cornerstone of our analysis is the concept of design stability, which requires that as the number of units grows, either the assignment probabilities converge, or sample averages of the inverse propensity scores and of the inverse complement propensity scores converge in probability to fixed, non-random limits. Our main results establish central limit theorems for both the IPW and AIPW estimators under design stability and provide explicit expressions for their asymptotic variances. We further propose estimators for these variances, enabling the construction of asymptotically valid confidence intervals. Finally, we illustrate our theoretical results in the context of Wei's adaptive coin design and Efron's biased coin design, highlighting the applicability of the proposed methods to sequential experimentation with adaptive randomization.
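The confidence-interval step mentioned in the abstract can be sketched as a standard Wald interval. The snippet assumes the scaling sqrt(N) * (estimate - ATE) converging to a normal law with variance V, and it takes a variance estimate as an input rather than computing one, since the paper's variance estimators are not given on this page. The name `wald_interval` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def wald_interval(
    estimate: float, variance_hat: float, n_units: int, level: float = 0.95
) -> Tuple[float, float]:
    """Two-sided Wald interval, assuming sqrt(N) * (estimate - ATE) ~ N(0, V).

    variance_hat is any consistent estimate of V supplied by the caller.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    if variance_hat < 0.0 or n_units <= 0:
        raise ValueError("variance_hat must be non-negative and n_units positive")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    half_width = z * sqrt(variance_hat / n_units)
    return estimate - half_width, estimate + half_width


lower, upper = wald_interval(estimate=1.0, variance_hat=4.0, n_units=100)
print(lower, upper)
```

Keeping the variance estimate as an argument means the same helper works for either the IPW or the AIPW estimator once a suitable estimate of its asymptotic variance is available.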
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies ATE estimation under sequentially adaptive treatment assignment in the potential outcomes framework. It introduces the design stability condition (convergence of the assignment probabilities, or of the sample averages of the inverse propensity scores and inverse complement propensity scores, to non-random limits) and establishes CLTs for the IPW and AIPW estimators with explicit asymptotic variance expressions. Variance estimators are proposed for constructing confidence intervals, and the results are illustrated on Wei's adaptive coin design and Efron's biased coin design.
Significance. If design stability holds, the explicit CLTs and variance formulas provide a practical route to asymptotically valid inference in adaptive experiments, where dependence induced by sequential adaptation typically complicates standard arguments. The verification for two canonical adaptive designs and the proposal of variance estimators are concrete strengths that enhance applicability.
major comments (2)
- §3 (Main Results), Theorem 1: The martingale CLT is applied to the triangular array of IPW terms after invoking design stability to obtain convergence of the conditional variances; however, the argument does not explicitly verify the Lindeberg condition, which is load-bearing for the CLT conclusion and should be checked or sketched.
- §4 (AIPW estimator), Equation (12): The asymptotic variance expression for the AIPW estimator subtracts the augmentation term, but the proof sketch does not quantify the rate at which the cross term vanishes under design stability; this affects whether the variance reduction is asymptotically strict or only o_p(1).
minor comments (3)
- Notation: The propensity scores and their complements are denoted p_i and 1-p_i, but the limiting values have no consistent symbol of their own; introducing separate notation for the design-stability limits would improve readability.
- §5 (Examples): The verification that design stability holds for Efron's biased coin is given only in probability; adding a brief remark on almost-sure convergence (if available) would strengthen the illustration.
- References: The manuscript cites the classical martingale CLT but omits a recent reference on adaptive designs with similar stability conditions; adding one or two such citations would contextualize the contribution.
Simulated Author's Rebuttal
We thank the referee for the positive evaluation of our manuscript and for the constructive comments. We address each major comment below and will incorporate the suggested clarifications into the revised version.
- Referee: §3 (Main Results), Theorem 1: The martingale CLT is applied to the triangular array of IPW terms after invoking design stability to obtain convergence of the conditional variances; however, the argument does not explicitly verify the Lindeberg condition, which is load-bearing for the CLT conclusion and should be checked or sketched.
  Authors: We appreciate the referee highlighting this point. Design stability ensures convergence of the conditional variances to a non-random positive limit. Under the paper's maintained assumptions that potential outcomes are bounded and propensity scores are bounded away from 0 and 1, each term in the triangular array is uniformly bounded by a constant independent of n. Consequently the Lindeberg condition holds automatically. We will add an explicit verification of this fact to the proof of Theorem 1 in the revision.
  Revision: yes
- Referee: §4 (AIPW estimator), Equation (12): The asymptotic variance expression for the AIPW estimator subtracts the augmentation term, but the proof sketch does not quantify the rate at which the cross term vanishes under design stability; this affects whether the variance reduction is asymptotically strict or only o_p(1).
  Authors: We thank the referee for this observation. In the current proof sketch we establish that the cross term between the IPW component and the augmentation is o_p(1) under design stability, yielding the stated asymptotic variance formula. A more precise argument shows that design stability implies the cross term is actually O_p(n^{-1/2}), which guarantees that the asymptotic variance reduction is strict. We will expand the proof sketch to include this rate calculation.
  Revision: yes
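The boundedness argument in the first response can be spelled out as follows. This is a sketch under the assumptions the authors cite (outcomes bounded by some M, propensity scores in [δ, 1-δ]); the symbols ξ_{N,i}, X_i, C, and F_{i-1} are notation introduced here for illustration and may not match the paper's. If X_i is taken to be the IPW term for unit i minus Y_i(1) - Y_i(0), then C = M/δ + 2M is one valid bound.

```latex
% Scaled martingale differences with uniformly bounded summands
\xi_{N,i} = \frac{X_i}{\sqrt{N}}, \qquad |X_i| \le C
\quad\Longrightarrow\quad |\xi_{N,i}| \le \frac{C}{\sqrt{N}} .

% For any fixed eps > 0, the truncation event is empty once N is large
\text{For every } \varepsilon > 0 \text{ and all } N > C^2/\varepsilon^2 :\qquad
\mathbf{1}\{ |\xi_{N,i}| > \varepsilon \} = 0 \quad \text{for all } i \le N .

% Hence the conditional Lindeberg sum vanishes identically for large N
\sum_{i=1}^{N} \mathbb{E}\!\left[ \xi_{N,i}^{2}\, \mathbf{1}\{ |\xi_{N,i}| > \varepsilon \}
\,\middle|\, \mathcal{F}_{i-1} \right] = 0 .
```

The sum is exactly zero for large N rather than merely small, which is why uniform boundedness makes the Lindeberg condition automatic.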
Circularity Check
No significant circularity; derivations rest on external design stability assumption
The paper posits design stability as a primitive assumption (convergence in probability of the assignment probabilities, or of the sample averages of the inverse propensities, to non-random limits) and derives CLTs for the IPW and AIPW estimators under that assumption using standard martingale arguments in the potential outcomes framework. The assumption is independently verified for Wei's adaptive coin design and Efron's biased coin design, but this verification is not load-bearing for the main theorems and does not reduce any result to a fitted quantity or self-citation by construction. No quoted step equates a derived quantity to its own inputs; the analysis remains self-contained against the stated external condition and classical statistical tools.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Potential outcomes framework for defining ATE
- ad hoc to paper: Design stability condition on assignment probabilities
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean (J-uniqueness); IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction; washburn_uniqueness_aczel
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Definition 2 (Strong design stability): p_i → p* in probability; Definition 3 (Weak): averages of 1/p_i and 1/(1-p_i) converge to fixed limits; Theorems 1/4 establish CLTs under these for IPW/AIPW with explicit V_IPW, V_AIPW.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Jerzy Neyman. “On the application of probability theory to agricultural experiments. Essay on principles”. In: Statistical Science 5.4 (1923). Reprinted from Roczniki Nauk Rolniczych, 1923, pp. 465–480.
- [2] L J Wei. “The adaptive biased coin design for sequential experiments”. In: Ann. Stat. 6.1 (Jan. 1978), pp. 92–100.
- [3] Bradley Efron. “Forcing a sequential experiment to be balanced”. In: Biometrika 58.3 (1971), pp. 403–417.
- [4] Xinpeng Shen et al. “Challenges and opportunities with causal discovery algorithms: Application to Alzheimer’s pathophysiology”. In: Sci. Rep. 10.1 (Feb. 2020), p. 2975.
- [5] David Kaplan. “Causal inference with large-scale assessments in education from a Bayesian perspective: a review and synthesis”. In: Large Scale Assess. Educ. 4.1 (Dec. 2016).
- [6] Amy Finkelstein and Nathaniel Hendren. “Welfare analysis meets causal inference”. In: J. Econ. Perspect. 34.4 (Nov. 2020), pp. 146–167.
- [7] Guido W Imbens. “Nonparametric estimation of average treatment effects under exogeneity: A review”. In: Rev. Econ. Stat. 86.1 (Feb. 2004), pp. 4–29.
- [8] Chengchun Shi et al. “Dynamic causal effects evaluation in A/B testing with a reinforcement learning framework”. In: J. Am. Stat. Assoc. 118.543 (July 2023), pp. 2059–2071.
- [9]
- [10] Aad W van der Vaart. Asymptotic Statistics. Cambridge University Press, 2000.
- [11]
- [12] William G Cochran. Sampling Techniques. 3rd ed. Wiley, 1977.
- [13] Paul R Rosenbaum. Observational Studies. 2nd ed. Springer, 2002.
- [14] Guido W Imbens and Donald B Rubin. Causal Inference for Statistics, Social, and Biomedical Sciences. Cambridge University Press, 2015.
- [15] William G Madow. “On the Limiting Distributions of Estimates Based on Samples from Finite Universes”. In: Annals of Mathematical Statistics 19.4 (1948), pp. 535–545.
- [16] Paul Erdős and Alfréd Rényi. “On the Central Limit Theorem for Samples from a Finite Population”. In: Publication of the Mathematical Institute of the Hungarian Academy of Sciences 4 (1959), pp. 49–61.
- [17] Jaroslav Hájek. “Limiting Distributions in Simple Random Sampling from a Finite Population”. In: Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5 (1960), pp. 361–374.
- [18] Erich L Lehmann. Nonparametrics: Statistical Methods Based on Ranks. Holden-Day, 1975.
- [19] Abraham Wald. “On Cumulative Sums of Random Variables”. In: Annals of Mathematical Statistics 15.3 (1944), pp. 283–296.
- [20] Gottfried E Noether. “On a theorem by Wald and Wolfowitz”. In: Ann. Math. Stat. 20.3 (Sept. 1949), pp. 455–458.
- [21] D A S Fraser. “A vector form of the Wald-Wolfowitz-Hoeffding theorem”. In: Ann. Math. Stat. 27.2 (June 1956), pp. 540–543.
- [22] Jaroslav Hájek. “Some Extensions of the Wald–Wolfowitz–Noether Theorem”. In: Annals of Mathematical Statistics 32.2 (1961), pp. 506–523.
- [23] Wassily Hoeffding. “Probability inequalities for sums of bounded random variables”. In: J. Am. Stat. Assoc. 58.301 (Mar. 1963), pp. 13–30.
- [24] R. G. Miller and Pranab Kumar Sen. “Weak Convergence of U-Statistics and Von Mises’ Differentiable Statistical Functions”. In: Ann. Math. Statist. 43.6 (1972), pp. 31–41. url: http://dml.mathdoc.fr/item/1177692698
- [25] Lan Liu and Michael G Hudgens. “Large sample randomization inference of causal effects in the presence of interference”. In: J. Am. Stat. Assoc. 109.505 (Jan. 2014), pp. 288–301.
- [26] Peng Ding and Tirthankar Dasgupta. “A randomization-based perspective on analysis of variance: a test statistic robust to treatment effect heterogeneity”. In: Biometrika 105.1 (Mar. 2018), pp. 45–56.
- [27] Colin B. Fogarty. “On Mitigating the Analytical Limitations of Finely Stratified Experiments”. In: Journal of the Royal Statistical Society Series B: Statistical Methodology 80.5 (Aug. 2018), pp. 1035–1056. issn: 1369-7412. doi: 10.1111/rssb.12290. eprint: https://academic.oup.com/jrsssb/article-pdf/80/5/1035/49269533/jrsssb_80_5_1035.pdf. url: https://doi.or...
- [28] Xinran Li and Peng Ding. “General forms of finite population central limit theorems with applications to causal inference”. In: Journal of the American Statistical Association 112.520 (2017), pp. 1759–1769.
- [29] William F Rosenberger and John M Lachin. Randomization in Clinical Trials: Theory and Practice. Wiley, 2016.
- [30] Ramesh Johari et al. “Always Valid Inference: Continuous Monitoring of A/B Tests”. In: Operations Research 70 (Aug. 2021). doi: 10.1287/opre.2021.2135.
- [31] S. Athey and G.W. Imbens. “Chapter 3 - The Econometrics of Randomized Experiments”. In: Handbook of Field Experiments. Ed. by Abhijit Vinayak Banerjee and Esther Duflo. Vol. 1. Handbook of Economic Field Experiments. North-Holland, 2017, pp. 73–140. doi: https://doi.org/10.1016/bs.hefe.2016.10.003. url: https://www.sciencedirect.com/science/article/pii/S221...
- [32] P Hall and C C Heyde. “The Central Limit Theorem”. In: Martingale Limit Theory and its Application. Elsevier, 1980, pp. 51–96.
- [33] Masahiro Kato et al. “Efficient adaptive experimental design for average treatment effect estimation”. In: (2020). eprint: 2002.05308 (stat.ML).
- [34] Thomas Cook, Alan Mishler, and Aaditya Ramdas. “Semiparametric Efficient Inference in Adaptive Experiments”. In: Proceedings of the Third Conference on Causal Learning and Reasoning. Ed. by Francesco Locatello and Vanessa Didelez. Vol. 236. Proceedings of Machine Learning Research. PMLR, Jan. 2024, pp. 1033–1064. url: https://proceedings.mlr.press/v2...
- [35] James M Robins, Andrea Rotnitzky, and Lue Ping Zhao. “Estimation of Regression Coefficients When Some Regressors are not Always Observed”. In: Journal of the American Statistical Association 89.427 (1994), pp. 846–866.
- [36] Donald B Rubin. “Estimating causal effects of treatments in randomized and nonrandomized studies”. In: J. Educ. Psychol. 66.5 (Oct. 1974), pp. 688–701.
- [37] Daniel G Horvitz and Donovan J Thompson. “A Generalization of Sampling Without Replacement From a Finite Universe”. In: Journal of the American Statistical Association 47.260 (1952), pp. 663–685.
- [38] Anastasios A Tsiatis. Semiparametric Theory and Missing Data. Springer, 2006.
- [39] F G Foster. “On the stochastic matrices associated with certain queuing processes”. In: Ann. Math. Stat. 24.3 (Sept. 1953), pp. 355–360.
- [40] C. Morris. “A finite selection model for experimental design of the health insurance study”. In: Journal of Econometrics 11 (1979), pp. 43–61.
- [41] Kari L Morgan and Donald B Rubin. “Rerandomization to Improve Covariate Balance in Experiments”. In: Annals of Statistics 40.2 (2012), pp. 1263–1282.
- [42] Arun Ravichandran et al. In: Journal of Causal Inference 12.1 (2024), p. 20230046. doi: 10.1515/jci-2023-0046. url: https://doi.org/10.1515/jci-2023-0046.
7 Proofs of Theorems
In this section, we collect the proofs of our main Theorems 1-6. We begin by recalling the IPW and AIPW estimators introduced in (6) and (8), respectively. Before proceeding to th...
- [43] Before doing this, we first show that
  $\frac{N_1}{N} = \frac{1}{N}\sum_{i=1}^{N} K_i \xrightarrow{p} p^\star$. (41)
  We decompose $\frac{1}{N}\sum_{i=1}^{N} K_i = \frac{1}{N}\sum_{i=1}^{N} (K_i - p_i) + \frac{1}{N}\sum_{i=1}^{N} p_i$. Under a strongly stable design, since $p_i \xrightarrow{p} p^\star$, the second term, being the Cesàro mean of the sequence $\{p_i\}_{i \ge 1}$, also converges in probability to $p^\star$. Hence, it remains to show that
  $\frac{1}{N}\sum_{i=1}^{N} (K_i - p_i) \xrightarrow{p} 0$. (42)
  S...
- [44] Therefore, Wei’s adaptive coin design satisfies strong design stability with limiting inclusion probability $p^\star = \frac{1}{2}$.
  8.2 Proof of Lemma 2
  We begin by showing that Efron’s biased coin design [3] satisfies weak stability. Suppose a total of $k$ units have been assigned to treatment or control. Let $m_k$ and $n_k$ denote, respectively, the number of units assigned t...
- [45] $\frac{1}{i-1}\sum_{j=1}^{i-1} \frac{(K_j - p_j)\,Y_j(1)}{p_j}$. Since E
  Next, we show that $B_N \to 0$. Fix $\varepsilon > 0$. By Assumption 2(c), $\bar{Y}_N(1) \to \bar{Y}_1$, so there exists $K \in \mathbb{N}$ such that for all $i \ge K+1$, $\left| \bar{Y}_N(1) - Y_{i-1}(1) \right| \le 2\varepsilon$. Using the boundedness of $Y_i(1)$ (Assumption 2(b)), we can decompose $B_N$ as
  $B_N = \frac{1}{N}\sum_{i=1}^{K} \left( \bar{Y}_N(1) - Y_{i-1}(1) \right)^2 + \frac{1}{N}\sum_{i=K+1}^{N} \left( \bar{Y}_N(1) - Y_{i-1}(1) \right)^2$.
  The first term is bounded by $\frac{4KM^2}{N}$ and the second by $4\varepsilon^2$, yi...