Design Stability in Adaptive Experiments: Implications for Treatment Effect Estimation

Koulik Khamaru; Saikat Sengupta; Suvrojit Ghosh; Tirthankar Dasgupta

arxiv: 2510.22351 · v2 · submitted 2025-10-25 · math.ST · stat.ME · stat.ML · stat.TH


Pith reviewed 2026-05-18 05:10 UTC · model grok-4.3

classification math.ST · stat.ME · stat.ML · stat.TH

keywords adaptive experiments · average treatment effect · design stability · inverse propensity weighting · augmented IPW · central limit theorem · sequential randomization


The pith

Design stability ensures central limit theorems for IPW and AIPW estimators of average treatment effects in adaptive experiments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how to estimate the average treatment effect when each unit's treatment assignment can depend on previous assignments and outcomes. It defines design stability as the requirement that assignment probabilities converge or that averages of the inverse propensity scores and their complements converge in probability to fixed constants as the sample grows. Under this condition the inverse propensity weighted estimator and the augmented version both obey central limit theorems, with explicit formulas for the limiting variances. The paper also constructs consistent variance estimators that support asymptotically valid confidence intervals. The theory is illustrated on two standard adaptive randomization procedures.

Core claim

Under the condition of design stability, both the IPW estimator and the AIPW estimator for the average treatment effect are asymptotically normal in sequentially adaptive experiments, and the paper supplies explicit expressions for their asymptotic variances.
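To make the two estimators concrete, here is a minimal sketch, not the paper's code. The function names `ipw_estimate` and `aipw_estimate` are hypothetical. `K` is the 0/1 treatment indicator, `Y` the observed outcome, and `p` the probability of treatment given the earlier assignments and outcomes. The arrays `m1` and `m0` are assumed outcome predictions for each arm, built only from earlier units. The AIPW form below is the standard textbook one and may differ in detail from the paper's exact definition.

```python
import numpy as np

def ipw_estimate(K, Y, p):
    """Inverse propensity weighted estimate of the average treatment effect."""
    K, Y, p = (np.asarray(a, dtype=float) for a in (K, Y, p))
    # Treated units are weighted by 1/p_i, control units by 1/(1 - p_i).
    return float(np.mean(K * Y / p - (1.0 - K) * Y / (1.0 - p)))

def aipw_estimate(K, Y, p, m1, m0):
    """Augmented IPW: predicted contrast plus weighted residual corrections.

    m1, m0 are ASSUMED per-unit predictions of the treated and control
    outcomes; they must not use unit i's own outcome.
    """
    K, Y, p, m1, m0 = (np.asarray(a, dtype=float) for a in (K, Y, p, m1, m0))
    treated_correction = K * (Y - m1) / p
    control_correction = (1.0 - K) * (Y - m0) / (1.0 - p)
    return float(np.mean(m1 - m0 + treated_correction - control_correction))

# Tiny hand-checkable example: one treated unit, one control unit.
K = [1, 0]
Y = [2.0, 1.0]
p = [0.5, 0.5]
tau_ipw = ipw_estimate(K, Y, p)
tau_aipw = aipw_estimate(K, Y, p, m1=[2.0, 2.0], m0=[1.0, 1.0])
```

With all-zero predictions the AIPW sketch reduces to the IPW sketch, which is a quick sanity check on any implementation.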

What carries the argument

Design stability: as the number of units grows, either the assignment probabilities converge or sample averages of the inverse propensity scores and inverse complement propensity scores converge in probability to fixed non-random limits.
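In symbols, the two alternatives can be sketched as follows. Here p_i is the probability that unit i is treated given the earlier assignments and outcomes, and N is the number of units. The limit symbols a_1 and a_0 are labels chosen here for illustration and are not taken from the paper.

```latex
% Strong design stability: the assignment probabilities themselves settle down.
p_i \xrightarrow{\;p\;} p^{\star} \quad \text{as } i \to \infty .

% Weak design stability: only the averaged inverse propensities settle down.
\frac{1}{N}\sum_{i=1}^{N} \frac{1}{p_i} \xrightarrow{\;p\;} a_1 ,
\qquad
\frac{1}{N}\sum_{i=1}^{N} \frac{1}{1-p_i} \xrightarrow{\;p\;} a_0 ,
\quad \text{as } N \to \infty .
```

When the p_i stay bounded away from 0 and 1, the first condition implies the second with a_1 = 1/p^⋆ and a_0 = 1/(1-p^⋆), which is consistent with the strong and weak naming.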

If this is right

Consistent estimators of the asymptotic variances allow construction of valid confidence intervals for the average treatment effect.
Both the plain IPW and the augmented IPW estimators admit central limit theorems under the same stability condition.
The results apply directly to Wei's adaptive coin design and Efron's biased coin design.
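The two designs named above can be simulated in a few lines to see what stability looks like in practice. This is a sketch under stated assumptions: the function names are hypothetical, Efron's rule uses a bias of 2/3, and Wei's rule uses the linear allocation function f(x) = (1 - x)/2, which is one common choice and not necessarily the one analyzed in the paper.

```python
import numpy as np

def efron_design(N, bias=2/3, seed=0):
    """Efron's biased coin: favour whichever arm is currently behind."""
    rng = np.random.default_rng(seed)
    p = np.empty(N)
    K = np.empty(N, dtype=int)
    diff = 0  # treated count minus control count so far
    for i in range(N):
        if diff == 0:
            p[i] = 0.5
        elif diff > 0:
            p[i] = 1.0 - bias   # treatment is ahead, so make it less likely
        else:
            p[i] = bias         # treatment is behind, so make it more likely
        K[i] = rng.random() < p[i]
        diff += 1 if K[i] == 1 else -1
    return p, K

def wei_design(N, seed=0):
    """Wei's adaptive coin with the ASSUMED allocation f(x) = (1 - x) / 2."""
    rng = np.random.default_rng(seed)
    p = np.empty(N)
    K = np.empty(N, dtype=int)
    diff = 0
    for i in range(N):
        imbalance = diff / i if i > 0 else 0.0  # relative imbalance in [-1, 1]
        # Note: this f gives p = 0 or 1 for the second unit, so inverse
        # weights would need a variant that keeps p away from 0 and 1.
        p[i] = (1.0 - imbalance) / 2.0
        K[i] = rng.random() < p[i]
        diff += 1 if K[i] == 1 else -1
    return p, K

N = 20000
p_wei, _ = wei_design(N)
p_efron, _ = efron_design(N)

# Wei: the probabilities themselves approach 1/2.
wei_tail_gap = float(np.max(np.abs(p_wei[N // 2:] - 0.5)))
# Efron: the probabilities keep jumping between three values ...
efron_values = sorted(set(np.round(p_efron, 4).tolist()))
# ... but the running average of 1/p_i settles down.
efron_avg_inv_half = float(np.mean(1.0 / p_efron[: N // 2]))
efron_avg_inv_full = float(np.mean(1.0 / p_efron))
```

In this sketch Wei's probabilities converge, while Efron's never do and only their averaged inverses stabilize, which is the distinction between the two forms of the condition.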

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Many practical sequential experiments may satisfy design stability, which would let researchers retain reliable large-sample inference while still using adaptive assignment.
The same stability lens could be applied to other causal estimators or to settings with time-varying treatments.

Load-bearing premise

As the number of experimental units increases, the treatment assignment probabilities either converge or the sample averages of the inverse propensity scores converge in probability to fixed constants.

What would settle it

Run an adaptive experiment in which the sample averages of the inverse propensity scores fail to converge to any fixed limit and check whether the IPW estimator remains asymptotically normal with the claimed variance.
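One way to build such an experiment is sketched below. The design is hypothetical and chosen only because it is easy to analyze: the first unit's assignment fixes the treatment probability of every later unit at 0.2 or 0.8. The average of 1/p_i then converges to a limit that depends on that first coin flip, so it is random rather than fixed. The potential outcomes are toy values, and `first_unit_locked_design` and `ipw` are names introduced here.

```python
import numpy as np

def first_unit_locked_design(N, rng):
    """Hypothetical design: the first assignment fixes all later probabilities."""
    p = np.empty(N)
    K = np.empty(N, dtype=int)
    p[0] = 0.5
    K[0] = rng.random() < p[0]
    p[1:] = 0.2 if K[0] == 1 else 0.8
    K[1:] = rng.random(N - 1) < p[1:]
    return p, K

def ipw(K, Y, p):
    return float(np.mean(K * Y / p - (1 - K) * Y / (1 - p)))

N, reps = 5000, 200
rng = np.random.default_rng(1)
Y1 = np.full(N, 2.0)   # toy potential outcomes under treatment
Y0 = np.zeros(N)       # toy potential outcomes under control
tau = float(np.mean(Y1 - Y0))

avg_inv_p, scaled_err = [], []
for _ in range(reps):
    p, K = first_unit_locked_design(N, rng)
    Y = np.where(K == 1, Y1, Y0)          # observed outcome
    avg_inv_p.append(round(float(np.mean(1.0 / p)), 2))
    scaled_err.append(np.sqrt(N) * (ipw(K, Y, p) - tau))

avg_inv_p = np.array(avg_inv_p)
scaled_err = np.array(scaled_err)

# The averaged inverse propensity lands on one of two values, run by run.
distinct_limits = sorted(set(avg_inv_p.tolist()))
# The spread of the scaled IPW error depends on which value was hit.
sd_when_p_small = float(scaled_err[avg_inv_p == 5.0].std())
sd_when_p_large = float(scaled_err[avg_inv_p == 1.25].std())
```

In this toy setup the scaled error has a clearly different spread in the two branches, so no single fixed variance describes its limit. Comparing a histogram of `scaled_err` against a single normal curve would be the natural next check.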

Figures

Figures reproduced from arXiv: 2510.22351 by Koulik Khamaru, Saikat Sengupta, Suvrojit Ghosh, Tirthankar Dasgupta.

**Figure 2.** Comparison of the average lengths of confidence intervals for Wei’s design.

**Figure 3.** Comparison of the theoretical and empirical coverages for Efron’s design.

**Figure 4.** Comparison of the average lengths of confidence intervals for Efron’s design.

Original abstract

We study the problem of estimating the average treatment effect (ATE) under sequentially adaptive treatment assignment mechanisms. In contrast to classical completely randomized designs, we consider a setting in which the probability of assigning treatment to each experimental unit may depend on prior assignments and observed outcomes. Within the potential outcomes framework, we propose and analyze two natural estimators for the ATE: the inverse propensity weighted (IPW) estimator and an augmented IPW (AIPW) estimator. The cornerstone of our analysis is the concept of design stability, which requires that as the number of units grows, either the assignment probabilities converge, or sample averages of the inverse propensity scores and of the inverse complement propensity scores converge in probability to fixed, non-random limits. Our main results establish central limit theorems for both the IPW and AIPW estimators under design stability and provide explicit expressions for their asymptotic variances. We further propose estimators for these variances, enabling the construction of asymptotically valid confidence intervals. Finally, we illustrate our theoretical results in the context of Wei's adaptive coin design and Efron's biased coin design, highlighting the applicability of the proposed methods to sequential experimentation with adaptive randomization.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a usable sufficient condition called design stability that justifies CLTs for IPW and AIPW estimators under sequential adaptive assignment.


The main point is that this paper defines design stability as a condition under which the usual IPW and AIPW estimators of the average treatment effect remain asymptotically normal, even when treatment probabilities adapt to earlier outcomes. The authors prove central limit theorems with explicit variance formulas and show how to estimate those variances for confidence intervals. They also check that the condition holds for Wei's adaptive coin design and Efron's biased coin design, two standard adaptive schemes.

The derivations rely on martingale CLT arguments to handle the dependence created by adaptation, which keeps things straightforward within the potential outcomes setup. This is a direct extension of fixed-design results to the adaptive case without introducing circularity or post-selection issues. The work is technically clean on its own terms and gives concrete variance expressions that practitioners could use.

One limitation is that design stability still needs to be verified for any new adaptive rule, so it is not automatic for every possible mechanism. The paper handles the two canonical examples well, but broader applicability depends on how easy the condition is to check in other settings.

This is aimed at statisticians working on causal inference for adaptive experiments, such as clinical trials or sequential A/B testing. Readers who need rigorous justification for inference after dependent randomization will find the explicit results useful. The paper shows clear engagement with the relevant literature, and the math holds up without obvious gaps. It deserves peer review because the condition is verifiable and the results are precise enough to be worth referee attention.

Referee Report

2 major / 3 minor

Summary. The paper studies ATE estimation under sequentially adaptive treatment assignment in the potential outcomes framework. It introduces the design stability condition (convergence of the assignment probabilities, or of sample averages of the inverse propensities, to non-random limits) and establishes CLTs for the IPW and AIPW estimators with explicit asymptotic variance expressions. Variance estimators are proposed for confidence intervals, and the results are verified for Wei's adaptive coin design and Efron's biased coin design.

Significance. If design stability holds, the explicit CLTs and variance formulas provide a practical route to asymptotically valid inference in adaptive experiments, where dependence induced by sequential adaptation typically complicates standard arguments. The verification for two canonical adaptive designs and the proposal of variance estimators are concrete strengths that enhance applicability.

major comments (2)

§3 (Main Results), Theorem 1: The martingale CLT is applied to the triangular array of IPW terms after invoking design stability to obtain convergence of the conditional variances; however, the argument does not explicitly verify the Lindeberg condition, which is load-bearing for the CLT conclusion and should be checked or sketched.
§4 (AIPW estimator), Equation (12): The asymptotic variance expression for the AIPW estimator subtracts the augmentation term, but the proof sketch does not quantify the rate at which the cross term vanishes under design stability; this affects whether the variance reduction is asymptotically strict or only o_p(1).

minor comments (3)

Notation: The propensity scores are denoted p_i and 1-p_i, but the limits of their averaged inverses have no consistent symbol; introducing a separate symbol for the design-stability limits would improve readability.
§5 (Examples): The verification that design stability holds for Efron's biased coin is given only in probability; adding a brief remark on almost-sure convergence (if available) would strengthen the illustration.
References: The manuscript cites the classical martingale CLT but omits a recent reference on adaptive designs with similar stability conditions; adding one or two such citations would contextualize the contribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive evaluation of our manuscript and for the constructive comments. We address each major comment below and will incorporate the suggested clarifications into the revised version.


Referee: §3 (Main Results), Theorem 1: The martingale CLT is applied to the triangular array of IPW terms after invoking design stability to obtain convergence of the conditional variances; however, the argument does not explicitly verify the Lindeberg condition, which is load-bearing for the CLT conclusion and should be checked or sketched.

Authors: We appreciate the referee highlighting this point. Design stability ensures convergence of the conditional variances to a non-random positive limit. Under the paper's maintained assumptions that potential outcomes are bounded and propensity scores are bounded away from 0 and 1, each term in the triangular array is uniformly bounded by a constant independent of n. Consequently the Lindeberg condition holds automatically. We will add an explicit verification of this fact to the proof of Theorem 1 in the revision.

revision: yes

Referee: §4 (AIPW estimator), Equation (12): The asymptotic variance expression for the AIPW estimator subtracts the augmentation term, but the proof sketch does not quantify the rate at which the cross term vanishes under design stability; this affects whether the variance reduction is asymptotically strict or only o_p(1).

Authors: We thank the referee for this observation. In the current proof sketch we establish that the cross term between the IPW component and the augmentation is o_p(1) under design stability, yielding the stated asymptotic variance formula. A more precise argument shows that design stability implies the cross term is actually O_p(n^{-1/2}), which guarantees that the asymptotic variance reduction is strict. We will expand the proof sketch to include this rate calculation.

revision: yes
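The boundedness argument in the first response can be written out in a few lines. This is a sketch, not the authors' proof: the bound M on the potential outcomes, the margin δ on the propensity scores, and the filtration notation are labels assumed here, and N plays the role of the n used in the response.

```latex
% Assumptions (labels chosen here): |Y_i(1)|, |Y_i(0)| \le M and \delta \le p_i \le 1-\delta.
\psi_i = \frac{K_i Y_i}{p_i} - \frac{(1-K_i)\,Y_i}{1-p_i},
\qquad
\xi_{N,i} = \frac{1}{\sqrt{N}}\bigl(\psi_i - \mathbb{E}[\psi_i \mid \mathcal{F}_{i-1}]\bigr).

% Exactly one of the two fractions in \psi_i is nonzero, so
|\psi_i| \le \frac{M}{\delta}
\quad\Longrightarrow\quad
|\xi_{N,i}| \le \frac{2M}{\delta\sqrt{N}} .

% Hence, for any fixed \varepsilon > 0 and all N > (2M/(\delta\varepsilon))^2,
\sum_{i=1}^{N} \mathbb{E}\bigl[\xi_{N,i}^{2}\,\mathbf{1}\{|\xi_{N,i}| > \varepsilon\} \,\big|\, \mathcal{F}_{i-1}\bigr] = 0 .
```

The Lindeberg sum is therefore exactly zero for large N, which is why the response describes the condition as holding automatically.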

Circularity Check

0 steps flagged

No significant circularity; derivations rest on external design stability assumption


The paper posits design stability as a primitive assumption (convergence in probability of the assignment probabilities, or of sample averages of the inverse propensities, to non-random limits) and derives CLTs for the IPW and AIPW estimators under that assumption using standard martingale arguments in the potential outcomes framework. The assumption is independently verified for Wei's adaptive coin design and Efron's biased coin design, but this verification is not load-bearing for the main theorems and does not reduce any result to a fitted quantity or self-citation by construction. No quoted step equates a derived quantity to its own inputs; the analysis remains self-contained against the stated external condition and classical statistical tools.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Relies on the standard potential outcomes framework and introduces design stability as the primary new modeling assumption; no free parameters or invented entities.

axioms (2)

domain assumption: Potential outcomes framework for defining ATE
Standard setup in causal inference invoked throughout the analysis.
ad hoc to paper: Design stability condition on assignment probabilities
The key assumption introduced to obtain the central limit theorems.


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean (J-uniqueness); IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction; washburn_uniqueness_aczel unclear


unclear
Relation between the paper passage and the cited Recognition theorem.

Definition 2 (Strong design stability): p_i → p* in probability; Definition 3 (Weak design stability): averages of 1/p_i and 1/(1-p_i) converge to fixed limits; Theorems 1/4 establish CLTs under these for IPW/AIPW with explicit V_IPW, V_AIPW.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] [1]

On the application of probability theory to agricultural experiments. Essay on principles

Jerzy Neyman. “On the application of probability theory to agricultural experiments. Essay on principles”. In:Statistical Science5.4 (1923). Reprinted from Roczniki Nauk Rolniczych, 1923, pp. 465–480

work page 1923

[2] [2]

The adaptive biased coin design for sequential experiments

L J Wei. “The adaptive biased coin design for sequential experiments”. In:Ann. Stat.6.1 (Jan. 1978), pp. 92–100

work page 1978

[3] [3]

Forcing a sequential experiment to be balanced

Bradley Efron. “Forcing a sequential experiment to be balanced”. In:Biometrika58.3 (1971), pp. 403–417

work page 1971

[4] [4]

Challenges and opportunities with causal discovery algorithms: Appli- cation to Alzheimer’s pathophysiology

Xinpeng Shen et al. “Challenges and opportunities with causal discovery algorithms: Appli- cation to Alzheimer’s pathophysiology”. en. In:Sci. Rep.10.1 (Feb. 2020), p. 2975

work page 2020

[5] [5]

Causal inference with large-scale assessments in education from a Bayesian perspective: a review and synthesis

David Kaplan. “Causal inference with large-scale assessments in education from a Bayesian perspective: a review and synthesis”. en. In:Large Scale Assess. Educ.4.1 (Dec. 2016)

work page 2016

[6] [6]

Welfare analysis meets causal inference

Amy Finkelstein and Nathaniel Hendren. “Welfare analysis meets causal inference”. en. In: J. Econ. Perspect.34.4 (Nov. 2020), pp. 146–167

work page 2020

[7] [7]

Nonparametric estimation of average treatment effects under exogeneity: A review

Guido W Imbens. “Nonparametric estimation of average treatment effects under exogeneity: A review”. en. In:Rev. Econ. Stat.86.1 (Feb. 2004), pp. 4–29

work page 2004

[8] [8]

Dynamic causal effects evaluation in A/B testing with a reinforcement learning framework

Chengchun Shi et al. “Dynamic causal effects evaluation in A/B testing with a reinforcement learning framework”. en. In:J. Am. Stat. Assoc.118.543 (July 2023), pp. 2059–2071

work page 2023

[9] [9]

Springer, 1999

Erich L Lehmann.Elements of Large-Sample Theory. Springer, 1999

work page 1999

[10] [10]

Cambridge University Press, 2000

Aad W van der Vaart.Asymptotic Statistics. Cambridge University Press, 2000. 20

work page 2000

[11] [11]

Oliver & Boyd, 1935

Ronald A Fisher.The Design of Experiments. Oliver & Boyd, 1935

work page 1935

[12] [12]

William G Cochran.Sampling Techniques. 3rd. Wiley, 1977

work page 1977

[13] [13]

Paul R Rosenbaum.Observational Studies. 2nd. Springer, 2002

work page 2002

[14] [14]

Cambridge University Press, 2015

Guido W Imbens and Donald B Rubin.Causal Inference for Statistics, Social, and Biomedical Sciences. Cambridge University Press, 2015

work page 2015

[15] [15]

On the Limiting Distributions of Estimates Based on Samples from Finite Universes

William G Madow. “On the Limiting Distributions of Estimates Based on Samples from Finite Universes”. In:Annals of Mathematical Statistics19.4 (1948), pp. 535–545

work page 1948

[16] [16]

On the Central Limit Theorem for Samples from a Finite Pop- ulation

Paul Erd˝ os and Alfr´ ed R´ enyi. “On the Central Limit Theorem for Samples from a Finite Pop- ulation”. In:Publication of the Mathematical Institute of the Hungarian Academy of Sciences 4 (1959), pp. 49–61

work page 1959

[17] [17]

Limiting Distributions in Simple Random Sampling from a Finite Popula- tion

Jaroslav H´ ajek. “Limiting Distributions in Simple Random Sampling from a Finite Popula- tion”. In:Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5 (1960), pp. 361–374

work page 1960

[18] [18]

Holden-Day, 1975

Erich L Lehmann.Nonparametrics: Statistical Methods Based on Ranks. Holden-Day, 1975

work page 1975

[19] [19]

On Cumulative Sums of Random Variables

Abraham Wald. “On Cumulative Sums of Random Variables”. In:Annals of Mathematical Statistics15.3 (1944), pp. 283–296

work page 1944

[20] [20]

On a theorem by Wald and Wolfowitz

Gottfried E Noether. “On a theorem by Wald and Wolfowitz”. In:Ann. Math. Stat.20.3 (Sept. 1949), pp. 455–458

work page 1949

[21] [21]

A vector form of the Wald-Wolfowitz-Hoeffding theorem

D A S Fraser. “A vector form of the Wald-Wolfowitz-Hoeffding theorem”. In:Ann. Math. Stat.27.2 (June 1956), pp. 540–543

work page 1956

[22] [22]

Some Extensions of the Wald–Wolfowitz–Noether Theorem

Jaroslav H´ ajek. “Some Extensions of the Wald–Wolfowitz–Noether Theorem”. In:Annals of Mathematical Statistics32.2 (1961), pp. 506–523

work page 1961

[23] [23]

Probability inequalities for sums of bounded random variables

Wassily Hoeffding. “Probability inequalities for sums of bounded random variables”. en. In: J. Am. Stat. Assoc.58.301 (Mar. 1963), pp. 13–30

work page 1963

[24] [24]

Weak Convergence ofU-Statistics and Von Mises’ Differentiable Statistical Functions

R. G. Miller and Pranab Kumar Sen. “Weak Convergence ofU-Statistics and Von Mises’ Differentiable Statistical Functions”. en. In:Ann. Math. Statist.43.6 (1972), pp. 31–41.url: http://dml.mathdoc.fr/item/1177692698

work page arXiv 1972

[25] [25]

Large sample randomization inference of causal effects in the presence of interference

Lan Liu and Michael G Hudgens. “Large sample randomization inference of causal effects in the presence of interference”. In: J. Am. Stat. Assoc. 109.505 (Jan. 2014), pp. 288–301

work page 2014

[26] [26]

A randomization-based perspective on analysis of variance: a test statistic robust to treatment effect heterogeneity

Peng Ding and Tirthankar Dasgupta. “A randomization-based perspective on analysis of variance: a test statistic robust to treatment effect heterogeneity”. In: Biometrika 105.1 (Mar. 2018), pp. 45–56

work page 2018

[27] [27]

On Mitigating the Analytical Limitations of Finely Stratified Experiments

Colin B. Fogarty. “On Mitigating the Analytical Limitations of Finely Stratified Experiments”. In: Journal of the Royal Statistical Society Series B: Statistical Methodology 80.5 (Aug. 2018), pp. 1035–1056. issn: 1369-7412. doi: 10.1111/rssb.12290. eprint: https://academic.oup.com/jrsssb/article-pdf/80/5/1035/49269533/jrsssb_80_5_1035.pdf. url: https://doi.or...

work page doi:10.1111/rssb.12290 2018

[28] [28]

General forms of finite population central limit theorems with applications to causal inference

Xinran Li and Peng Ding. “General forms of finite population central limit theorems with applications to causal inference”. In: Journal of the American Statistical Association 112.520 (2017), pp. 1759–1769

work page 2017

[29] [29]

Randomization in Clinical Trials: Theory and Practice

William F Rosenberger and John M Lachin. Randomization in Clinical Trials: Theory and Practice. Wiley, 2016

work page 2016

[30] [30]

Always Valid Inference: Continuous Monitoring of A/B Tests

Ramesh Johari et al. “Always Valid Inference: Continuous Monitoring of A/B Tests”. In: Operations Research 70 (Aug. 2021). doi: 10.1287/opre.2021.2135

work page doi:10.1287/opre.2021.2135 2021

[31] [31]

Chapter 3 - The Econometrics of Randomized Experiments

S. Athey and G.W. Imbens. “Chapter 3 - The Econometrics of Randomized Experiments”. In: Handbook of Field Experiments. Ed. by Abhijit Vinayak Banerjee and Esther Duflo. Vol. 1. Handbook of Economic Field Experiments. North-Holland, 2017, pp. 73–140. doi: https://doi.org/10.1016/bs.hefe.2016.10.003. url: https://www.sciencedirect.com/science/article/pii/S221...

work page doi:10.1016/bs.hefe.2016.10.003 2017

[32] [32]

The Central Limit Theorem

P Hall and C C Heyde. “The Central Limit Theorem”. In: Martingale Limit Theory and its Application. Elsevier, 1980, pp. 51–96

work page 1980

[33] [33]

Efficient adaptive experimental design for average treatment effect estimation

Masahiro Kato et al. “Efficient adaptive experimental design for average treatment effect estimation”. In: (2020). eprint: 2002.05308 (stat.ML)

work page arXiv 2020

[34] [34]

Semiparametric Efficient Inference in Adaptive Experiments

Thomas Cook, Alan Mishler, and Aaditya Ramdas. “Semiparametric Efficient Inference in Adaptive Experiments”. In: Proceedings of the Third Conference on Causal Learning and Reasoning. Ed. by Francesco Locatello and Vanessa Didelez. Vol. 236. Proceedings of Machine Learning Research. PMLR, Jan. 2024, pp. 1033–1064. url: https://proceedings.mlr.press/v2...

work page 2024

[35] [35]

Estimation of Regression Coefficients When Some Regressors are not Always Observed

James M Robins, Andrea Rotnitzky, and Lue Ping Zhao. “Estimation of Regression Coefficients When Some Regressors are not Always Observed”. In: Journal of the American Statistical Association 89.427 (1994), pp. 846–866

work page 1994

[36] [36]

Estimating causal effects of treatments in randomized and nonrandomized studies

Donald B Rubin. “Estimating causal effects of treatments in randomized and nonrandomized studies”. In: J. Educ. Psychol. 66.5 (Oct. 1974), pp. 688–701

work page 1974

[37] [37]

A Generalization of Sampling Without Replacement From a Finite Universe

Daniel G Horvitz and Donovan J Thompson. “A Generalization of Sampling Without Replacement From a Finite Universe”. In: Journal of the American Statistical Association 47.260 (1952), pp. 663–685

work page 1952

[38] [38]

Semiparametric Theory and Missing Data

Anastasios A Tsiatis. Semiparametric Theory and Missing Data. Springer, 2006

work page 2006

[39] [39]

On the stochastic matrices associated with certain queuing processes

F G Foster. “On the stochastic matrices associated with certain queuing processes”. In: Ann. Math. Stat. 24.3 (Sept. 1953), pp. 355–360

work page 1953

[40] [40]

A finite selection model for experimental design of the health insurance study

C. Morris. “A finite selection model for experimental design of the health insurance study”. In: Journal of Econometrics 11 (1979), pp. 43–61

work page 1979

[41] [41]

Rerandomization to Improve Covariate Balance in Experiments

Kari L Morgan and Donald B Rubin. “Rerandomization to Improve Covariate Balance in Experiments”. In: Annals of Statistics 40.2 (2012), pp. 1263–1282

work page 2012

[42] [42]

$Y_i(1)$, and when $K_i = 0$ we have $Y_i = Y_i(0)$. Consequently, $K_i Y_i = K_i Y_i(1)$ and $(1 - K_i) Y_i = (1 - K_i) Y_i(0)$. Thus, the estimators from (6) and (8) simplify to $\hat{\tau}_{\mathrm{IPW}} = \frac{1}{N} \sum_{i=1}^{N} \ldots$

Arun Ravichandran et al. In: Journal of Causal Inference 12.1 (2024), p. 20230046. doi: 10.1515/jci-2023-0046. url: https://doi.org/10.1515/jci-2023-0046.

7 Proofs of Theorems

In this section, we collect the proofs of our main Theorems 1-6. We begin by recalling the IPW and AIPW estimators introduced in (6) and (8), respectively. Before proceeding to th...

work page doi:10.1515/jci-2023-0046 2024

[43] [43]

$A_i^2 + 2 A_i B_i + B_i^2$, where $A_i = \frac{\hat{Y}_{i-1}(1) - \bar{Y}_{i-1}(1)}{p_i}$ and $B_i = \frac{\hat{Y}_{i-1}(0) - \bar{Y}_{i-1}(0)}{1 - p_i}$. Since $\mathbb{E}$

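Before doing this, we first show that

$$\frac{N_1}{N} = \frac{1}{N} \sum_{i=1}^{N} K_i \xrightarrow{p} p^\star. \tag{41}$$

We decompose

$$\frac{1}{N} \sum_{i=1}^{N} K_i = \frac{1}{N} \sum_{i=1}^{N} (K_i - p_i) + \frac{1}{N} \sum_{i=1}^{N} p_i.$$

Under a strongly stable design, since $p_i \xrightarrow{p} p^\star$, the second term, being the Cesàro mean of the sequence $\{p_i\}_{i \ge 1}$, also converges in probability to $p^\star$. Hence, it remains to show that

$$\frac{1}{N} \sum_{i=1}^{N} (K_i - p_i) \xrightarrow{p} 0. \tag{42}$$

S...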

work page
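The convergence of $N_1/N$ to $p^\star$ stated in (41) can be checked numerically. The following is a minimal Python sketch under stated assumptions, not the paper's construction: the allocation rule $p_i = \frac{1}{2}\left(1 - D_{i-1}/(i-1)\right)$, with $D_{i-1}$ the treated-minus-control count after $i-1$ units, is an illustrative adaptive coin rule chosen here, and the function name `simulate_treated_fraction` and the seed are hypothetical.

```python
import random


def simulate_treated_fraction(n_units, seed=0):
    """Simulate one adaptive experiment and return N_1 / N.

    Assumed rule (illustrative only): the treatment probability for
    unit i is 0.5 * (1 - D / (i - 1)), where D is the treated-minus-
    control count among the first i - 1 units.
    """
    rng = random.Random(seed)
    treated = 0
    control = 0
    for i in range(1, n_units + 1):
        if i == 1:
            p_i = 0.5  # no history yet, so use a fair coin
        else:
            imbalance = (treated - control) / (i - 1)
            p_i = 0.5 * (1.0 - imbalance)  # favour the lagging arm
        if rng.random() < p_i:
            treated += 1
        else:
            control += 1
    return treated / n_units


# The treated fraction should settle near 1/2 as N grows.
for n in (100, 10_000, 100_000):
    print(n, simulate_treated_fraction(n))
```

Because the assumed rule pushes $p_i$ back toward $1/2$ whenever the arms are unbalanced, both terms of the decomposition behave as the argument describes: the average of $p_i$ approaches $1/2$ and the average of $K_i - p_i$ shrinks.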

[44] [44]


Therefore, Wei’s adaptive coin design satisfies strong design stability with limiting inclusion probability $p^\star = \frac{1}{2}$.

8.2 Proof of Lemma 2

We begin by showing that Efron’s biased coin design [3] satisfies weak stability. Suppose a total of $k$ units have been assigned to treatment or control. Let $m_k$ and $n_k$ denote, respectively, the number of units assigned t...

work page
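Weak stability asks that the sample average of the inverse propensity scores settles to a constant even when the individual $p_i$ do not converge. A rough Python sketch of this behaviour for a biased coin design follows. The bias value $2/3$, the function name `efron_inverse_propensity_average`, and the seed are assumptions made for illustration and are not taken from the paper.

```python
import random


def efron_inverse_propensity_average(n_units, bias=2 / 3, seed=0):
    """Return (1/N) * sum_i 1/p_i under a biased coin design.

    Assumed rule: p_i = 1/2 when the arms are balanced, p_i = bias
    when treatment lags, and p_i = 1 - bias when treatment leads.
    """
    rng = random.Random(seed)
    diff = 0  # treated minus control among units assigned so far
    inverse_sum = 0.0
    for _ in range(n_units):
        if diff == 0:
            p = 0.5
        elif diff < 0:
            p = bias  # treatment lags, so favour treatment
        else:
            p = 1.0 - bias  # treatment leads, so favour control
        inverse_sum += 1.0 / p
        diff += 1 if rng.random() < p else -1
    return inverse_sum / n_units


# p_i keeps switching among three values, yet the average of 1/p_i
# should stabilise as N grows.
for n in (1_000, 100_000):
    print(n, efron_inverse_propensity_average(n))
```

Under these assumed values the imbalance is a positive recurrent random walk, and a stationary-distribution calculation (balanced a quarter of the time, each arm leading three eighths of the time) suggests the average of $1/p_i$ approaches $2 \cdot \frac{1}{4} + 3 \cdot \frac{3}{8} + \frac{3}{2} \cdot \frac{3}{8} = 2.1875$.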

[45] [45]

$\frac{1}{i-1} \sum_{j=1}^{i-1} \frac{(K_j - p_j) Y_j(1)}{p_j}$. Since $\mathbb{E}$

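Next, we show that $B_N \to 0$. Fix $\varepsilon > 0$. By Assumption 2(c), $\bar{Y}_N(1) \to \bar{Y}_1$, so there exists $K \in \mathbb{N}$ such that for all $i \ge K + 1$, $\left| \bar{Y}_N(1) - \bar{Y}_{i-1}(1) \right| \le 2\varepsilon$. Using the boundedness of $Y_i(1)$ (Assumption 2(b)), we can decompose $B_N$ as

$$B_N = \frac{1}{N} \sum_{i=1}^{K} \left( \bar{Y}_N(1) - \bar{Y}_{i-1}(1) \right)^2 + \frac{1}{N} \sum_{i=K+1}^{N} \left( \bar{Y}_N(1) - \bar{Y}_{i-1}(1) \right)^2.$$

The first term is bounded by $\frac{4 K M^2}{N}$ and the second by $4\varepsilon^2$, yi...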

work page
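The claim $B_N \to 0$ can be illustrated numerically. The short Python sketch below computes $B_N = \frac{1}{N} \sum_{i=1}^{N} \left( \bar{Y}_N(1) - \bar{Y}_{i-1}(1) \right)^2$ for a bounded outcome sequence. The uniform outcomes on $[-1, 1]$, the convention $\bar{Y}_0(1) = 0$, and the function name `cesaro_gap` are assumptions of this sketch rather than part of the paper.

```python
import random


def cesaro_gap(outcomes):
    """Return B_N = (1/N) * sum_i (mean_N - mean_{i-1})^2.

    mean_{i-1} is the running mean of the first i - 1 outcomes;
    mean_0 is set to 0.0 by convention (an assumption of this sketch).
    """
    n = len(outcomes)
    full_mean = sum(outcomes) / n
    running_sum = 0.0
    total = 0.0
    for i, y in enumerate(outcomes, start=1):
        prev_mean = running_sum / (i - 1) if i > 1 else 0.0
        total += (full_mean - prev_mean) ** 2
        running_sum += y
    return total / n


# Bounded outcomes (|Y_i| <= M = 1) whose running mean converges.
rng = random.Random(0)
outcomes = [rng.uniform(-1.0, 1.0) for _ in range(100_000)]
for n in (100, 10_000, 100_000):
    print(n, cesaro_gap(outcomes[:n]))
```

The printed values should shrink as $N$ grows, in line with the two-part bound: the first $K$ terms contribute at most $4KM^2/N$, and the remaining terms are uniformly small once the running mean has stabilised.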