Semiparametric Efficiency in Sequential Experiments: Characterization and Design via Average Propensity
Pith reviewed 2026-07-02 18:12 UTC · model grok-4.3
The pith
Every non-anticipating sequential design induces an average propensity score that sets the semiparametric efficiency bound for causal estimators.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Every non-anticipating design induces an average propensity score, and we establish a semiparametric lower bound: for regular locally unbiased estimators, attainable precision is bounded by the i.i.d. efficiency benchmark evaluated at this induced score. The average propensity score thereby serves as a common benchmark and design target, allowing sequential experimental design to be viewed as choosing or learning an efficient allocation rule, with operational constraints entering through the admissible set when present.
What carries the argument
The average propensity score induced by the design, which serves as the effective treatment probability for the semiparametric efficiency bound.
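As a hedged illustration for the binary-treatment average treatment effect, the two objects in the core claim can be written out as below. The notation is an assumption made here and may differ from the paper's Definition 2: e_t is the instantaneous propensity at step t given the history H_{t-1}, \sigma_a^2(x) is the conditional outcome variance under arm a, and \tau(x) is the conditional treatment effect. V(e) is the classical i.i.d. efficiency bound for the ATE at propensity e.

```latex
% Assumed notation, not taken from the paper.
\bar e_n(x) \;=\; \frac{1}{n}\sum_{t=1}^{n} \mathbb{E}\!\left[\, e_t(x \mid H_{t-1}) \,\right],
\qquad
V(e) \;=\; \mathbb{E}\!\left[ \frac{\sigma_1^2(X)}{e(X)} + \frac{\sigma_0^2(X)}{1-e(X)}
      + \bigl(\tau(X)-\tau\bigr)^2 \right].
```

Under this reading, the claimed lower bound says that the asymptotic variance of a regular estimator, scaled by n, cannot fall below V(\bar e_n).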
If this is right
- Batched adaptive designs that use regression adjustment based on efficient influence functions attain the bound for general smooth estimands under standard nuisance-rate conditions.
- For linear functionals of outcome means the same adjustment achieves a sharp second-order rate.
- Adaptive covariate balancing attains the same bound through the assignment mechanism and permits simple moment-based estimation.
- Both families of designs require only a small number of policy updates and remain compatible with delayed feedback.
- The framework applies directly to multi-treatment settings.
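A minimal simulation sketch of a batched design with few policy updates, under assumptions that are not from the paper: no covariates, Gaussian outcomes, a plug-in Neyman allocation as the update rule, and equal batch sizes. The names `run_batched_design` and `variance_bound` are hypothetical. It shows how a handful of updates determines the realized average propensity at which the no-covariate bound is evaluated.

```python
import random
import statistics


def variance_bound(e, sigma1, sigma0):
    """No-covariate i.i.d. variance bound for the ATE at propensity e."""
    return sigma1 ** 2 / e + sigma0 ** 2 / (1.0 - e)


def run_batched_design(n_batches=4, batch_size=500, sigma1=3.0, sigma0=1.0,
                       clip=0.1, seed=0):
    """Simulate a design whose propensity is updated once per batch."""
    rng = random.Random(seed)
    y1, y0 = [], []       # outcomes observed under treatment / control
    propensities = []     # propensity actually used in each batch
    e = 0.5               # first batch: balanced allocation
    for k in range(n_batches):
        if k > 0:
            # Policy update (assumed rule): plug-in Neyman allocation,
            # clipped away from 0 and 1.
            s1, s0 = statistics.stdev(y1), statistics.stdev(y0)
            e = min(max(s1 / (s1 + s0), clip), 1.0 - clip)
        propensities.append(e)
        for _ in range(batch_size):
            if rng.random() < e:
                y1.append(rng.gauss(1.0, sigma1))
            else:
                y0.append(rng.gauss(0.0, sigma0))
    # With equal batch sizes, the realized average propensity is the plain mean.
    avg_propensity = sum(propensities) / n_batches
    return propensities, avg_propensity


if __name__ == "__main__":
    props, avg = run_batched_design()
    print("per-batch propensities:", props)
    print("bound at average propensity:", variance_bound(avg, 3.0, 1.0))
```

With these settings the design makes three updates after an initial balanced batch, so the average propensity sits between 0.5 and the Neyman target of 0.75.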
Where Pith is reading between the lines
- The characterization could be used to compare efficiency across designs that differ only in their constraint sets.
- It suggests a natural target for online learning algorithms that update allocation rules: convergence of the empirical average propensity to an efficient fixed score.
- In practice the bound supplies a diagnostic: if observed precision falls short, the gap can be attributed either to the induced score or to failure to attain the bound.
- The approach may extend to settings with partial anticipation if an effective average propensity can still be defined.
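The diagnostic reading above can be sketched as a two-part decomposition. This is an illustration under simplifying assumptions (no covariates, known outcome standard deviations, an unconstrained admissible set), and `efficiency_gap` is a hypothetical helper, not something defined in the paper. Here `observed_var` denotes n times the estimator's variance.

```python
def efficiency_gap(observed_var, avg_propensity, sigma1, sigma0):
    """Split the shortfall from the best achievable bound into two parts."""
    def bound(e):
        # No-covariate i.i.d. variance bound at propensity e.
        return sigma1 ** 2 / e + sigma0 ** 2 / (1.0 - e)

    # Unconstrained minimizer of the bound (Neyman allocation).
    e_star = sigma1 / (sigma1 + sigma0)
    return {
        # Loss from the induced score differing from the efficient score.
        "design_gap": bound(avg_propensity) - bound(e_star),
        # Loss from the estimator failing to attain the bound at that score.
        "estimator_gap": observed_var - bound(avg_propensity),
    }


if __name__ == "__main__":
    print(efficiency_gap(12.0, 0.5, 2.0, 1.0))
```

When operational constraints restrict the admissible set, `e_star` would instead be the minimizer over that set.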
Load-bearing premise
The designs must be non-anticipating and the estimators must be regular and locally unbiased.
What would settle it
A non-anticipating sequential design and a regular locally unbiased estimator that achieves strictly lower asymptotic variance than the i.i.d. efficiency bound computed at the design's induced average propensity score.
Original abstract
Modern experiments, including evaluations of AI-enabled services and platform interventions, often depart from independent and identically distributed (i.i.d.) sampling because assignments may be adaptive, balanced across covariates, or subject to rollout constraints such as exposure, fairness, and budget limits. This paper studies the efficiency benchmark for estimating causal targets in such sequential experiments. We show that every non-anticipating design induces an average propensity score, and we establish a semiparametric lower bound: for regular locally unbiased estimators, attainable precision is bounded by the i.i.d. efficiency benchmark evaluated at this induced score. The average propensity score thereby serves as a common benchmark and design target, allowing sequential experimental design to be viewed as choosing or learning an efficient allocation rule, with operational constraints entering through the admissible set when present. We then develop implementable batched adaptive designs that approach this benchmark through two complementary mechanisms. The first uses regression adjustment based on efficient influence functions; for general smooth estimands it attains the benchmark under standard nuisance-rate conditions, while for linear functionals of outcome means it achieves a sharp second-order rate. The second uses adaptive covariate balancing to attain the same benchmark through the assignment mechanism, enabling simple moment-based estimation. Both routes require only a small number of policy updates, making them compatible with delayed feedback and easier to monitor in operational deployments. Numerical experiments and an empirical study of AI medical-assistant evaluation demonstrate the practical efficiency gains, including in multi-treatment settings. Overall, the paper provides a unified framework for characterizing and designing efficient sequential experiments.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that every non-anticipating sequential design induces an average propensity score, and that for regular locally unbiased estimators the semiparametric efficiency lower bound for causal targets equals the classical i.i.d. efficiency bound evaluated at this induced score. It further constructs two families of batched adaptive designs (regression adjustment via efficient influence functions and adaptive covariate balancing) that attain the bound with only a small number of policy updates, under standard nuisance-rate conditions or via the assignment mechanism itself.
Significance. If the lower-bound result holds, the work supplies a clean unification of sequential experimental design with semiparametric efficiency theory, showing that design effort can be reduced to targeting an appropriate average propensity while respecting operational constraints. The two attainment routes (EIF-based adjustment and balancing) are practically relevant for delayed-feedback settings common in platform and AI experiments.
major comments (2)
- [§3, Theorem 1] (lower bound): The argument that the tangent space remains identical to the i.i.d. case under non-anticipating but adaptive assignments needs an explicit verification that the filtration does not enlarge the set of regular parametric submodels. The current sketch appears to invoke the classical definition of local unbiasedness without re-deriving the score under the sequential sigma-field.
- [§4.2, Proposition 2] (second-order rate for linear functionals): The claim of a sharp second-order remainder requires that the nuisance estimators satisfy the product-rate condition uniformly over the adaptive propensity sequence. The proof sketch does not display the uniform integrability argument needed when the propensity is itself data-dependent.
minor comments (2)
- [Definition 2] Notation for the induced average propensity (Definition 2) should be distinguished more clearly from the instantaneous propensity; a short remark on measurability with respect to the filtration would help.
- [Numerical experiments] The numerical experiments section would benefit from an explicit statement of the number of Monte Carlo replications and the precise metric used to compare against the i.i.d. benchmark.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. The two major comments identify places where the proof sketches can be strengthened with additional explicit arguments. We will revise the manuscript to address both points.
Point-by-point responses
- Referee: [§3, Theorem 1] (lower bound): The argument that the tangent space remains identical to the i.i.d. case under non-anticipating but adaptive assignments needs an explicit verification that the filtration does not enlarge the set of regular parametric submodels. The current sketch appears to invoke the classical definition of local unbiasedness without re-deriving the score under the sequential sigma-field.
  Authors: We agree that the sketch in Theorem 1 would benefit from a more explicit derivation. Because assignments are non-anticipating, the local parametric submodels perturb only the conditional outcome distributions given the observed history; the resulting scores therefore coincide exactly with those of the classical i.i.d. tangent space. In the revision we will insert a self-contained paragraph that re-derives the score functions under the sequential sigma-field and verifies that no additional directions are introduced by the filtration.
  Revision: yes
- Referee: [§4.2, Proposition 2] (second-order rate for linear functionals): The claim of a sharp second-order remainder requires that the nuisance estimators satisfy the product-rate condition uniformly over the adaptive propensity sequence. The proof sketch does not display the uniform integrability argument needed when the propensity is itself data-dependent.
  Authors: The referee correctly notes that the sketch omits an explicit uniform-integrability step. Under the maintained boundedness of the propensities away from zero and one, together with the product-rate condition on the nuisance estimators, the second-order remainder is dominated by an integrable sequence that does not depend on the realized adaptive path. We will add this domination argument to the proof of Proposition 2 so that the o_p(n^{-1/2}) claim holds uniformly over the data-dependent sequence.
  Revision: yes
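The score argument in the response on Theorem 1 can be illustrated with the standard likelihood factorization for sequential data. This is a sketch under assumed notation, not the authors' proof: O_t = (X_t, A_t, Y_t) is the observation at step t, H_{t-1} = (O_1, ..., O_{t-1}) is the history, units are drawn i.i.d. from the population, and the submodel indexed by \epsilon perturbs only the covariate and outcome laws.

```latex
% Assumed notation; the design-controlled factor e_t does not depend on \epsilon.
p_\epsilon(O_1,\dots,O_n)
  \;=\; \prod_{t=1}^{n} p_\epsilon(X_t)\; e_t(A_t \mid X_t, H_{t-1})\; p_\epsilon(Y_t \mid A_t, X_t),
\qquad
\left.\frac{\partial}{\partial \epsilon} \log p_\epsilon \right|_{\epsilon=0}
  \;=\; \sum_{t=1}^{n} \bigl[\, s_X(X_t) + s_Y(Y_t \mid A_t, X_t) \,\bigr].
```

Because the assignment factor is fixed by the design, it drops out of the derivative, and each summand has the same form as a score in the i.i.d. model.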
Circularity Check
No circularity: standard semiparametric bound applied to induced average propensity
The paper's core result states that non-anticipating designs induce an average propensity score and that the semiparametric efficiency bound for regular locally unbiased estimators is the classical i.i.d. bound evaluated at that score. This is a direct application of existing semiparametric theory (tangent space, influence functions) to a derived marginal quantity; no equation reduces a claimed prediction to a fitted input by construction, no uniqueness theorem is imported from self-citation, and no ansatz is smuggled. The derivation chain remains self-contained against external benchmarks and does not rely on the paper's own fitted quantities or prior results by the same authors as load-bearing steps.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: estimators are regular and locally unbiased.