Adaptive Experimentation for Censored Survival Outcomes

Dennis Frauen; Emil Javurek; Jonas Schweisthal; Maresa Schröder; Stefan Feuerriegel; Yuxin Wang

arxiv: 2605.18459 · v1 · pith:JOOQO7QRnew · submitted 2026-05-18 · 💻 cs.LG · stat.ML

Yuxin Wang, Dennis Frauen, Jonas Schweisthal, Maresa Schröder, Emil Javurek, Stefan Feuerriegel

Pith reviewed 2026-05-20 12:26 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords: adaptive experimentation, survival analysis, right censoring, efficiency bounds, treatment allocation, causal inference, machine learning, semiparametric estimation

The pith

The paper derives a closed-form efficiency-optimal treatment allocation policy for estimating average survival effects under right censoring.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a framework for adaptive experiments that estimate causal survival effects when event times are only partially observed due to right censoring. It derives the semiparametric efficiency bound for the average survival effect curve expressed as a function of the allocation policy, then solves for the policy that minimizes that bound. The resulting policy generalizes classical Neyman allocation by directing more samples toward patient strata where both the event process and the censoring process create high uncertainty. An adaptive estimator called ASE is built on this policy, allowing any machine learning method for nuisance functions while delivering asymptotic normality through a martingale central limit theorem.
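The sequential idea described above (assign with the current policy, update nuisance estimates, refresh the policy) can be sketched in a much simpler setting. The following is a minimal sketch, not the paper's ASE: it assumes fully observed Gaussian outcomes, no covariates, a plain inverse-probability-weighted score, and classical Neyman allocation as the target policy. The function name, its parameters, and the simulated inputs are all hypothetical.

```python
import numpy as np

def adaptive_neyman_experiment(n_rounds, batch, mu, sigma, alpha=0.1, seed=0):
    """Toy adaptive experiment with fully observed Gaussian outcomes.

    mu, sigma: length-2 sequences of arm means / standard deviations
    (hypothetical simulation inputs). Returns (effect estimate, final policy).
    """
    rng = np.random.default_rng(seed)
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    pi = 0.5                                  # start from uniform randomization
    scores, outcomes = [], {0: [], 1: []}
    for _ in range(n_rounds):
        a = (rng.random(batch) < pi).astype(int)   # assign with the current policy
        y = rng.normal(mu[a], sigma[a])
        # IPW score, weighted by the policy that actually generated this batch
        scores.append(a * y / pi - (1 - a) * y / (1 - pi))
        for arm in (0, 1):
            outcomes[arm].extend(y[a == arm])
        if min(len(outcomes[0]), len(outcomes[1])) >= 2:
            s1 = np.std(outcomes[1], ddof=1)
            s0 = np.std(outcomes[0], ddof=1)
            # Classical Neyman allocation, clipped away from 0 and 1
            pi = float(np.clip(s1 / (s1 + s0), alpha, 1 - alpha))
    return float(np.concatenate(scores).mean()), pi

estimate, final_pi = adaptive_neyman_experiment(
    200, 50, mu=(0.0, 1.0), sigma=(1.0, 3.0)
)
print(f"effect estimate: {estimate:.3f}, final allocation to treatment: {final_pi:.3f}")
```

Because each batch is weighted by the policy that was in force when it was collected, the centered scores form a martingale difference sequence, which is the structure the martingale central limit theorem relies on. The censoring-aware policy and the machine-learning nuisance models of the paper are deliberately left out of this sketch.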

Core claim

The semiparametric efficiency bound for the average survival effect curve is derived explicitly as a function of the treatment allocation policy, producing a closed-form efficiency-optimal allocation that prioritizes strata in which event and censoring dynamics jointly induce high uncertainty; the Adaptive Survival Estimator then implements this policy sequentially while estimating the curve.

What carries the argument

The semiparametric efficiency bound for the average survival effect curve expressed as a function of the allocation policy, which is minimized to obtain the closed-form optimal policy.
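As a point of reference for "minimize the bound over the policy", the classical uncensored case can be worked out in one line. This is a sketch of the textbook Neyman calculation for a single stratum, not the paper's censored bound; the symbols $\sigma_1, \sigma_0$ (arm-specific outcome standard deviations) and $V(\pi)$ are notation assumed here for illustration.

```latex
\begin{align*}
V(\pi) &= \frac{\sigma_1^2}{\pi} + \frac{\sigma_0^2}{1-\pi}, \qquad \pi \in (0,1), \\
V'(\pi) &= -\frac{\sigma_1^2}{\pi^2} + \frac{\sigma_0^2}{(1-\pi)^2} = 0
  \;\Longrightarrow\; \pi^\star = \frac{\sigma_1}{\sigma_1 + \sigma_0}, \\
V(\pi^\star) &= (\sigma_1 + \sigma_0)^2 \;\le\; 2\,(\sigma_1^2 + \sigma_0^2) = V(1/2).
\end{align*}
```

The inequality is strict unless $\sigma_1 = \sigma_0$. According to the summary above, the paper's policy has the same flavor, with the variance terms replaced by quantities that depend on both the event and the censoring dynamics.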

If this is right

The optimal policy produces lower-variance estimates of survival effects than uniform randomization when censoring is present.
The framework admits asymptotic normality of the estimators under the martingale central limit theorem.
Arbitrary machine learning models can be plugged in for nuisance estimation without breaking the theoretical guarantees.
Efficiency gains are observed over both uniform allocation and methods that ignore the censoring mechanism.
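The first and last implications can be illustrated with a deliberately simplified calculation. This is a sketch under strong assumptions, not the paper's bound: it uses a single stratum, a fixed horizon t, independent censoring with known probability G of remaining uncensored, and a plain inverse-probability-of-censoring-weighted (IPCW) indicator whose per-sample variance is S/G - S². All numbers and function names are hypothetical.

```python
import numpy as np

def ipcw_variance(S, G):
    """Variance of 1{min(T, C) > t} / G when P(T > t) = S, P(C > t) = G, T independent of C."""
    return S / G - S**2

def neyman_type_allocation(v1, v0):
    """Minimizer of v1 / pi + v0 / (1 - pi) over pi in (0, 1)."""
    return np.sqrt(v1) / (np.sqrt(v1) + np.sqrt(v0))

def objective(pi, v1, v0):
    """Scaled variance of the difference in arm-wise estimates under allocation pi."""
    return v1 / pi + v0 / (1.0 - pi)

S1, S0 = 0.8, 0.4   # hypothetical survival probabilities at the horizon
G1, G0 = 0.3, 0.9   # hypothetical probabilities of remaining uncensored

# Censoring-aware variances vs. variances that pretend there is no censoring
v1, v0 = ipcw_variance(S1, G1), ipcw_variance(S0, G0)
v1_naive, v0_naive = S1 * (1 - S1), S0 * (1 - S0)

pi_aware = neyman_type_allocation(v1, v0)
pi_agnostic = neyman_type_allocation(v1_naive, v0_naive)

for name, pi in [("uniform", 0.5), ("censoring-agnostic", pi_agnostic),
                 ("censoring-aware", pi_aware)]:
    print(f"{name:>20}: pi = {pi:.3f}, scaled variance = {objective(pi, v1, v0):.3f}")
```

In this toy setting the heavily censored arm needs more samples, so the censoring-aware allocation moves above one half while the censoring-agnostic rule moves in the opposite direction and ends up worse than uniform randomization.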

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same efficiency-bound derivation could be applied to left-censored or interval-censored outcomes by adjusting the influence function accordingly.
In practice the policy might allow shorter clinical trials by concentrating samples where uncertainty is highest rather than spreading them evenly.
If nuisance estimation rates fall short, the efficiency gain over uniform allocation would shrink but the procedure would still remain consistent.
The approach raises the question of how to adapt the policy when treatment effects themselves vary strongly across strata.

Load-bearing premise

Nuisance functions for conditional survival and censoring distributions can be estimated at rates fast enough for the efficiency bound and asymptotic normality to hold when arbitrary machine learning models are used.

What would settle it

An experiment in which the derived allocation policy is followed yet the resulting estimator for the average survival effect curve exhibits variance larger than that obtained under uniform randomization, for the same censoring distribution and sample size.
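A check of this kind can be mocked up in simulation before running it on real data. The sketch below is an assumption-laden stand-in, not the paper's ASE or its derived policy: it uses exponential event and censoring times, a known censoring distribution, an arm-wise IPCW estimate of the survival difference at a fixed horizon, and the allocation that is optimal for that toy estimator. All names and parameter values are hypothetical.

```python
import numpy as np

def ipcw_difference(pi, n, event_rates, cens_rates, t, rng):
    """One simulated trial: IPCW estimate of S_t(1) - S_t(0) with known censoring."""
    a = rng.random(n) < pi                              # True = treated
    ev = np.where(a, event_rates[1], event_rates[0])    # per-unit event hazard
    ce = np.where(a, cens_rates[1], cens_rates[0])      # per-unit censoring hazard
    event_time = rng.exponential(1.0 / ev)
    cens_time = rng.exponential(1.0 / ce)
    still_at_risk = np.minimum(event_time, cens_time) > t
    weight = still_at_risk / np.exp(-ce * t)            # divide by known P(C > t)
    return weight[a].mean() - weight[~a].mean()

def empirical_variance(pi, reps=3000, n=500, seed=0, **setup):
    """Variance of the estimate across repeated simulated trials."""
    rng = np.random.default_rng(seed)
    draws = [ipcw_difference(pi, n, rng=rng, **setup) for _ in range(reps)]
    return float(np.var(draws, ddof=1))

setup = dict(event_rates=(0.7, 0.7), cens_rates=(0.1, 2.0), t=1.0)

# Allocation that minimizes the toy estimator's variance (not the paper's policy)
S = np.exp(-np.array(setup["event_rates"]) * setup["t"])
G = np.exp(-np.array(setup["cens_rates"]) * setup["t"])
v = S / G - S**2
pi_star = float(np.sqrt(v[1]) / (np.sqrt(v[1]) + np.sqrt(v[0])))

var_uniform = empirical_variance(0.5, **setup)
var_star = empirical_variance(pi_star, **setup)
print(f"pi_star = {pi_star:.3f}")
print(f"variance under uniform: {var_uniform:.5f}, under pi_star: {var_star:.5f}")
```

The same censoring distribution and sample size are used for both allocations, which mirrors the comparison described above. A result in which the optimized allocation produced the larger variance would count against the policy in this toy setting.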

Figures

Figures reproduced from arXiv: 2605.18459 by Dennis Frauen, Emil Javurek, Jonas Schweisthal, Maresa Schröder, Stefan Feuerriegel, Yuxin Wang.

**Figure 1.** Our semi-parametric framework for adaptive experimentation with censored survival outcomes. … thus leading to an allocation policy based on outcome variability, such as Neyman allocation [e.g., 21, 28, 10, 13]. However, both paradigms are developed for fully observed outcomes and are not directly applicable to survival data with censoring, where event times are only partially observed (e.g., overall surviva…

**Figure 2.** Censoring-aware A-optimal policy π⋆(X) vs. censoring hazard. Consider the case where both arms share the same censoring hazard (i.e., λ^G_t(x, 0) = λ^G_t(x, 1) =: λ^G), yet the event dynamics differ (i.e., λ^S_t(x, 0) ≠ λ^S_t(x, 1)). One might expect that equal censoring renders π⋆ insensitive to λ^G, and that one would recover the classical Neyman allocation. However, …

**Figure 3.** Synthetic experiments. (a, left) Relative MSE with respect to Oracle; ASE achieves the lowest error among baselines. (b, middle) MSE across rounds; ASE converges fastest while ASE-MS remains consistent under nuisance misspecification. (c, right) Empirical coverage of nominal 95% CIs; ASE achieves nominal coverage while plug-in and A2IPW-Naïve deteriorate as R grows. Baselines: We compare our estimator (ASE…

**Figure 4.** Optimal allocation π⋆ under arm-dependent censoring ratio g(x) = G_t(x, 0)/G_t(x, 1): π⋆ departs from uniform allocation 1/2 and from Neyman allocation as the censoring asymmetry increases. Proof. We derive the closed-form expression for π⋆ claimed in the motivating illustration above. Assume that the event dynamics may differ across arms and that the censoring survival functions have a time-invariant ratio: S…

**Figure 5.** Twins data: Adaptive allocation improves estimation efficiency, accuracy, and inferential validity on the semi-synthetic Twins dataset. Left: relative MSE with respect to the Oracle; ASE remains closest to oracle performance among all feasible estimators. Middle: MSE across rounds; ASE converges fastest and ASE-MS remains consistent under nuisance misspecification. Right: empirical coverage of nominal 95% …

**Figure 6.** Estimated average survival curves Ŝ_t(a) for treatment (a = 1, blue) and control (a = 0, orange) produced by ASE at increasing sample sizes (dashed lines), compared against ground-truth oracle curves S_t(a) (solid lines with dots). Shaded bands denote ±1 standard error across repeated trials. (a) Synthetic data, t ∈ {0, …, 4}, n ∈ {1000, 1500, 2000}. (b) Semi-synthetic Twins data, t ∈ {0, …, 3}, …

Abstract

Adaptive experimentation enables efficient estimation of causal effects, but existing methods are not designed for survival data with censoring, where event times are only partially observed (e.g., overall survival in cancer trials but with dropout). In this paper, we develop a novel framework for adaptive experimentation to estimate causal effects under right censoring. For this, we derive the semiparametric efficiency bound for the average survival effect curve as a function of the treatment allocation policy and thereby obtain a closed-form efficiency-optimal allocation policy. The policy generalizes classical Neyman allocation to survival settings by prioritizing patient strata where both event and censoring dynamics induce high uncertainty. Building on this, we propose the Adaptive Survival Estimator (ASE), an adaptive framework that learns the allocation policy and estimates the average survival effect curve sequentially. Our framework has three main benefits: (i) it accommodates arbitrary machine learning models for nuisance estimation; (ii) it is guided by a closed-form efficiency-optimal allocation policy; and (iii) it admits strong theoretical guarantees, including asymptotic normality via a martingale central limit theorem. We demonstrate our framework across various numerical experiments to show consistent efficiency gains over uniform randomization and censoring-agnostic baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper derives a closed-form efficiency-optimal allocation policy for adaptive experiments with right-censored survival data.

The key takeaway is that the authors work out a closed-form efficiency-optimal allocation policy specifically for adaptive experiments with right-censored survival data. This is the central new piece: it extends classical ideas such as Neyman allocation to account for both the event dynamics and the censoring process.

What the paper does well is lay out the Adaptive Survival Estimator (ASE) framework. It derives the semiparametric efficiency bound for the average survival effect curve in terms of the treatment allocation policy and, from there, obtains an explicit policy that prioritizes patient strata where event and censoring dynamics jointly induce high uncertainty. The framework allows plugging in any machine learning model for the nuisance functions, such as the conditional survival and censoring distributions. The authors also provide asymptotic normality results via a martingale central limit theorem, which is appropriate for the sequential nature of the data collection. The numerical experiments indicate consistent efficiency improvements over uniform randomization and over methods that do not account for censoring.

The main soft spot concerns the conditions on the nuisance estimators. The derivation assumes that these functions can be estimated at rates faster than n^{-1/4}, even though data collection is adaptive and the observations are therefore dependent. Standard results for machine learning estimators assume i.i.d. data, so it is not immediate that they carry over. The paper would benefit from a more explicit discussion of how the adaptive policy updates affect the entropy conditions or Donsker properties needed to attain the efficiency bound. If those details are handled carefully in the full proofs, the claims stand on firmer ground.

Overall, this paper is for methodologists working on causal inference for time-to-event data or on adaptive clinical trial designs. A reader interested in extending adaptive experimentation to survival settings would get value from the closed-form policy and the theoretical setup. It deserves a serious referee because the contribution is targeted and the experiments provide some empirical support, though the theoretical guarantees would be the main point of scrutiny.

Referee Report

2 major / 2 minor

Summary. The manuscript develops a framework for adaptive experimentation under right censoring for survival outcomes. It derives the semiparametric efficiency bound for the average survival effect curve as a functional of the treatment allocation policy, yielding a closed-form efficiency-optimal allocation policy that generalizes Neyman allocation by prioritizing strata with high uncertainty from both event and censoring dynamics. The Adaptive Survival Estimator (ASE) sequentially learns this policy while estimating the effect curve, accommodating arbitrary machine learning models for nuisance functions (conditional survival and censoring distributions) and establishing asymptotic normality via a martingale central limit theorem. Numerical experiments demonstrate efficiency gains relative to uniform randomization and censoring-agnostic methods.

Significance. If the central results hold, the work would meaningfully extend adaptive experimental design to censored survival settings prevalent in clinical trials and reliability studies. Strengths include the closed-form optimal policy, explicit accommodation of flexible ML nuisance estimators, and use of martingale CLT to handle the non-i.i.d. adaptive sampling; these elements support potential for more efficient causal estimation under partial observation.

major comments (2)

[§3.2] (Efficiency Bound and Optimal Policy): The derivation of the semiparametric efficiency bound and the closed-form optimal allocation policy requires nuisance estimators (S(t|X,A) and G(t|X,A)) to converge faster than n^{-1/4} uniformly over the adaptive policy sequence. Because policies are updated sequentially, observations are dependent; the manuscript does not supply conditions ensuring that standard ML estimators satisfy the requisite Donsker or entropy-integral conditions under this dependence, which is load-bearing for attaining the bound and the subsequent martingale CLT.
[Theorem 4] (Asymptotic Normality): The application of the martingale CLT assumes the score process forms a martingale difference sequence with respect to the filtration generated by past data and policy updates. The paper should explicitly verify that the adaptive dependence does not violate the Lindeberg or conditional variance convergence conditions needed for the asymptotic variance to equal the derived efficiency bound.
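For readers who want the two conditions named in the second comment spelled out, a standard textbook form of the martingale central limit theorem is sketched below. This is the generic statement for a martingale difference array, not a transcription of the paper's Theorem 4; the symbols $\xi_{n,i}$, $\mathcal{F}_{n,i}$, and $\sigma^2$ are notation assumed here.

```latex
\begin{align*}
&\text{Martingale difference:} &&
  \mathbb{E}\!\left[\xi_{n,i} \mid \mathcal{F}_{n,i-1}\right] = 0
  \quad \text{for } i = 1, \dots, n, \\
&\text{Conditional variance convergence:} &&
  \sum_{i=1}^{n} \mathbb{E}\!\left[\xi_{n,i}^{2} \mid \mathcal{F}_{n,i-1}\right]
  \xrightarrow{\;p\;} \sigma^{2}, \\
&\text{Conditional Lindeberg condition:} &&
  \sum_{i=1}^{n} \mathbb{E}\!\left[\xi_{n,i}^{2}\,
  \mathbf{1}\{|\xi_{n,i}| > \varepsilon\} \mid \mathcal{F}_{n,i-1}\right]
  \xrightarrow{\;p\;} 0 \quad \text{for all } \varepsilon > 0, \\
&\text{Conclusion:} &&
  \sum_{i=1}^{n} \xi_{n,i} \xrightarrow{\;d\;} \mathcal{N}(0, \sigma^{2}).
\end{align*}
```

In an adaptive experiment, $\xi_{n,i}$ would typically be a scaled, centered score for unit $i$ and $\mathcal{F}_{n,i-1}$ the history that determines the policy used for that unit. The referee's request amounts to checking the second and third lines with $\sigma^2$ equal to the efficiency bound.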

minor comments (2)

[§2] The notation for the survival effect curve θ(π) should be introduced with an explicit contrast to the standard average treatment effect to clarify the role of the censoring distribution.
[Figure 3] Add pointwise confidence bands or standard-error shading to the efficiency-gain curves so readers can assess whether reported improvements are statistically distinguishable from the baselines.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address each of the major comments below and will incorporate revisions to strengthen the theoretical foundations as suggested.

Referee: [§3.2] (Efficiency Bound and Optimal Policy): The derivation of the semiparametric efficiency bound and the closed-form optimal allocation policy requires nuisance estimators (S(t|X,A) and G(t|X,A)) to converge faster than n^{-1/4} uniformly over the adaptive policy sequence. Because policies are updated sequentially, observations are dependent; the manuscript does not supply conditions ensuring that standard ML estimators satisfy the requisite Donsker or entropy-integral conditions under this dependence, which is load-bearing for attaining the bound and the subsequent martingale CLT.

Authors: We agree with the referee that additional conditions are needed to ensure the nuisance estimators achieve the required convergence rates uniformly over the adaptive policy sequence. In the revised version, we will introduce explicit assumptions on the nuisance function estimators that guarantee the n^{-1/4} rate under the dependence structure induced by sequential policy updates. We will also provide a discussion on how these conditions can be satisfied by standard machine learning estimators, drawing from results in the adaptive design literature.
Revision: yes

Referee: [Theorem 4] (Asymptotic Normality): The application of the martingale CLT assumes the score process forms a martingale difference sequence with respect to the filtration generated by past data and policy updates. The paper should explicitly verify that the adaptive dependence does not violate the Lindeberg or conditional variance convergence conditions needed for the asymptotic variance to equal the derived efficiency bound.

Authors: We appreciate this point. While the martingale difference property holds by construction due to the filtration being generated by past observations and policies, we acknowledge that the Lindeberg condition and conditional variance convergence require explicit verification in the adaptive setting. In the revision, we will expand the proof of Theorem 4 to include these verifications, showing that the boundedness of the relevant functions and the consistency of the policy updates ensure the conditions are met, thereby confirming the asymptotic variance equals the efficiency bound.
Revision: yes

Circularity Check

0 steps flagged

No circularity: derivation applies standard semiparametric theory to survival functional

full rationale

The paper states it derives the semiparametric efficiency bound for the average survival effect curve as a functional of the allocation policy, then obtains a closed-form optimal policy that generalizes Neyman allocation. This follows from standard semiparametric efficiency theory under right censoring; the bound and policy are not shown to reduce to fitted quantities or self-citations by construction. Nuisance estimation rates and martingale CLT are invoked as external assumptions rather than derived tautologically from the target result. No load-bearing self-citation chains or ansatz smuggling appear in the provided derivation outline. The framework remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard semiparametric efficiency theory for censored data and the ability to estimate nuisances consistently; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (2)

Domain assumption: Semiparametric efficiency bound exists and can be derived for the average survival effect curve under right censoring as a function of the allocation policy.
Invoked to obtain the closed-form efficiency-optimal allocation policy and asymptotic normality.
Domain assumption: Nuisance functions can be estimated at rates sufficient for the martingale central limit theorem to apply using arbitrary machine learning models.
Required for the strong theoretical guarantees stated in the abstract.


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We derive the semiparametric efficiency bound for the average survival effect curve as a function of the treatment allocation policy and thereby obtain a closed-form efficiency-optimal allocation policy."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

79 extracted references · 79 canonical work pages

[1] A. C. Atkinson, A. N. Donev, and R. Tobias. Optimum experimental designs, with SAS. Number 34 in Oxford Statistical Science Series. Oxford University Press, 2023.
[2] A. Barker, C. Sigman, G. Kelloff, N. Hylton, D. Berry, and L. Esserman. I-SPY 2: An adaptive breast cancer trial design in the setting of neoadjuvant chemotherapy. Clinical Pharmacology & Therapeutics, 86(1):97–100, 2009.
[3] D. A. Berry. Adaptive clinical trials: The promise and the caution. Journal of Clinical Oncology, 29(6):606–609, 2011.
[4] S. M. Berry, B. P. Carlin, J. J. Lee, and P. Muller. Bayesian Adaptive Methods for Clinical Trials. CRC Press, 2010.
[5] W. Cai and M. J. van der Laan. One-step targeted maximum likelihood estimation for time-to-event outcomes. Biometrics, 76(3):722–733, 2020.
[6] E. Candès, L. Lei, and Z. Ren. Conformalized survival analysis. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85(1):24–45, 2023.
[7] F. Chen, S. Ge, J. Qian, and C. Harshaw. Sigmoid-FTRL: Design-based adaptive Neyman allocation for AIPW estimators. arXiv preprint, arXiv:2511.19905, 2025.
[8] S.-C. Chow and M. Chang. Adaptive design methods in clinical trials – a review. Orphanet Journal of Rare Diseases, 3:11, 2008.
[9] T. F. Cloughesy, B. M. Alexander, D. A. Berry, H. Colman, J. F. De Groot, B. M. Ellingson, G. B. Gordon, M. Khasraw, A. B. Lassman, E. Q. Lee, M. Lim, I. K. Mellinghoff, A. Nelli, J. R. Perry, E. P. Sulman, K. Tanner, M. Weller, P. Y. Wen, W. K. A. Yung, and GBM AGILE Investigators. GBM AGILE: A global, phase 2/3 adaptive platform trial to evaluate multi...
[10] T. Cook, A. Mishler, and A. Ramdas. Semiparametric efficient inference in adaptive experiments. In Proceedings of the Third Conference on Causal Learning and Reasoning, pages 1033–1064. PMLR, 2024.
[11] Y. Cui, M. R. Kosorok, E. Sverdrup, S. Wager, and R. Zhu. Estimating heterogeneous treatment effects with right-censored data via causal survival forests. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85(2):179–211, 2023.
[12] A. Curth and M. van der Schaar. Nonparametric estimation of heterogeneous treatment effects: From theory to learning algorithms. arXiv preprint, arXiv:2101.10943, 2021.
[13] J. Dai, P. Gradu, and C. Harshaw. CLIP-OGD: An experimental design for adaptive Neyman allocation in sequential experiments. In NeurIPS, 2023.
[14] A. Dalal, P. Blöbaum, S. Kasiviswanathan, and A. Ramdas. Anytime-valid inference for double/debiased machine learning of causal parameters. arXiv preprint, arXiv:2408.09598, 2024.
[15] H. Davidov, S. Feldman, G. Shamai, R. Kimmel, and Y. Romano. Conformalized survival analysis for general right-censored data. In ICLR, 2025.
[16] A. Dvoretzky. Asymptotic normality for sums of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory, volume 6.2, pages 513–536. University of California Press, 1972.
[17] D. Frauen, M. Schröder, K. Hess, and S. Feuerriegel. Orthogonal survival learners for estimating heterogeneous treatment effects from time-to-event data. In NeurIPS, 2025.
[18] Z. Gao and T. Hastie. Estimating heterogeneous treatment effects for general responses. arXiv preprint, arXiv:2103.04277, 2022.
[19] Y. Gui, R. Hore, Z. Ren, and R. F. Barber. Conformalized survival analysis with adaptive cut-offs. Biometrika, 111(2):459–477, 2024.
[20] F. Guidance. Adaptive design clinical trials for drugs and biologics. Biotechnol Law Rep, 29(2):173, 2010.
[21] J. Hahn, K. Hirano, and D. Karlan. Adaptive experimental design using the propensity score. Journal of Business & Economic Statistics, 29(1):96–108, 2011.
[22] N. C. Henderson, T. A. Louis, G. L. Rosner, and R. Varadhan. Individualized treatment effects with censored data via fully nonparametric Bayesian accelerated failure time models. Biostatistics, 21(1):50–68, 2020.
[23] S. R. Howard, A. Ramdas, J. McAuliffe, and J. Sekhon. Time-uniform Chernoff bounds via nonnegative supermartingales. Probability Surveys, 17, 2020.
[24] F. Hu and W. F. Rosenberger. The Theory of Response-Adaptive Randomization in Clinical Trials. John Wiley & Sons, 2006.
[25] L. Hu, J. Ji, and F. Li. Estimating heterogeneous survival treatment effect in observational data using machine learning. Statistics in Medicine, 40(21):4691–4713, 2021.
[26] G. W. Imbens. Nonparametric estimation of average treatment effects under exogeneity: A review. The Review of Economics and Statistics, 86(1):4–29, 2004.
[27] G. W. Imbens and D. B. Rubin. Causal inference in statistics, social, and biomedical sciences. Cambridge University Press, 2015.
[28] M. Kato, A. Oga, W. Komatsubara, and R. Inokuchi. Active adaptive experimental design for treatment effect estimation with covariate choices. arXiv preprint, arXiv:2403.03589, 2024.
[29] J. L. Katzman, U. Shaham, A. Cloninger, J. Bates, T. Jiang, and Y. Kluger. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1):24, 2018.
[30] E. Kaufmann and A. Garivier. Learning the distribution with largest mean: two bandit frameworks. ESAIM: Proceedings and Surveys, 60:114–131, 2017.
[31] E. S. Kim, R. S. Herbst, I. I. Wistuba, J. J. Lee, G. R. Blumenschein, A. Tsao, D. J. Stewart, M. E. Hicks, J. Erasmus, S. Gupta, C. M. Alden, S. Liu, X. Tang, F. R. Khuri, H. T. Tran, B. E. Johnson, J. V. Heymach, L. Mao, F. Fossella, M. S. Kies, V. Papadimitrakopoulou, S. E. Davis, S. M. Lippman, and W. K. Hong. The BATTLE trial: Personalizing therapy...
[32] J. P. Klein and M. L. Moeschberger. Survival Analysis: Techniques for Censored and Truncated Data. Statistics for Biology and Health. Springer, 2003.
[33] T. Lattimore and C. Szepesvári. Bandit Algorithms. Cambridge University Press, 1 edition, 2020.
[34] J. Lee and C. Ma. Off-policy estimation with adaptively collected data: the power of online learning. In NeurIPS, 2024.
[35] J. Li, D. Simchi-Levi, and Y. Zhao. Optimal adaptive experimental design for estimating treatment effect. arXiv preprint, arXiv:2410.05552, 2024.
[36] M. Lindon and N. Kallus. Anytime-valid inference under outcome delay: A design-based approach. arXiv preprint, arXiv:2603.25971, 2026.
[37] C. Louizos, U. Shalit, J. Mooij, D. Sontag, R. Zemel, and M. Welling. Causal effect inference with deep latent-variable models. In NeurIPS, 2017.
[38] H. Mao, L. Li, W. Yang, and Y. Shen. On the propensity score weighting analysis with survival outcome: Estimands, estimation, and inference. Statistics in Medicine, 37(26):3745–3763, 2018.
[39] A. Mukherjee, S. Jana, and S. Coad. Covariate-adjusted response-adaptive designs for semi-parametric survival models. Statistical Methods in Medical Research, 34(9):1697–1723, 2025.
[40] O. Neopane, A. Ramdas, and A. Singh. Logarithmic Neyman regret for adaptive estimation of the average treatment effect. arXiv preprint, 2024.
[41] O. Neopane, A. Ramdas, and A. Singh. Optimistic algorithms for adaptive estimation of the average treatment effect. In ICML, 2025.
[42] J. Neyman. On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society, 97(4):558–625, 1934.
[43] G. Noarov, R. Fogliato, M. A. Bertran, and A. Roth. Stronger Neyman regret guarantees for adaptive experimental design. In ICML, 2025.
[44] M. Oprescu, B. M. Cho, and N. Kallus. Efficient adaptive experimentation with noncompliance. In NeurIPS, 2025.
[45] A. Ramdas, P. Grünwald, V. Vovk, and G. Shafer. Game-theoretic statistics and safe anytime-valid inference. arXiv preprint, arXiv:2210.01948, 2022.
[46] D. S. Robertson, K. M. Lee, B. C. López-Kolkovska, and S. S. Villar. Response-adaptive randomization in clinical trials: From myths to practical considerations. Statistical Science, 38(2), 2023.
[47] J. M. Robins. Robust estimation in sequentially ignorable missing data and causal inference models. In Proceedings of the American Statistical Association, volume 1999, pages 6–10. Indianapolis, IN, 2000.
[48] W. F. Rosenberger and O. Sverdlov. Handling covariates in the design of clinical trials. Statistical Science, 23(3):404–419, 2008.
[49] W. F. Rosenberger, A. N. Vidyashankar, and D. K. Agarwal. Covariate-adjusted response-adaptive designs for binary response. Journal of Biopharmaceutical Statistics, 11(4):227–236, 2001.
[50] D. Rubin and M. J. van der Laan. A doubly robust censoring unbiased transformation. The International Journal of Biostatistics, 3(1):Article 4, 2007.
[51] D. B. Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5):688, 1974.
[52] D. E. Schaubel and G. Wei. Double inverse-weighted estimation of cumulative treatment effects under nonproportional hazards and dependent censoring. Biometrics, 67(1):29–38, 2011.
[53] S. Schrod, A. Schäfer, S. Solbrig, R. Lohmayer, W. Gronwald, P. J. Oefner, T. Beißbarth, R. Spang, H. U. Zacharias, and M. Altenbuchinger. BITES: balanced individual treatment effect for survival data. Bioinformatics, 38(Supplement_1):i60–i67, 2022.
[54] S. Tabib and D. Larocque. Non-parametric individual treatment effect estimation for survival data with random forests. Bioinformatics (Oxford, England), 36(2):629–636, 2020.
[55] W. R. Thompson. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika, 25(3/4):285–294, 1933.
[56] M. J. van der Laan. The construction and analysis of adaptive group sequential designs. U.C. Berkeley Division of Biostatistics Working Paper Series, 2008.
[57] M. J. van der Laan and J. M. Robins. Unified Methods for Censored Longitudinal Data and Causality. Springer Series in Statistics. Springer, New York, NY, 2003.
[58]

M. J. van der Laan and D. Rubin. Targeted maximum likelihood learning.The International Journal of Biostatistics, 2(1), 2006

work page 2006
[59]

S. Wager. Causal inference: A statistical learning approach.Preprint, 2024

work page 2024
[60]

Waudby-Smith, D

I. Waudby-Smith, D. Arbour, R. Sinha, E. H. Kennedy, and A. Ramdas. Time-uniform central limit theory and asymptotic confidence sequences.The Annals of Statistics, 52(6):2613–2640, 2024

work page 2024
[61]

Westling, A

T. Westling, A. Luedtke, P. B. Gilbert, and M. Carone. Inference for treatment-specific survival curves using machine learning.Journal of the American Statistical Association, 119(546): 1541–1553, 2024

work page 2024
[62]

Wiegrebe, P

S. Wiegrebe, P. Kopper, R. Sonabend, B. Bischl, and A. Bender. Deep learning for survival analysis: A review.Artificial Intelligence Review, 57(3):65, 2024

work page 2024
[63]

Wouter A

M. Wouter A. C., van Amsterdamand Oberst, J. Feng, J. Wiens, S. Tang, S. Joshi, R. Ranganath, M. Sendak, U. Shalit, J. E. V ogt, B. Beaulieu-Jones, M. Mamdani, D. Kent, P. J. Heagerty, T. R. Fleming, and A. Goldenberg. Clinical trials for continuously monitored and updated AI systems. Nature Medicine, pages 1–3, 2026. 12

work page 2026
[64]

S. Xu, R. Cobzaru, S. N. Finkelstein, R. E. Welsch, K. Ng, and Z. Shahn. Estimating heterogeneous treatment effects on survival outcomes using counterfactual censoring unbiased transformations. arXiv preprint, arXiv:2401.11263, 2024.

[65]

Y. Xu, N. Ignatiadis, E. Sverdrup, S. Fleming, S. Wager, and N. Shah. Treatment heterogeneity for survival outcomes. arXiv preprint, arXiv:2207.07758, 2022.

[66]

K. Zhang, L. Janson, and S. Murphy. Statistical inference with M-estimators on adaptively collected data. In NeurIPS, 2021.

[67]

L.-X. Zhang, F. Hu, S. H. Cheung, and W. S. Chan. Asymptotic properties of covariate-adjusted response-adaptive designs. The Annals of Statistics, 35(3):1166–1182, 2007.

[68]

W. Zhang and M. van der Laan. Efficient statistical estimation for sequential adaptive experiments with implications for adaptive designs. arXiv preprint, arXiv:2508.09135, 2025.
[69]

W. Zhang, T. D. Le, L. Liu, Z.-H. Zhou, and J. Li. Mining heterogeneous causal effects for personalized cancer treatment. Bioinformatics, 33(15):2372–2378, 2017.

A Notation

Table 1: Notation

Symbol    Description
$X$       Observed covariates, $X \in \mathcal{X} \subseteq \mathbb{R}^p$
$A$       Binary treatment, $A \in \{0, 1\}$
$T$       Event time of interest (e.g., overall survival time (OS), disease-free survival time (...
[71]

For $k = 0, 1, 2, \ldots$:

(a) Estimate $\Sigma_{\mathrm{eff}}(\pi^{(k)})$ using the accumulated data $\mathcal{H}_{r-1}$ or the population formula if available.

(b) Update
$$\tilde{\pi}^{(k+1)}(x) = \mathrm{clip}_{\alpha}\!\left( \frac{\sqrt{\operatorname{tr}\big(\Sigma_{\mathrm{eff}}(\pi^{(k)})^{-1}\Sigma_1(x)\big)}}{\sqrt{\operatorname{tr}\big(\Sigma_{\mathrm{eff}}(\pi^{(k)})^{-1}\Sigma_1(x)\big)} + \sqrt{\operatorname{tr}\big(\Sigma_{\mathrm{eff}}(\pi^{(k)})^{-1}\Sigma_0(x)\big)}} \right). \tag{26}$$

(c) Set $\pi^{(k+1)} \leftarrow \tilde{\pi}^{(k+1)}$ and stop when the iterates stabilize.

D.2 E-optimality

E-optimality minimizes the largest eigenvalue o...
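The D-optimal fixed-point iteration in steps (a) to (c) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: covariates are taken to fall into finitely many strata with known weights, $\mathrm{clip}_{\alpha}$ is assumed to truncate to $[\alpha, 1-\alpha]$, and the policy-dependent part of $\Sigma_{\mathrm{eff}}(\pi)$ is assumed to be $\sum_x w(x)\,[\Sigma_1(x)/\pi(x) + \Sigma_0(x)/(1-\pi(x))]$. The names `sigma_eff` and `d_optimal_policy` are hypothetical.

```python
import numpy as np

def sigma_eff(pi, sigma1, sigma0, weights):
    """Assumed policy-dependent part of the efficiency bound (hypothetical form)."""
    per_stratum = sigma1 / pi[:, None, None] + sigma0 / (1.0 - pi)[:, None, None]
    return np.einsum("s,sij->ij", weights, per_stratum)

def d_optimal_policy(sigma1, sigma0, weights, alpha=0.05, tol=1e-10, max_iter=500):
    """Fixed-point iteration mirroring update (26) over a finite set of strata."""
    pi = np.full(sigma1.shape[0], 0.5)  # pi^(0): balanced start (an assumption)
    for _ in range(max_iter):
        precision = np.linalg.inv(sigma_eff(pi, sigma1, sigma0, weights))
        # Q_a(x) = tr(Sigma_eff^{-1} Sigma_a(x)) for every stratum x
        q1 = np.einsum("ij,sji->s", precision, sigma1)
        q0 = np.einsum("ij,sji->s", precision, sigma0)
        ratio = np.sqrt(q1) / (np.sqrt(q1) + np.sqrt(q0))
        new_pi = np.clip(ratio, alpha, 1.0 - alpha)  # assumed meaning of clip_alpha
        if np.max(np.abs(new_pi - pi)) < tol:  # iterates have stabilized
            return new_pi
        pi = new_pi
    return pi  # convergence is not guaranteed; return the last iterate

# Illustrative inputs: two strata, two time horizons (made-up numbers).
sigma1 = np.array([[[4.0, 0.5], [0.5, 1.0]], [[1.0, 0.0], [0.0, 1.0]]])
sigma0 = np.array([[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.3], [0.3, 3.0]]])
weights = np.array([0.5, 0.5])
policy = d_optimal_policy(sigma1, sigma0, weights)
```

In practice the per-stratum matrices would be replaced by estimates from the accumulated data, as step (a) describes.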
[72]

Initialize $\pi^{(0)}$ (e.g., the A-optimal policy from Proposition 4.3).
[73]

For $k = 0, 1, 2, \ldots$:

(a) Estimate $\Sigma_{\mathrm{eff}}(\pi^{(k)})$ from accumulated data $\mathcal{H}_{r-1}$.

(b) Compute the leading eigenvector $v^{(k)} = \arg\max_{\|v\|=1} v^{\top}\Sigma_{\mathrm{eff}}(\pi^{(k)})\,v$.

(c) Update
$$\tilde{\pi}^{(k+1)}(x) = \mathrm{clip}_{\alpha}\!\left( \frac{\sqrt{(v^{(k)})^{\top}\Sigma_1(x)\,v^{(k)}}}{\sqrt{(v^{(k)})^{\top}\Sigma_1(x)\,v^{(k)}} + \sqrt{(v^{(k)})^{\top}\Sigma_0(x)\,v^{(k)}}} \right). \tag{29}$$

(d) Set $\pi^{(k+1)} \leftarrow \tilde{\pi}^{(k+1)}$ and stop when the iterates stabilize.

D.3 Comparison of A-, D-, and E-optimality

All three ...
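The E-optimal iteration differs from the D-optimal one only in how $Q_a(x)$ is formed: through the leading eigenvector rather than the inverse of $\Sigma_{\mathrm{eff}}$. The sketch below makes the same assumptions as before (finite strata with known weights, $\mathrm{clip}_{\alpha}$ read as truncation to $[\alpha, 1-\alpha]$, and an assumed form for the policy-dependent part of $\Sigma_{\mathrm{eff}}$); `e_optimal_policy` is a hypothetical name.

```python
import numpy as np

def e_optimal_policy(sigma1, sigma0, weights, alpha=0.05, tol=1e-10, max_iter=500):
    """Fixed-point iteration mirroring update (29) over a finite set of strata."""
    pi = np.full(sigma1.shape[0], 0.5)  # pi^(0): balanced start (an assumption)
    for _ in range(max_iter):
        # Assumed form: sum_x w(x) [Sigma_1(x)/pi(x) + Sigma_0(x)/(1 - pi(x))]
        per_stratum = sigma1 / pi[:, None, None] + sigma0 / (1.0 - pi)[:, None, None]
        sigma_eff = np.einsum("s,sij->ij", weights, per_stratum)
        # eigh returns eigenvalues in ascending order, so the last column leads
        _, eigvecs = np.linalg.eigh(sigma_eff)
        v = eigvecs[:, -1]
        # Q_a(x) = v^T Sigma_a(x) v; assumes these are strictly positive
        q1 = np.einsum("i,sij,j->s", v, sigma1, v)
        q0 = np.einsum("i,sij,j->s", v, sigma0, v)
        ratio = np.sqrt(q1) / (np.sqrt(q1) + np.sqrt(q0))
        new_pi = np.clip(ratio, alpha, 1.0 - alpha)
        if np.max(np.abs(new_pi - pi)) < tol:
            return new_pi
        pi = new_pi
    return pi  # convergence is not guaranteed; return the last iterate

# One stratum whose variance is concentrated on the first horizon (made-up numbers).
sigma1 = np.array([[[4.0, 0.0], [0.0, 1.0]]])
sigma0 = np.array([[[1.0, 0.0], [0.0, 1.0]]])
policy = e_optimal_policy(sigma1, sigma0, np.array([1.0]))
```

Here the leading eigenvector stays on the first horizon, so the policy settles at $\sqrt{4}/(\sqrt{4}+\sqrt{1}) = 2/3$.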
[74]

A-optimality uses $Q_a(x) = \operatorname{tr}(\Sigma_a(x)) = V_a(x)$ (sum of marginal variances, closed-form).
[75]

D-optimality uses $Q_a(x) = \operatorname{tr}(\Sigma_{\mathrm{eff}}^{-1}\Sigma_a(x))$ (precision-weighted trace, fixed-point).
[76]

... $\Big(\frac{A - \pi(X)}{\pi(X)(1-\pi(X))}\Big)^{2} U_t^{2} \,\Big|\, X \Big] = \pi(X)\,\mathbb{E}$ ...

E-optimality uses $Q_a(x) = (v^{\star})^{\top}\Sigma_a(x)\,v^{\star}$ (worst-case directional variance, fixed-point).

When $t_{\max} = 0$, $v^{\star} = 1$ and, therefore, all three criteria coincide. The A-optimal criterion is the only one that separates additively over $t$, which is why it yields a closed-form solution; D- and E-optimal criteria couple all horizons jointly through $\Sigma_{\mathrm{eff}}$ and its eigenve...
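To see why the three criteria coincide with a single horizon, it helps to write the scalar case out. The short derivation below assumes the allocation template $\pi(x) = \sqrt{Q_1(x)}/(\sqrt{Q_1(x)} + \sqrt{Q_0(x)})$ before clipping, as in updates (26) and (29); the superscripts A, D, and E are labels introduced here for illustration.

```latex
% Scalar case: \Sigma_a(x) = V_a(x) and \Sigma_{\mathrm{eff}} = c > 0.
\begin{align*}
Q_a^{A}(x) &= \operatorname{tr}(\Sigma_a(x)) = V_a(x), \\
Q_a^{D}(x) &= \operatorname{tr}(\Sigma_{\mathrm{eff}}^{-1}\Sigma_a(x)) = V_a(x)/c, \\
Q_a^{E}(x) &= (v^{\star})^{\top}\Sigma_a(x)\,v^{\star} = V_a(x) \quad (v^{\star} = 1), \\
\pi(x) &= \frac{\sqrt{Q_1(x)}}{\sqrt{Q_1(x)} + \sqrt{Q_0(x)}}
        = \frac{\sqrt{V_1(x)}}{\sqrt{V_1(x)} + \sqrt{V_0(x)}}
        \quad \text{for all three criteria.}
\end{align*}
```

The common factor $c^{-1/2}$ in the D-optimal case cancels between numerator and denominator, so every criterion reduces to the same Neyman-type ratio.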
[77]

(Martingale difference sequence) $\{z_{t,r}\}_{r=1}^{R}$ is a martingale difference sequence; that is, $\mathbb{E}[z_{t,r} \mid \mathcal{H}_{r-1}] = 0$ for all $r \in [1, R]$.
[78]

(Conditional variance convergence) There exists a constant $V_{\mathrm{eff},t}(\pi) > 0$ such that
$$\frac{1}{R}\sum_{r=1}^{R} \mathbb{E}[z_{t,r}^{2} \mid \mathcal{H}_{r-1}] \xrightarrow{p} V_{\mathrm{eff},t}(\pi), \tag{46}$$
[79]

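... $\sum_{i=0}^{t} \frac{\mathbf{1}(\tilde{T} = i)\,\mathbf{1}(\Delta = 1) - \mathbf{1}(\tilde{T} \ge i)\,\lambda^{s}_{i}(X_r, A_r)}{S_i(X_r, A_r)\,G_{i-1}(X_r, A_r)} \,\Big|\, A_r, X_r, \mathcal{H}_{r-1}\Big] \quad (50) \quad = \sum_{i=0}^{t} \mathbb{E}$ ...

(Lindeberg condition) For every $\varepsilon > 0$,
$$\frac{1}{R}\sum_{r=1}^{R} \mathbb{E}\!\left[ z_{t,r}^{2}\,\mathbf{1}\{|z_{t,r}| > \varepsilon\sqrt{R}\} \,\middle|\, \mathcal{H}_{r-1} \right] \xrightarrow{p} 0. \tag{47}$$

Then, $\sqrt{R}\,\bar{z}_{t,r} \xrightarrow{d} \mathcal{N}(0, V_{\mathrm{eff},t}(\pi))$.

The first step is to prove $\mathbb{E}[z_{t,r} \mid \mathcal{H}_{r-1}] = 0$:
$$\mathbb{E}[z_{t,r} \mid \mathcal{H}_{r-1}] = \mathbb{E}\!\left[ S_t(X_r, 1) - S_t(X_r, 0) - \frac{A_r - \pi_r(X_r)}{\pi_r(X_r)\{1 - \pi_r(X_r)\}}\,\xi(H_r, \eta_t)\,S_t(X_r, A_r) \,\middle|\, \mathcal{H}_{r-1} \right] - \tau_t$$
$= \tau_t - \tau_t - \mathbb{E}\,\mathbb{E}\,\frac{A_r - \pi_r(X_r)}{\pi_r(X_r)\{1 - \pi_r(X_r)\}}\,\xi(H_r, \eta)\,S_t(X$...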
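The mean-zero property rests on each hazard-martingale increment $\mathbf{1}(\tilde{T} = i)\,\mathbf{1}(\Delta = 1) - \mathbf{1}(\tilde{T} \ge i)\,\lambda_i$ having expectation zero. A small exact enumeration can illustrate this. The sketch assumes discrete time, censoring independent of the event time (within a fixed treatment and covariate stratum), and the tie convention $\tilde{T} = \min(T, C)$, $\Delta = \mathbf{1}(T \le C)$; these conventions and the name `increment_means` are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def increment_means(event_pmf, censor_pmf):
    """Exact E[1(T~=i)1(Delta=1) - 1(T~>=i) * lambda_i] for each time i."""
    event_pmf = np.asarray(event_pmf, dtype=float)
    censor_pmf = np.asarray(censor_pmf, dtype=float)
    horizon = len(event_pmf)
    at_risk = np.cumsum(event_pmf[::-1])[::-1]  # P(T >= i)
    hazard = event_pmf / at_risk                # lambda_i = P(T = i | T >= i)
    means = np.zeros(horizon)
    # Enumerate every (event time, censoring time) pair with its probability.
    for t, p_t in enumerate(event_pmf):
        for c, p_c in enumerate(censor_pmf):
            t_obs = min(t, c)        # observed time T~
            delta = float(t <= c)    # event indicator (assumed tie convention)
            for i in range(horizon):
                increment = float(t_obs == i) * delta - float(t_obs >= i) * hazard[i]
                means[i] += p_t * p_c * increment
    return means

# Made-up event and censoring distributions on times 0..3.
means = increment_means([0.2, 0.3, 0.1, 0.4], [0.1, 0.2, 0.3, 0.4])
```

Every entry of `means` is zero up to floating-point error, so any weighted sum of these increments, such as the one divided by $S_i G_{i-1}$ in the proof, also has conditional mean zero.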
[80]

1. Asymptotic confidence sequence: there exists an exact (potentially unknown) $(1-\alpha)$ confidence sequence $C^{\star}_{t,r} = [L^{\star}_{t,r}, U^{\star}_{t,r}]_{t \ge 1}$ such that $\tilde{L}_{t,r}/L^{\star}_{t,r} \to 1$ and $\tilde{U}_{t,r}/U^{\star}_{t,r} \to 1$ almost surely.

2. Asymptotic time-uniform coverage: $\lim_{R_0 \to \infty} \mathbb{P}(\forall r \ge R_0 : \tau_t \in \tilde{C}_{t,r}) \ge 1 - \alpha$.

Definition H.1 can be read as follows: if one waits to “peek” until the sample size is suf...
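For intuition, one widely used asymptotic confidence sequence is the Gaussian-mixture boundary of Waudby-Smith et al. [60], $\hat{\mu}_r \pm \hat{\sigma}_r \sqrt{\tfrac{2(r\rho^2 + 1)}{r^2\rho^2}\log\big(\tfrac{\sqrt{r\rho^2 + 1}}{\alpha}\big)}$. The sketch below applies it to a generic stream of scores. It is an illustration only: the paper's own construction may differ, the plug-in running variance is a simplifying assumption, and `asymptotic_cs` and the tuning parameter `rho` are hypothetical names and defaults.

```python
import numpy as np

def asymptotic_cs(scores, alpha=0.05, rho=0.1):
    """Running Gaussian-mixture asymptotic confidence sequence for the mean."""
    scores = np.asarray(scores, dtype=float)
    r = np.arange(1, len(scores) + 1, dtype=float)
    running_mean = np.cumsum(scores) / r
    # Plug-in running variance (an assumption; clipped at zero for stability).
    running_var = np.maximum(np.cumsum(scores**2) / r - running_mean**2, 0.0)
    scale = r * rho**2 + 1.0
    boundary = np.sqrt(2.0 * scale / (r**2 * rho**2) * np.log(np.sqrt(scale) / alpha))
    half_width = np.sqrt(running_var) * boundary
    return running_mean - half_width, running_mean + half_width

# Made-up score stream with true mean 0.3.
rng = np.random.default_rng(0)
scores = rng.normal(loc=0.3, scale=1.0, size=1000)
lower, upper = asymptotic_cs(scores)
```

The interval at the final step is wider than a fixed-sample 95% CLT interval, which is the price paid for validity at every sufficiently late peek.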

[1]

A. C. Atkinson, A. N. Donev, and R. Tobias. Optimum experimental designs, with SAS. Number 34 in Oxford Statistical Science Series. Oxford University Press, 2023.

[2]

A. Barker, C. Sigman, G. Kelloff, N. Hylton, D. Berry, and L. Esserman. I-SPY 2: An adaptive breast cancer trial design in the setting of neoadjuvant chemotherapy. Clinical Pharmacology & Therapeutics, 86(1):97–100, 2009.

[3]

D. A. Berry. Adaptive clinical trials: The promise and the caution. Journal of Clinical Oncology, 29(6):606–609, 2011.

[4]

S. M. Berry, B. P. Carlin, J. J. Lee, and P. Muller. Bayesian Adaptive Methods for Clinical Trials. CRC Press, 2010.

[5]

W. Cai and M. J. van der Laan. One-step targeted maximum likelihood estimation for time-to-event outcomes. Biometrics, 76(3):722–733, 2020.

[6]

E. Candès, L. Lei, and Z. Ren. Conformalized survival analysis. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85(1):24–45, 2023.

[7]

F. Chen, S. Ge, J. Qian, and C. Harshaw. Sigmoid-FTRL: Design-based adaptive Neyman allocation for AIPW estimators. arXiv preprint, arXiv:2511.19905, 2025.

[8]

S.-C. Chow and M. Chang. Adaptive design methods in clinical trials – a review. Orphanet Journal of Rare Diseases, 3:11, 2008.

[9]

T. F. Cloughesy, B. M. Alexander, D. A. Berry, H. Colman, J. F. De Groot, B. M. Ellingson, G. B. Gordon, M. Khasraw, A. B. Lassman, E. Q. Lee, M. Lim, I. K. Mellinghoff, A. Nelli, J. R. Perry, E. P. Sulman, K. Tanner, M. Weller, P. Y. Wen, W. K. A. Yung, and GBM AGILE Investigators. GBM AGILE: A global, phase 2/3 adaptive platform trial to evaluate multi...

[10]

T. Cook, A. Mishler, and A. Ramdas. Semiparametric efficient inference in adaptive experiments. In Proceedings of the Third Conference on Causal Learning and Reasoning, pages 1033–1064. PMLR, 2024.

[11]

Y. Cui, M. R. Kosorok, E. Sverdrup, S. Wager, and R. Zhu. Estimating heterogeneous treatment effects with right-censored data via causal survival forests. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85(2):179–211, 2023.

[12]

A. Curth and M. van der Schaar. Nonparametric estimation of heterogeneous treatment effects: From theory to learning algorithms. arXiv preprint, arXiv:2101.10943, 2021.

[13]

J. Dai, P. Gradu, and C. Harshaw. CLIP-OGD: An experimental design for adaptive Neyman allocation in sequential experiments. In NeurIPS, 2023.

[14]

A. Dalal, P. Blöbaum, S. Kasiviswanathan, and A. Ramdas. Anytime-valid inference for double/debiased machine learning of causal parameters. arXiv preprint, arXiv:2408.09598, 2024.

[15]

H. Davidov, S. Feldman, G. Shamai, R. Kimmel, and Y. Romano. Conformalized survival analysis for general right-censored data. In ICLR, 2025.

[16]

A. Dvoretzky. Asymptotic normality for sums of dependent random variables. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 2: Probability Theory, volume 6.2, pages 513–536. University of California Press, 1972.

[17]

D. Frauen, M. Schröder, K. Hess, and S. Feuerriegel. Orthogonal survival learners for estimating heterogeneous treatment effects from time-to-event data. In NeurIPS, 2025.

[18]

Z. Gao and T. Hastie. Estimating heterogeneous treatment effects for general responses. arXiv preprint, arXiv:2103.04277, 2022.

[19]

Y. Gui, R. Hore, Z. Ren, and R. F. Barber. Conformalized survival analysis with adaptive cut-offs. Biometrika, 111(2):459–477, 2024.

[20]

F. Guidance. Adaptive design clinical trials for drugs and biologics. Biotechnol Law Rep, 29(2):173, 2010.

[21]

J. Hahn, K. Hirano, and D. Karlan. Adaptive experimental design using the propensity score. Journal of Business & Economic Statistics, 29(1):96–108, 2011.

[22]

N. C. Henderson, T. A. Louis, G. L. Rosner, and R. Varadhan. Individualized treatment effects with censored data via fully nonparametric Bayesian accelerated failure time models. Biostatistics, 21(1):50–68, 2020.

[23]

S. R. Howard, A. Ramdas, J. McAuliffe, and J. Sekhon. Time-uniform Chernoff bounds via nonnegative supermartingales. Probability Surveys, 17, 2020.

[24]

F. Hu and W. F. Rosenberger. The Theory of Response-Adaptive Randomization in Clinical Trials. John Wiley & Sons, 2006.

[25]

L. Hu, J. Ji, and F. Li. Estimating heterogeneous survival treatment effect in observational data using machine learning. Statistics in Medicine, 40(21):4691–4713, 2021.

[26]

G. W. Imbens. Nonparametric estimation of average treatment effects under exogeneity: A review. The Review of Economics and Statistics, 86(1):4–29, 2004.

[27]

G. W. Imbens and D. B. Rubin. Causal inference in statistics, social, and biomedical sciences. Cambridge University Press, 2015.

[28]

M. Kato, A. Oga, W. Komatsubara, and R. Inokuchi. Active adaptive experimental design for treatment effect estimation with covariate choices. arXiv preprint, arXiv:2403.03589, 2024.

[29]

J. L. Katzman, U. Shaham, A. Cloninger, J. Bates, T. Jiang, and Y. Kluger. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Medical Research Methodology, 18(1):24, 2018.

[30]

E. Kaufmann and A. Garivier. Learning the distribution with largest mean: two bandit frameworks. ESAIM: Proceedings and Surveys, 60:114–131, 2017.

[31]

E. S. Kim, R. S. Herbst, I. I. Wistuba, J. J. Lee, G. R. Blumenschein, A. Tsao, D. J. Stewart, M. E. Hicks, J. Erasmus, S. Gupta, C. M. Alden, S. Liu, X. Tang, F. R. Khuri, H. T. Tran, B. E. Johnson, J. V. Heymach, L. Mao, F. Fossella, M. S. Kies, V. Papadimitrakopoulou, S. E. Davis, S. M. Lippman, and W. K. Hong. The BATTLE trial: Personalizing therapy...

[32]

J. P. Klein and M. L. Moeschberger. Survival Analysis: Techniques for Censored and Truncated Data. Statistics for Biology and Health. Springer, 2003.

[33]

T. Lattimore and C. Szepesvári. Bandit Algorithms. Cambridge University Press, 1st edition, 2020.

[34]

J. Lee and C. Ma. Off-policy estimation with adaptively collected data: the power of online learning. In NeurIPS, 2024.

[35]

J. Li, D. Simchi-Levi, and Y. Zhao. Optimal adaptive experimental design for estimating treatment effect. arXiv preprint, arXiv:2410.05552, 2024.

[36]

M. Lindon and N. Kallus. Anytime-valid inference under outcome delay: A design-based approach. arXiv preprint, arXiv:2603.25971, 2026.

[37]

C. Louizos, U. Shalit, J. Mooij, D. Sontag, R. Zemel, and M. Welling. Causal effect inference with deep latent-variable models. In NeurIPS, 2017.

[38]

H. Mao, L. Li, W. Yang, and Y. Shen. On the propensity score weighting analysis with survival outcome: Estimands, estimation, and inference. Statistics in Medicine, 37(26):3745–3763, 2018.

[39]

A. Mukherjee, S. Jana, and S. Coad. Covariate-adjusted response-adaptive designs for semiparametric survival models. Statistical Methods in Medical Research, 34(9):1697–1723, 2025.

[40]

O. Neopane, A. Ramdas, and A. Singh. Logarithmic Neyman regret for adaptive estimation of the average treatment effect. arXiv preprint, 2024.

[41]

O. Neopane, A. Ramdas, and A. Singh. Optimistic algorithms for adaptive estimation of the average treatment effect. In ICML, 2025.

[42]

J. Neyman. On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society, 97(4):558–625, 1934.

[43]

G. Noarov, R. Fogliato, M. A. Bertran, and A. Roth. Stronger Neyman regret guarantees for adaptive experimental design. In ICML, 2025.

[44]

M. Oprescu, B. M. Cho, and N. Kallus. Efficient adaptive experimentation with noncompliance. In NeurIPS, 2025.

[45]

A. Ramdas, P. Grünwald, V. Vovk, and G. Shafer. Game-theoretic statistics and safe anytime-valid inference. arXiv preprint, arXiv:2210.01948, 2022.

[46]

D. S. Robertson, K. M. Lee, B. C. López-Kolkovska, and S. S. Villar. Response-adaptive randomization in clinical trials: From myths to practical considerations. Statistical Science, 38(2), 2023.

[47]

J. M. Robins. Robust estimation in sequentially ignorable missing data and causal inference models. In Proceedings of the American Statistical Association, volume 1999, pages 6–10. Indianapolis, IN, 2000.

[48]

W. F. Rosenberger and O. Sverdlov. Handling covariates in the design of clinical trials. Statistical Science, 23(3):404–419, 2008.

[49]

W. F. Rosenberger, A. N. Vidyashankar, and D. K. Agarwal. Covariate-adjusted response-adaptive designs for binary response. Journal of Biopharmaceutical Statistics, 11(4):227–236, 2001.

[50]

D. Rubin and M. J. van der Laan. A doubly robust censoring unbiased transformation. The International Journal of Biostatistics, 3(1):Article 4, 2007.

[51]

D. B. Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5):688, 1974.

[52]

D. E. Schaubel and G. Wei. Double inverse-weighted estimation of cumulative treatment effects under nonproportional hazards and dependent censoring. Biometrics, 67(1):29–38, 2011.

[53]

S. Schrod, A. Schäfer, S. Solbrig, R. Lohmayer, W. Gronwald, P. J. Oefner, T. Beißbarth, R. Spang, H. U. Zacharias, and M. Altenbuchinger. BITES: balanced individual treatment effect for survival data. Bioinformatics, 38(Supplement_1):i60–i67, 2022.

[54]

S. Tabib and D. Larocque. Non-parametric individual treatment effect estimation for survival data with random forests. Bioinformatics (Oxford, England), 36(2):629–636, 2020.
