An Online Meta-Level Adaptive Design Framework with Targeted Learning Inference: Applications to Evaluating and Utilizing Surrogate Outcomes in Adaptive Designs

Aaron Hudson; Mark van der Laan; Maya Petersen; Wenxin Zhang

arxiv: 2408.02667 · v6 · submitted 2024-08-05 · 📊 stat.ME

Pith reviewed 2026-05-23 21:59 UTC · model grok-4.3

keywords adaptive designs · targeted maximum likelihood estimation · causal estimands · surrogate outcomes · online selection · clinical trials · heterogeneous treatment effects

The pith

A meta-level framework defines new causal estimands for adaptive designs and supplies TMLE estimators that support online selection while handling dependence without parametric models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to create a framework that lets experimenters evaluate several candidate adaptive designs in real time as data arrive and switch among them based on performance. It introduces a new class of causal estimands that measure how well each design would have performed, including when designs use surrogate outcomes to update randomization. Targeted maximum likelihood estimators are then derived for these estimands. The estimators are shown to be asymptotically normal under the specific dependence created by adaptive randomization and without any parametric assumptions on the data process. This setup is illustrated with surrogate-based designs, where it quantifies how much each surrogate speeds detection of treatment effect heterogeneity and improves participant outcomes.
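To make "how well each design would have performed" concrete, here is a minimal Python sketch of a much simpler relative of the paper's estimators: an inverse-probability-weighted (IPW) estimate of the mean outcome under a candidate design, computed from data gathered under a different, known randomization scheme. This is not the paper's TMLE. The function name `ipw_design_value` and its arguments are hypothetical, the standard error treats the weighted terms as martingale differences, and the sketch assumes the candidate's assignment probabilities are available for each participant, which glosses over the fact that an adaptive candidate's probabilities depend on data it would itself have collected.

```python
import numpy as np

def ipw_design_value(outcomes, logged_probs, candidate_probs):
    """IPW estimate of the mean outcome under a candidate design (illustrative only).

    outcomes        : observed outcome for each participant
    logged_probs    : probability with which the design actually run assigned
                      each participant's observed arm (known by construction)
    candidate_probs : probability the candidate design would have assigned
                      that same arm to that participant (assumed available)
    """
    y = np.asarray(outcomes, dtype=float)
    g_logged = np.asarray(logged_probs, dtype=float)
    g_candidate = np.asarray(candidate_probs, dtype=float)
    if np.any(g_logged <= 0):
        raise ValueError("positivity violated: logged probabilities must be > 0")
    # Reweight each outcome by how much more (or less) likely the candidate
    # design was to assign the arm that was actually observed.
    terms = g_candidate / g_logged * y
    estimate = terms.mean()
    std_error = terms.std(ddof=1) / np.sqrt(len(terms))
    return estimate, std_error

# Toy usage: four participants randomized 50/50, candidate favours the observed
# arm for participants 1, 3 and 4.
est, se = ipw_design_value([1, 0, 1, 1], [0.5] * 4, [0.8, 0.2, 0.8, 0.8])
```

An unnormalized IPW estimate can fall outside the outcome range in small samples, which is one reason targeted estimators are generally preferred in practice.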

Core claim

We define a new class of causal estimands to evaluate adaptive designs and propose Targeted Maximum Likelihood Estimators for these estimands. These estimators are asymptotically normal while accommodating dependence in adaptive-design data without parametric assumptions, enabling online selection among candidate designs. We further apply this framework to a motivating example where multiple surrogates of a long-term outcome are considered for updating randomization probabilities in adaptive experiments, comprehensively quantifying surrogates' utility to accelerate detection of heterogeneous treatment effects.

What carries the argument

New class of causal estimands for adaptive design evaluation, paired with Targeted Maximum Likelihood Estimators that remain asymptotically normal under adaptive randomization dependence without parametric models.

If this is right

Experimenters can perform real-time, data-driven selection among multiple candidate adaptive designs instead of committing to one in advance.
The utility of different surrogate outcomes for guiding randomization can be quantified directly in terms of faster detection of heterogeneous treatment effects and better participant outcomes.
Valid inference for design comparisons becomes available even though the data exhibit dependence induced by the adaptive process.
Dynamic updating of randomization probabilities can be guided by observed performance of each design without relying on strong parametric assumptions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same estimands and estimators could be applied to online A/B testing platforms outside clinical trials to compare candidate adaptive allocation rules.
Extensions might incorporate additional machine-learning predictors for the nuisance functions inside the TMLE while preserving the asymptotic guarantees.
Finite-sample behavior of the online selection procedure under varying degrees of adaptivity remains open for direct investigation.

Load-bearing premise

The TMLE estimators achieve asymptotic normality and valid inference under the specific dependence structure induced by adaptive randomization without requiring parametric models for the data-generating process.

What would settle it

A simulation or real adaptive trial in which the TMLE point estimates fail to converge at the expected rate or produce confidence intervals with incorrect coverage under realistic adaptive dependence would falsify the asymptotic normality result.
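A coverage check of the kind described here could be sketched as follows. This is a toy Python simulation under stated assumptions, not the paper's experiment: a two-arm Bernoulli trial whose randomization probability tilts toward the arm with the higher running mean (clipped to [0.1, 0.9]), a plain IPW estimator of arm 1's mean outcome in place of TMLE, and a Wald interval. The function name `simulate_coverage` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_coverage(n=400, reps=300, p=(0.4, 0.6), seed=0):
    """Fraction of replications whose 95% Wald interval covers arm 1's true mean."""
    rng = np.random.default_rng(seed)
    truth = p[1]
    covered = 0
    for _ in range(reps):
        successes = np.zeros(2)
        pulls = np.zeros(2)
        terms = np.empty(n)
        for i in range(n):
            # Running arm means, defaulting to 0.5 before an arm has any data.
            means = np.where(pulls > 0, successes / np.maximum(pulls, 1), 0.5)
            # Adaptive rule: favour the arm that currently looks better,
            # but keep both arms possible (positivity).
            g1 = float(np.clip(0.5 + (means[1] - means[0]), 0.1, 0.9))
            a = int(rng.random() < g1)
            y = float(rng.random() < p[a])
            successes[a] += y
            pulls[a] += 1
            # IPW term for the mean outcome under "always assign arm 1".
            terms[i] = (a == 1) * y / g1
        est = terms.mean()
        se = terms.std(ddof=1) / np.sqrt(n)
        covered += int(abs(est - truth) <= 1.96 * se)
    return covered / reps

coverage = simulate_coverage()
```

Coverage well below the nominal 95% in a study like this, persisting as `n` grows, would be the sort of evidence the paragraph above has in mind. A serious check would use the paper's estimators and realistic designs rather than this simplified rule.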

Figures

Figures reproduced from arXiv: 2408.02667 by Aaron Hudson, Mark van der Laan, Maya Petersen, Wenxin Zhang.

**Figure 2.** Frequency of candidate adaptive design selection by the Online Superlearner at each …

**Figure 3.** The left plot illustrates the frequency of candidate adaptive design selections by the …

**Figure 4.** Regret at time t for different design strategies in Scenario 1 (left) and Scenario 2 (right). This figure includes seven curves: five curves correspond to regrets from adaptive designs using outcomes from Y1 to Y5, respectively; one curve illustrates the regret for an adaptive design employing an Online Superlearner to evaluate and utilize surrogates; and the last curve represents the regret of a fixed des…

**Figure 5.** Estimated Conditional Average Treatment Effect (CATE) functions of Navigator v.s. …

Original abstract

Adaptive designs are increasingly used in clinical trials and online experiments to improve participant outcomes by dynamically updating treatment allocation as data accumulate. In practice, experimenters often consider multiple candidate designs, each with distinct trade-offs. However, typically only one design is implemented at a time, leaving benefits and costs of alternative designs unobserved and unquantified. To address this, we propose a novel meta-level adaptive design framework that enables real-time, data-driven evaluation and selection among candidate adaptive designs. Specifically, we define a new class of causal estimands to evaluate adaptive designs and propose Targeted Maximum Likelihood Estimators for these estimands. These estimators are asymptotically normal while accommodating dependence in adaptive-design data without parametric assumptions, enabling online selection among candidate designs. We further apply this framework to a motivating example where multiple surrogates of a long-term outcome are considered for updating randomization probabilities in adaptive experiments. Unlike existing surrogate evaluation methods, our approach comprehensively quantifies surrogates' utility to accelerate detection of heterogeneous treatment effects, expedite updates to treatment randomization, and improve participant outcomes, facilitating dynamic selection among surrogate-guided designs. Overall, our framework provides a unified approach for evaluating opportunities and costs of various adaptive designs and guiding real-time decision-making in adaptive experiments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper defines new causal estimands for real-time comparison of adaptive designs and applies TMLE to online selection with surrogates, but the asymptotic normality claim under data-dependent switching lacks visible support.

The core contribution is a meta-level framework that defines causal quantities to evaluate how well different adaptive designs would have performed, then uses TMLE to estimate them on the fly so an experimenter can switch among candidates, including those driven by different surrogates. This directly tackles the practical issue that only one design runs at a time and the performance of the others stays unknown. The surrogate application is a reasonable extension, showing how the framework can quantify whether a surrogate speeds up detection of heterogeneous effects and improves outcomes under adaptive randomization. That part feels grounded in the existing TMLE literature for dependent data.

The main soft spot is the central technical claim. The abstract asserts asymptotic normality and valid inference without parametric assumptions while accommodating the dependence from adaptive randomization, yet supplies no derivation, influence-function expansion, or simulation evidence. The stress-test point about online selection feeding estimates back into the design rule is therefore hard to dismiss; that step can turn the process into a regime-switching martingale whose remainder term may not vanish at the usual rate. If the full paper contains a clean proof that the additional dependence is controlled, the claim holds; otherwise the inference guarantee is unverified.

The work is aimed at causal-inference and clinical-trial methodologists who already use TMLE and adaptive designs. A reader focused on surrogate endpoints or multi-design evaluation would extract the framework and the surrogate utility quantification. It is worth sending to peer review because the problem is concrete and the method builds on established tools, even though the key asymptotic step requires referee-level checking of the details.
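For readers unfamiliar with the terms used in this note, the asymptotic normality claim can be written schematically as below. The notation is generic and assumed here for illustration (it is not taken from the paper): $\hat\psi_n$ is the estimator, $\psi_0$ the target, $O_i$ the data on participant $i$, $D_i$ an influence-function-type term that may depend on the design in force at step $i$, and $R_n$ the remainder.

```latex
\hat\psi_n - \psi_0
  = \frac{1}{n}\sum_{i=1}^{n} D_i(O_i) + R_n,
\qquad
E\left[ D_i(O_i) \,\middle|\, O_1,\dots,O_{i-1} \right] = 0,
\\[6pt]
R_n = o_P\!\left(n^{-1/2}\right)
\ \text{and}\
\frac{1}{n}\sum_{i=1}^{n} E\left[ D_i(O_i)^2 \,\middle|\, O_1,\dots,O_{i-1} \right]
  \xrightarrow{\;P\;} \sigma^2
\ \Longrightarrow\
\sqrt{n}\,\bigl(\hat\psi_n - \psi_0\bigr) \xrightarrow{\;d\;} N(0,\sigma^2).
```

The first line says the leading term is a martingale average; the second says that a negligible remainder plus a stabilizing conditional variance (together with a Lindeberg-type condition, omitted above) lets a martingale central limit theorem deliver normality. The concern raised in the note is whether both conditions survive when running estimates also drive the choice of design.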

Referee Report

2 major / 1 minor

Summary. The paper proposes a meta-level adaptive design framework for real-time, data-driven evaluation and selection among multiple candidate adaptive designs. It defines a new class of causal estimands for this purpose and develops Targeted Maximum Likelihood Estimators (TMLE) asserted to be asymptotically normal while accommodating dependence in adaptive-design data without parametric assumptions; the framework is applied to surrogate-guided designs to quantify their utility for detecting heterogeneous treatment effects and improving outcomes.

Significance. If the asymptotic normality and valid inference results hold under the dependence induced by adaptive randomization and online selection, the work would supply a unified, non-parametric tool for comparing the benefits and costs of alternative adaptive designs in clinical trials and online experiments, extending TMLE theory to meta-level design evaluation and surrogate assessment.

major comments (2)

[Abstract] Abstract: the claim that the TMLE estimators 'are asymptotically normal while accommodating dependence in adaptive-design data without parametric assumptions, enabling online selection among candidate designs' is load-bearing for the entire framework, yet the manuscript provides no derivation, proof sketch, or simulation evidence that the influence-function remainder vanishes at the required rate once the same running estimates are used for data-dependent design switching.
[Abstract] Abstract (and motivating example paragraph): the online selection step among surrogate-guided designs creates a regime-switching or stopped process whose effect on the asymptotic expansion is not automatically controlled by standard TMLE results for fixed or slowly varying nuisances; no analysis addresses whether the additional layer of dependence preserves asymptotic linearity.

minor comments (1)

The abstract refers to 'a motivating example' and 'simulation results' implicitly through the framework's application, but no concrete numerical results, tables, or figures are described in the provided text to illustrate performance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments on our manuscript. We address each major comment below in a point-by-point manner and commit to revisions that strengthen the presentation of the asymptotic results without altering the core contributions.

Referee: [Abstract] Abstract: the claim that the TMLE estimators 'are asymptotically normal while accommodating dependence in adaptive-design data without parametric assumptions, enabling online selection among candidate designs' is load-bearing for the entire framework, yet the manuscript provides no derivation, proof sketch, or simulation evidence that the influence-function remainder vanishes at the required rate once the same running estimates are used for data-dependent design switching.

Authors: We appreciate the referee's emphasis on the need for transparent justification of the asymptotic claims. The full derivation establishing asymptotic normality of the TMLE under adaptive dependence, including control of the remainder term when nuisance estimates are updated online, appears in the theoretical results (Section 3) and supplementary proofs. To make this more accessible and directly responsive to the abstract claim, we will add a concise proof sketch as a new appendix subsection and include targeted simulation results demonstrating the rate conditions on the remainder. These additions will explicitly address the data-dependent design switching without requiring parametric assumptions. revision: yes
Referee: [Abstract] Abstract (and motivating example paragraph): the online selection step among surrogate-guided designs creates a regime-switching or stopped process whose effect on the asymptotic expansion is not automatically controlled by standard TMLE results for fixed or slowly varying nuisances; no analysis addresses whether the additional layer of dependence preserves asymptotic linearity.

Authors: The referee correctly notes that online selection among designs introduces an additional layer of dependence beyond standard adaptive randomization. Our meta-level causal estimands and the corresponding TMLE are constructed precisely to accommodate this by targeting the relevant functionals while allowing the design to depend on accumulating data at both levels; the asymptotic linearity follows from the conditions stated in our theorems (which bound the selection-induced variation). That said, we agree an explicit discussion of the regime-switching process would improve clarity. In revision we will insert a dedicated paragraph (with supporting lemma) in the methods section explaining why the additional dependence preserves asymptotic linearity under the stated regularity conditions. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation builds on external TMLE theory

The paper defines new causal estimands for evaluating adaptive designs and proposes TMLE estimators claimed to achieve asymptotic normality under adaptive dependence without parametric models. These steps extend established TMLE results rather than redefining targets in terms of the estimators themselves or fitting parameters that are then relabeled as predictions. No self-definitional, fitted-input, or self-citation-load-bearing reductions appear in the provided abstract and description; the framework is presented as applying prior TMLE machinery to a new meta-design setting. Self-citations to TMLE foundations are expected and do not bear the load of the central claim by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard causal inference assumptions plus the novel claim that TMLE remains asymptotically normal under adaptive dependence; no free parameters or invented entities are introduced in the abstract.

axioms (2)

domain assumption Standard causal assumptions (consistency, positivity, no unmeasured confounding) hold for the defined estimands in the adaptive setting.
Required for TMLE validity; invoked implicitly when claiming the estimators target causal quantities.
domain assumption The dependence structure induced by adaptive designs permits asymptotic normality of the TMLE without parametric modeling.
Central technical assumption enabling online inference and selection.
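The first axiom bundles three standard conditions. In generic notation assumed here for illustration (not the paper's own), with covariates $W_i$, treatment $A_i$, outcome $Y_i$, potential outcomes $Y_i(a)$, randomization probability $g_i$, and observed history $\bar{O}_{i-1} = (O_1,\dots,O_{i-1})$, they might be stated as:

```latex
\text{Consistency:}\quad Y_i = Y_i(A_i),
\\[4pt]
\text{Positivity:}\quad g_i\bigl(a \mid W_i, \bar{O}_{i-1}\bigr) \ge \delta > 0
  \ \text{ for every arm } a \text{ and every } i,
\\[4pt]
\text{No unmeasured confounding:}\quad
  Y_i(a) \perp A_i \ \big|\ W_i,\ \bar{O}_{i-1}.
```

In a trial where the experimenter sets $g_i$, the third condition holds by design and positivity can be enforced by bounding the randomization probabilities away from zero.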

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

CBARA: Covariate-Balanced-and-Adjusted Response-Adaptive Randomization
stat.ME 2026-04 unverdicted novelty 6.0

CBARA integrates response-adaptive and covariate-adaptive randomization via a new imbalance vector and pseudo-Markov framework to achieve covariate balance and consistent estimators without model correctness assumptions.
CBARA: Covariate-Balanced-and-Adjusted Response-Adaptive Randomization
stat.ME 2026-04 unverdicted novelty 5.0

CBARA integrates covariate-adaptive and response-adaptive randomization via a new imbalance vector and pseudo-Markov chain framework to achieve better covariate balance while preserving allocation consistency.

Reference graph

Works this paper leans on

69 extracted references · 69 canonical work pages · cited by 1 Pith paper

[1]

Atkinson, A., Biswas, A., and Pronzato, L. (2011). Covariate-balanced response-adaptive designs for clinical trials with continuous responses that target allocation probabilities. Technical report, Technical Report NI11042-DAE, Isaac Newton Institute for Mathematical …

work page 2011
[2]

and van der Laan, M

Benkeser, D. and van der Laan, M. (2016). The highly adaptive lasso estimator. In 2016 IEEE international conference on data science and advanced analytics (DSAA) , pages 689--696. IEEE

work page 2016
[3]

Bibaut, A., Dimakopoulou, M., Kallus, N., Chambaz, A., and van Der Laan, M. (2021). Post-contextual-bandit inference. Advances in neural information processing systems , 34:28548--28559

work page 2021
[4]

Bibaut, A. F. and van der Laan, M. J. (2019). Fast rates for empirical risk minimization over c adl ag functions with bounded sectional variation norm. arXiv preprint arXiv:1907.09244

work page arXiv 2019
[5]

Breiman, L. (2001). Random forests. Machine learning , 45:5--32

work page 2001
[6]

Brown, B. M. (1971). Martingale central limit theorems. The Annals of Mathematical Statistics , pages 59--66

work page 1971
[7]

Bubeck, S., Cesa-Bianchi, N., et al. (2012). Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning , 5(1):1--122

work page 2012
[8]

Buyse, M., Molenberghs, G., Burzykowski, T., Renard, D., and Geys, H. (2000). The validation of surrogate endpoints in meta-analyses of randomized experiments. Biostatistics , 1(1):49--67

work page 2000
[9]

and van der Laan, M

Chambaz, A. and van der Laan, M. J. (2011). Targeting the optimal design in randomized clinical trials with binary outcomes and no covariate: simulation study. The International Journal of Biostatistics , 7(1)

work page 2011
[10]

and van der Laan, M

Chambaz, A. and van der Laan, M. J. (2014). Inference in targeted group-sequential covariate-adjusted randomized clinical trials. Scandinavian Journal of Statistics , 41(1):104--140

work page 2014
[11]

Chambaz, A., Zheng, W., and van der Laan, M. J. (2017). Targeted sequential design for targeted learning inference of the optimal treatment rule and its mean reward. Annals of statistics , 45(6):2537

work page 2017
[12]

and Guestrin, C

Chen, T. and Guestrin, C. (2016). Xgboost: A scalable tree boosting system. In Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining , pages 785--794

work page 2016
[13]

and Chang, M

Chow, S.-C. and Chang, M. (2008). Adaptive design methods in clinical trials--a review. Orphanet journal of rare diseases , 3(1):1--13

work page 2008
[14]

Daniels, M. J. and Hughes, M. D. (1997). Meta-analysis for the evaluation of potential surrogate markers. Statistics in medicine , 16(17):1965--1982

work page 1997
[15]

Duan, W., Ba, S., and Zhang, C. (2021). Online experimentation with surrogate metrics: Guidelines and a case study. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining , pages 193--201

work page 2021
[16]

Elliott, M. R. (2023). Surrogate endpoints in clinical trials. Annual Review of Statistics and its Application , 10:75--96

work page 2023
[17]

Frangakis, C. E. and Rubin, D. B. (2002). Principal stratification in causal inference. Biometrics , 58(1):21--29

work page 2002
[18]

S., Graubard, B

Freedman, L. S., Graubard, B. I., and Schatzkin, A. (1992). Statistical validation of intermediate endpoints for chronic diseases. Statistics in medicine , 11(2):167--178

work page 1992
[19]

H., Odeny, T

Geng, E. H., Odeny, T. A., Montoya, L. M., Iguna, S., Kulzer, J. L., Adhiambo, H. F., Eshun-Wilson, I., Akama, E., Nyandieka, E., Guz \'e , M. A., et al. (2023). Adaptive strategies for retention in care among persons living with hiv. NEJM evidence , 2(4):EVIDoa2200076

work page 2023
[20]

Gilbert, P. B. and Hudgens, M. G. (2008). Evaluating candidate principal surrogate endpoints. Biometrics , 64(4):1146--1154

work page 2008
[21]

D., Laan, M

Gill, R. D., Laan, M. J., and Wellner, J. A. (1995). Inefficient estimators of the bivariate survival function for three models. In Annales de l'IHP Probabilit \'e s et statistiques , volume 31, pages 545--597

work page 1995
[22]

A., Zhan, R., Wager, S., and Athey, S

Hadad, V., Hirshberg, D. A., Zhan, R., Wager, S., and Athey, S. (2021). Confidence intervals for policy evaluation in adaptive experiments. Proceedings of the national academy of sciences , 118(15):e2014602118

work page 2021
[23]

Y., Kennedy, E

Hsu, J. Y., Kennedy, E. H., Roy, J. A., Stephens-Shields, A. J., Small, D. S., and Joffe, M. M. (2015). Surrogate markers for time-varying treatments and outcomes. Clinical Trials , 12(4):309--316

work page 2015
[24]

and Rosenberger, W

Hu, F. and Rosenberger, W. F. (2006). The theory of response-adaptive randomization in clinical trials . John Wiley & Sons

work page 2006
[25]

Huang, X., Ning, J., Li, Y., Estey, E., Issa, J.-P., and Berry, D. A. (2009). Using short-term response information to facilitate adaptive randomization for survival clinical trials. Statistics in medicine , 28(12):1680--1689

work page 2009
[26]

Joffe, M. M. and Greene, T. (2009). Related causal frameworks for surrogate outcomes. Biometrics , 65(2):530--538

work page 2009
[27]

and Szepesv \'a ri, C

Lattimore, T. and Szepesv \'a ri, C. (2020). Bandit algorithms . Cambridge University Press

work page 2020
[28]

Lin, D., Fleming, T., and De Gruttola, V. (1997). Estimating the proportion of treatment effect explained by a surrogate marker. Statistics in medicine , 16(13):1515--1527

work page 1997
[29]

Luedtke, A. R. and van der Laan, M. J. (2016a). Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy. Annals of statistics , 44(2):713

work page
[30]

Luedtke, A. R. and van der Laan, M. J. (2016b). Super-learning of an optimal dynamic treatment rule. The international journal of biostatistics , 12(1):305--332

work page
[31]

Malenica, I., Bibaut, A., and van der Laan, M. J. (2021). Adaptive sequential design for a single time-series. arXiv preprint arXiv:2102.00102

work page arXiv 2021
[32]

R., van der Laan, M

Malenica, I., Coyle, J. R., van der Laan, M. J., and Petersen, M. L. (2024). Adaptive sequential surveillance with network and temporal dependence. Biometrics , 80(1):ujad007

work page 2024
[33]

M., Maystre, L., Lalmas, M., Russo, D., and Ciosek, K

McDonald, T. M., Maystre, L., Lalmas, M., Russo, D., and Ciosek, K. (2023). Impatient bandits: Optimizing recommendations for the long-term without delay. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining , pages 1687--1697

work page 2023
[34]

Montoya, L., van der Laan, M., Luedtke, A., Skeem, J., Coyle, J., and Petersen, M. (2021). The optimal dynamic treatment rule superlearner: considerations, performance, and application. arXiv preprint arXiv:2101.12326

work page arXiv 2021
[35]

Murphy, S. A. (2003). Optimal dynamic treatment regimes. Journal of the Royal Statistical Society Series B: Statistical Methodology , 65(2):331--355

work page 2003
[36]

Pearl, J. et al. (2000). Models, reasoning and inference. Cambridge, UK: CambridgeUniversityPress , 19(2):3

work page 2000
[37]

Prentice, R. L. (1989). Surrogate endpoints in clinical trials: definition and operational criteria. Statistics in medicine , 8(4):431--440

work page 1989
[38]

Richardson, L., Zito, A., Greaves, D., and Soriano, J. (2023). Pareto optimal proxy metrics. arXiv preprint arXiv:2307.01000

work page arXiv 2023
[39]

Robbins, H. (1952). Some aspects of the sequential design of experiments

work page 1952
[40]

S., Lee, K

Robertson, D. S., Lee, K. M., L \'o pez-Kolkovska, B. C., and Villar, S. S. (2023). Response-adaptive randomization in clinical trials: from myths to practical considerations. Statistical science: a review journal of the Institute of Mathematical Statistics , 38(2):185

work page 2023
[41]

Robins, J. (1986). A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Mathematical modelling , 7(9-12):1393--1512

work page 1986
[42]

a new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect

Robins, J. M. (1987). Addendum to “a new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect”. Computers & Mathematics with Applications , 14(9-12):923--945

work page 1987
[43]

Robins, J. M. and Greenland, S. (1992). Identifiability and exchangeability for direct and indirect effects. Epidemiology , 3(2):143--155

work page 1992
[44]

Rosenberger, W. F. and Lachin, J. M. (2015). Randomization in clinical trials: theory and practice . John Wiley & Sons

work page 2015
[45]

Rosenberger, W. F. and Sverdlov, O. (2008). Handling covariates in the design of clinical trials

work page 2008
[46]

F., Vidyashankar, A., and Agarwal, D

Rosenberger, W. F., Vidyashankar, A., and Agarwal, D. K. (2001). Covariate-adjusted response-adaptive designs for binary response. Journal of biopharmaceutical statistics , 11(4):227--236

work page 2001
[47]

and Wang, C

Simchi-Levi, D. and Wang, C. (2023). Multi-armed bandit experimental design: Online decision-making and adaptive inference. In International Conference on Artificial Intelligence and Statistics , pages 3086--3097. PMLR

work page 2023
[48]

N., Faries, D

Tamura, R. N., Faries, D. E., Andersen, J. S., and Heiligenstein, J. H. (1994). A case study of an adaptive clinical trial in the treatment of out-patients with depressive disorder. Journal of the American Statistical Association , pages 768--776

work page 1994
[49]

Thompson, W. R. (1933). On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika , 25(3-4):285--294

work page 1933
[50]

van de Geer, S. (1999). Applications of Empirical Process Theory . Cambridge series in statistical and probabilistic mathematics. Cambridge U.P

work page 1999
[51]

van der Laan, M. (2017). A generally efficient targeted minimum loss based estimator based on the highly adaptive lasso. The international journal of biostatistics , 13(2)

work page 2017
[52]

van der Laan, M. (2023). Higher order spline highly adaptive lasso estimators of functional parameters: Pointwise asymptotic normality and uniform convergence rates. arXiv preprint arXiv:2301.13354

work page arXiv 2023
[53]

van der Laan, M. J. (2006). Statistical inference for variable importance. The International Journal of Biostatistics , 2(1)

work page 2006
[54]

van der Laan, M. J. (2008). The construction and analysis of adaptive group sequential designs

work page 2008
[55]

van der Laan, M. J. (2015). A generally efficient targeted minimum loss based estimator

work page 2015
[56]

van der Laan, M. J. and Luedtke, A. R. (2015). Targeted learning of the mean outcome under an optimal dynamic treatment rule. Journal of causal inference , 3(1):61--95

work page 2015
[57]

van der Laan, M. J. and Malenica, I. (2018). Robust estimation of data-dependent causal effects based on observing a single time-series. arXiv preprint arXiv:1809.00734

work page arXiv 2018
[58]

J., Polley, E

van der Laan, M. J., Polley, E. C., and Hubbard, A. E. (2007). Super learner. Statistical applications in genetics and molecular biology , 6(1)

work page 2007
[59]

van der Laan, M. J. and Rose, S. (2018). Targeted learning in data science . Springer

work page 2018
[60]

J., Rose, S., et al

van der Laan, M. J., Rose, S., et al. (2011). Targeted learning: causal inference for observational and experimental data , volume 4. Springer

work page 2011
[61]

van der Laan, M. J. and Rubin, D. (2006). Targeted maximum likelihood learning. The international journal of biostatistics , 2(1)

work page 2006
[62]

van Handel, R. (2011). On the minimal penalty for markov order estimation. Probability theory and related fields , 150:709--738

work page 2011
[63]

VanderWeele, T. J. (2013). Surrogate measures and consistent surrogates. Biometrics , 69(3):561--565

work page 2013
[64]

Weir, C. J. and Taylor, R. S. (2022). Informed decision-making: Statistical methodology for surrogacy evaluation and its role in licensing and reimbursement assessments. Pharmaceutical Statistics , 21(4):740--756

work page 2022
[65]

Yang, J., Eckles, D., Dhillon, P., and Aral, S. (2023). Targeting for long-term outcomes. Management Science

work page 2023
[66]

A., and Athey, S

Zhan, R., Hadad, V., Hirshberg, D. A., and Athey, S. (2021). Off-policy evaluation via adaptive weighting with data from contextual bandits. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining , pages 2125--2135

work page 2021
[67]

H., and Chan, W

Zhang, L.-X., Hu, F., Cheung, S. H., and Chan, W. S. (2007). Asymptotic properties of covariate-adjusted response-adaptive designs

work page 2007
[68]

and van der Laan, M

Zheng, W. and van der Laan, M. J. (2010). Asymptotic theory for cross-validated targeted maximum likelihood estimation

work page 2010
[69]

Zhu, H. (2015). Covariate-adjusted response adaptive designs incorporating covariates with and without treatment interactions. Canadian Journal of Statistics , 43(4):534--553

work page 2015


work page 2002

[18] [18]

S., Graubard, B

Freedman, L. S., Graubard, B. I., and Schatzkin, A. (1992). Statistical validation of intermediate endpoints for chronic diseases. Statistics in medicine , 11(2):167--178

work page 1992

[19] [19]

H., Odeny, T

Geng, E. H., Odeny, T. A., Montoya, L. M., Iguna, S., Kulzer, J. L., Adhiambo, H. F., Eshun-Wilson, I., Akama, E., Nyandieka, E., Guz \'e , M. A., et al. (2023). Adaptive strategies for retention in care among persons living with hiv. NEJM evidence , 2(4):EVIDoa2200076

work page 2023

[20] [20]

Gilbert, P. B. and Hudgens, M. G. (2008). Evaluating candidate principal surrogate endpoints. Biometrics , 64(4):1146--1154

work page 2008

[21] [21]

D., Laan, M

Gill, R. D., Laan, M. J., and Wellner, J. A. (1995). Inefficient estimators of the bivariate survival function for three models. In Annales de l'IHP Probabilit \'e s et statistiques , volume 31, pages 545--597

work page 1995

[22] [22]

A., Zhan, R., Wager, S., and Athey, S

Hadad, V., Hirshberg, D. A., Zhan, R., Wager, S., and Athey, S. (2021). Confidence intervals for policy evaluation in adaptive experiments. Proceedings of the national academy of sciences , 118(15):e2014602118

work page 2021

[23] [23]

Y., Kennedy, E

Hsu, J. Y., Kennedy, E. H., Roy, J. A., Stephens-Shields, A. J., Small, D. S., and Joffe, M. M. (2015). Surrogate markers for time-varying treatments and outcomes. Clinical Trials , 12(4):309--316

work page 2015

[24] [24]

and Rosenberger, W

Hu, F. and Rosenberger, W. F. (2006). The theory of response-adaptive randomization in clinical trials . John Wiley & Sons

work page 2006

[25] [25]

Huang, X., Ning, J., Li, Y., Estey, E., Issa, J.-P., and Berry, D. A. (2009). Using short-term response information to facilitate adaptive randomization for survival clinical trials. Statistics in medicine , 28(12):1680--1689

work page 2009

[26] [26]

Joffe, M. M. and Greene, T. (2009). Related causal frameworks for surrogate outcomes. Biometrics , 65(2):530--538

work page 2009

[27] [27]

and Szepesv \'a ri, C

Lattimore, T. and Szepesv \'a ri, C. (2020). Bandit algorithms . Cambridge University Press

work page 2020

[28] [28]

Lin, D., Fleming, T., and De Gruttola, V. (1997). Estimating the proportion of treatment effect explained by a surrogate marker. Statistics in medicine , 16(13):1515--1527

work page 1997

[29] [29]

Luedtke, A. R. and van der Laan, M. J. (2016a). Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy. Annals of statistics , 44(2):713

work page

[30] [30]

Luedtke, A. R. and van der Laan, M. J. (2016b). Super-learning of an optimal dynamic treatment rule. The international journal of biostatistics , 12(1):305--332

work page

[31] [31]

Malenica, I., Bibaut, A., and van der Laan, M. J. (2021). Adaptive sequential design for a single time-series. arXiv preprint arXiv:2102.00102

work page arXiv 2021

[32] [32]

R., van der Laan, M

Malenica, I., Coyle, J. R., van der Laan, M. J., and Petersen, M. L. (2024). Adaptive sequential surveillance with network and temporal dependence. Biometrics , 80(1):ujad007

work page 2024

[33] [33]

M., Maystre, L., Lalmas, M., Russo, D., and Ciosek, K

McDonald, T. M., Maystre, L., Lalmas, M., Russo, D., and Ciosek, K. (2023). Impatient bandits: Optimizing recommendations for the long-term without delay. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining , pages 1687--1697

work page 2023

[34] [34]

Montoya, L., van der Laan, M., Luedtke, A., Skeem, J., Coyle, J., and Petersen, M. (2021). The optimal dynamic treatment rule superlearner: considerations, performance, and application. arXiv preprint arXiv:2101.12326

work page arXiv 2021

[35] [35]

Murphy, S. A. (2003). Optimal dynamic treatment regimes. Journal of the Royal Statistical Society Series B: Statistical Methodology , 65(2):331--355

work page 2003

[36] [36]

Pearl, J. et al. (2000). Models, reasoning and inference. Cambridge, UK: CambridgeUniversityPress , 19(2):3

work page 2000

[37] [37]

Prentice, R. L. (1989). Surrogate endpoints in clinical trials: definition and operational criteria. Statistics in medicine , 8(4):431--440

work page 1989

[38] [38]

Richardson, L., Zito, A., Greaves, D., and Soriano, J. (2023). Pareto optimal proxy metrics. arXiv preprint arXiv:2307.01000

work page arXiv 2023

[39] [39]

Robbins, H. (1952). Some aspects of the sequential design of experiments

work page 1952

[40] [40]

S., Lee, K

Robertson, D. S., Lee, K. M., L \'o pez-Kolkovska, B. C., and Villar, S. S. (2023). Response-adaptive randomization in clinical trials: from myths to practical considerations. Statistical science: a review journal of the Institute of Mathematical Statistics , 38(2):185

work page 2023

[41] [41]

Robins, J. (1986). A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Mathematical modelling , 7(9-12):1393--1512

work page 1986

[42] [42]

a new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect

Robins, J. M. (1987). Addendum to “a new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect”. Computers & Mathematics with Applications , 14(9-12):923--945

work page 1987

[43] [43]

Robins, J. M. and Greenland, S. (1992). Identifiability and exchangeability for direct and indirect effects. Epidemiology , 3(2):143--155

work page 1992

[44] [44]

Rosenberger, W. F. and Lachin, J. M. (2015). Randomization in clinical trials: theory and practice . John Wiley & Sons

work page 2015

[45] [45]

Rosenberger, W. F. and Sverdlov, O. (2008). Handling covariates in the design of clinical trials

work page 2008

[46] [46]

F., Vidyashankar, A., and Agarwal, D

Rosenberger, W. F., Vidyashankar, A., and Agarwal, D. K. (2001). Covariate-adjusted response-adaptive designs for binary response. Journal of biopharmaceutical statistics , 11(4):227--236

work page 2001

[47] [47]

and Wang, C

Simchi-Levi, D. and Wang, C. (2023). Multi-armed bandit experimental design: Online decision-making and adaptive inference. In International Conference on Artificial Intelligence and Statistics , pages 3086--3097. PMLR

work page 2023

[48] [48]

N., Faries, D

Tamura, R. N., Faries, D. E., Andersen, J. S., and Heiligenstein, J. H. (1994). A case study of an adaptive clinical trial in the treatment of out-patients with depressive disorder. Journal of the American Statistical Association , pages 768--776

work page 1994

[49] [49]

Thompson, W. R. (1933). On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika , 25(3-4):285--294

work page 1933

[50] [50]

van de Geer, S. (1999). Applications of Empirical Process Theory . Cambridge series in statistical and probabilistic mathematics. Cambridge U.P

work page 1999

[51] [51]

van der Laan, M. (2017). A generally efficient targeted minimum loss based estimator based on the highly adaptive lasso. The international journal of biostatistics , 13(2)

work page 2017

[52] [52]

van der Laan, M. (2023). Higher order spline highly adaptive lasso estimators of functional parameters: Pointwise asymptotic normality and uniform convergence rates. arXiv preprint arXiv:2301.13354

work page arXiv 2023

[53] [53]

van der Laan, M. J. (2006). Statistical inference for variable importance. The International Journal of Biostatistics , 2(1)

work page 2006

[54] [54]

van der Laan, M. J. (2008). The construction and analysis of adaptive group sequential designs

work page 2008

[55] [55]

van der Laan, M. J. (2015). A generally efficient targeted minimum loss based estimator

work page 2015

[56] [56]

van der Laan, M. J. and Luedtke, A. R. (2015). Targeted learning of the mean outcome under an optimal dynamic treatment rule. Journal of causal inference , 3(1):61--95

work page 2015

[57] [57]

van der Laan, M. J. and Malenica, I. (2018). Robust estimation of data-dependent causal effects based on observing a single time-series. arXiv preprint arXiv:1809.00734

work page arXiv 2018

[58] [58]

J., Polley, E

van der Laan, M. J., Polley, E. C., and Hubbard, A. E. (2007). Super learner. Statistical applications in genetics and molecular biology , 6(1)

work page 2007

[59] [59]

van der Laan, M. J. and Rose, S. (2018). Targeted learning in data science . Springer

work page 2018

[60] [60]

J., Rose, S., et al

van der Laan, M. J., Rose, S., et al. (2011). Targeted learning: causal inference for observational and experimental data , volume 4. Springer

work page 2011

[61] [61]

van der Laan, M. J. and Rubin, D. (2006). Targeted maximum likelihood learning. The international journal of biostatistics , 2(1)

work page 2006

[62] [62]

van Handel, R. (2011). On the minimal penalty for markov order estimation. Probability theory and related fields , 150:709--738

work page 2011

[63] [63]

VanderWeele, T. J. (2013). Surrogate measures and consistent surrogates. Biometrics , 69(3):561--565

work page 2013

[64] [64]

Weir, C. J. and Taylor, R. S. (2022). Informed decision-making: Statistical methodology for surrogacy evaluation and its role in licensing and reimbursement assessments. Pharmaceutical Statistics , 21(4):740--756

work page 2022

[65] [65]

Yang, J., Eckles, D., Dhillon, P., and Aral, S. (2023). Targeting for long-term outcomes. Management Science

work page 2023

[66] [66]

A., and Athey, S

Zhan, R., Hadad, V., Hirshberg, D. A., and Athey, S. (2021). Off-policy evaluation via adaptive weighting with data from contextual bandits. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining , pages 2125--2135

work page 2021

[67] [67]

H., and Chan, W

Zhang, L.-X., Hu, F., Cheung, S. H., and Chan, W. S. (2007). Asymptotic properties of covariate-adjusted response-adaptive designs

work page 2007

[68] [68]

and van der Laan, M

Zheng, W. and van der Laan, M. J. (2010). Asymptotic theory for cross-validated targeted maximum likelihood estimation

work page 2010

[69] [69]

Zhu, H. (2015). Covariate-adjusted response adaptive designs incorporating covariates with and without treatment interactions. Canadian Journal of Statistics , 43(4):534--553

work page 2015