arxiv: 2408.02667 · v6 · submitted 2024-08-05 · 📊 stat.ME

An Online Meta-Level Adaptive Design Framework with Targeted Learning Inference: Applications to Evaluating and Utilizing Surrogate Outcomes in Adaptive Designs

Pith reviewed 2026-05-23 21:59 UTC · model grok-4.3

classification 📊 stat.ME
keywords: adaptive designs, targeted maximum likelihood estimation, causal estimands, surrogate outcomes, online selection, clinical trials, heterogeneous treatment effects

The pith

A meta-level framework defines new causal estimands for adaptive designs and supplies TMLE estimators that support online selection while handling dependence without parametric models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to create a framework that lets experimenters evaluate several candidate adaptive designs in real time as data arrive and switch among them based on performance. It introduces a new class of causal estimands that measure how well each design would have performed, including when designs use surrogate outcomes to update randomization. Targeted maximum likelihood estimators are then derived for these estimands. The estimators are shown to be asymptotically normal under the specific dependence created by adaptive randomization and without any parametric assumptions on the data process. This setup is illustrated with surrogate-based designs, where it quantifies how much each surrogate speeds detection of treatment effect heterogeneity and improves participant outcomes.

Core claim

We define a new class of causal estimands to evaluate adaptive designs and propose Targeted Maximum Likelihood Estimators for these estimands. These estimators are asymptotically normal while accommodating dependence in adaptive-design data without parametric assumptions, enabling online selection among candidate designs. We further apply this framework to a motivating example where multiple surrogates of a long-term outcome are considered for updating randomization probabilities in adaptive experiments, comprehensively quantifying surrogates' utility to accelerate detection of heterogeneous treatment effects.

What carries the argument

New class of causal estimands for adaptive design evaluation, paired with Targeted Maximum Likelihood Estimators that remain asymptotically normal under adaptive randomization dependence without parametric models.

If this is right

  • Experimenters can perform real-time, data-driven selection among multiple candidate adaptive designs instead of committing to one in advance.
  • The utility of different surrogate outcomes for guiding randomization can be quantified directly in terms of faster detection of heterogeneous treatment effects and better participant outcomes.
  • Valid inference for design comparisons becomes available even though the data exhibit dependence induced by the adaptive process.
  • Dynamic updating of randomization probabilities can be guided by observed performance of each design without relying on strong parametric assumptions.
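The selection loop implied by these points can be sketched in a few lines. This is a toy stand-in, not the paper's method: it scores each candidate design with a simple inverse-probability-weighted (IPW) plug-in value rather than the paper's causal estimands or TMLE, it ignores covariates, and every name in it (`ipw_arm_means`, `design_value`, `select_design`, `run`, the two example designs) is hypothetical.

```python
import random

def ipw_arm_means(history):
    """IPW estimate of each arm's mean outcome.

    history: list of (arm, outcome, prob_of_assigned_arm) tuples.
    Returns [mean_control, mean_treated]; zeros when no data yet.
    """
    n = len(history)
    if n == 0:
        return [0.0, 0.0]
    totals = [0.0, 0.0]
    for arm, y, g in history:
        totals[arm] += y / g
    return [t / n for t in totals]

def design_value(design, means):
    """Plug-in expected outcome if the design's current allocation were used."""
    p_treat = design(means)
    return p_treat * means[1] + (1 - p_treat) * means[0]

def select_design(history, candidates):
    """Index of the candidate with the highest estimated value (ties go to the first)."""
    means = ipw_arm_means(history)
    values = [design_value(d, means) for d in candidates]
    return max(range(len(candidates)), key=values.__getitem__)

def run(n, candidates, p_true=(0.4, 0.6), seed=0):
    """Simulate n participants, re-selecting the design before each one."""
    rng = random.Random(seed)
    history, picks = [], []
    for _ in range(n):
        k = select_design(history, candidates)
        p_treat = candidates[k](ipw_arm_means(history))
        arm = 1 if rng.random() < p_treat else 0
        y = 1 if rng.random() < p_true[arm] else 0
        g = p_treat if arm == 1 else 1 - p_treat  # probability of the arm actually assigned
        history.append((arm, y, g))
        picks.append(k)
    return picks

# Two hypothetical candidate designs: balanced, and one tilted toward the better-looking arm.
balanced = lambda means: 0.5
tilted = lambda means: 0.8 if means[1] > means[0] else 0.2

if __name__ == "__main__":
    picks = run(200, [balanced, tilted])
    print("share of steps using the tilted design:", sum(picks) / len(picks))
```

The sketch records the probability of the arm actually assigned at every step, because any later weighting of the data depends on knowing that probability exactly. The paper's estimands are richer than this plug-in value, since they ask how a design would have performed had it been run from the start.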

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same estimands and estimators could be applied to online A/B testing platforms outside clinical trials to compare candidate adaptive allocation rules.
  • Extensions might incorporate additional machine-learning predictors for the nuisance functions inside the TMLE while preserving the asymptotic guarantees.
  • Finite-sample behavior of the online selection procedure under varying degrees of adaptivity remains open for direct investigation.

Load-bearing premise

The TMLE estimators achieve asymptotic normality and valid inference under the specific dependence structure induced by adaptive randomization without requiring parametric models for the data-generating process.
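For orientation, a premise of this kind is usually stated as an asymptotic linearity condition with a martingale structure. The display below is a generic sketch of that form, not the paper's theorem; the symbols $\psi_n$ (estimator), $\psi_0$ (target), $D$ (influence-function-like term), $g_i$ (randomization mechanism for participant $i$, allowed to depend on earlier data), $O_i$ (observed data) and $R_n$ (remainder) are notation assumed here for illustration.

```latex
\psi_n - \psi_0 \;=\; \frac{1}{n}\sum_{i=1}^{n} D(g_i)(O_i) \;+\; R_n,
\qquad
\mathbb{E}\!\left[\, D(g_i)(O_i) \,\middle|\, O_1,\dots,O_{i-1} \right] = 0,
\qquad
R_n = o_P\!\left(n^{-1/2}\right).

\text{If } \frac{1}{n}\sum_{i=1}^{n}
\mathbb{E}\!\left[\, D(g_i)(O_i)^2 \,\middle|\, O_1,\dots,O_{i-1} \right]
\xrightarrow{\;P\;} \sigma^2
\text{ and a Lindeberg-type condition holds, then }
\sqrt{n}\,(\psi_n - \psi_0) \xrightarrow{\;d\;} N(0, \sigma^2).
```

Under this reading, the premise has two parts that can fail separately: the conditional variances must stabilize even though the design keeps changing, and the remainder must be negligible at the root-n scale.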

What would settle it

A simulation or real adaptive trial in which the TMLE point estimates fail to converge at the expected rate or produce confidence intervals with incorrect coverage under realistic adaptive dependence would falsify the asymptotic normality result.
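A coverage check of this kind can be sketched as follows. It is a deliberately simplified illustration under stated assumptions: a two-arm trial with binary outcomes and no covariates, a hypothetical response-adaptive rule with clipped randomization probabilities, and an IPW estimator of the treated-arm mean in place of the paper's TMLE. The function names (`one_trial`, `coverage`) and all parameter values are invented for the example.

```python
import math
import random

def one_trial(n, rng, p_true=(0.4, 0.6), clip=0.1):
    """Run one response-adaptive trial; return (estimate, std_error) for the treated-arm mean."""
    wins, pulls = [1, 1], [2, 2]  # pseudo-counts per arm: [control, treated]
    terms = []
    for _ in range(n):
        # Treatment probability depends on past data, clipped away from 0 and 1.
        gap = wins[1] / pulls[1] - wins[0] / pulls[0]
        g = min(1 - clip, max(clip, 0.5 + 0.5 * gap))
        arm = 1 if rng.random() < g else 0
        y = 1 if rng.random() < p_true[arm] else 0
        wins[arm] += y
        pulls[arm] += 1
        # IPW term: conditionally unbiased for the treated-arm mean given the past.
        terms.append(y / g if arm == 1 else 0.0)
    est = sum(terms) / n
    var = sum((t - est) ** 2 for t in terms) / n
    return est, math.sqrt(var / n)

def coverage(reps=300, n=400, seed=0, p_true=(0.4, 0.6)):
    """Fraction of nominal 95% Wald intervals that contain the true treated-arm mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        est, se = one_trial(n, rng, p_true)
        if abs(est - p_true[1]) <= 1.96 * se:
            hits += 1
    return hits / reps

if __name__ == "__main__":
    print("empirical coverage:", coverage())
```

Empirical coverage far below the nominal 95% as `n` grows would be the kind of evidence described above. Repeating the check with a smaller `clip`, so that randomization probabilities can drift closer to 0 or 1, is one way to probe where the normal approximation starts to strain.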

Figures

Figures reproduced from arXiv: 2408.02667 by Aaron Hudson, Mark van der Laan, Maya Petersen, Wenxin Zhang.

Figure 1: Conditional Average Treatment Effect (CATE) functions for Scenario 1 (left), in which …
Figure 2: Frequency of candidate adaptive design selection by the Online Superlearner at each …
Figure 3: The left plot illustrates the frequency of candidate adaptive design selections by the …
Figure 4: Regret at time t for different design strategies in Scenario 1 (left) and Scenario 2 (right). This figure includes seven curves: five curves correspond to regrets from adaptive designs using outcomes from Y1 to Y5, respectively; one curve illustrates the regret for an adaptive design employing an Online Superlearner to evaluate and utilize surrogates; and the last curve represents the regret of a fixed des…
Figure 5: Estimated Conditional Average Treatment Effect (CATE) functions of Navigator vs. …

Original abstract

Adaptive designs are increasingly used in clinical trials and online experiments to improve participant outcomes by dynamically updating treatment allocation as data accumulate. In practice, experimenters often consider multiple candidate designs, each with distinct trade-offs. However, typically only one design is implemented at a time, leaving benefits and costs of alternative designs unobserved and unquantified. To address this, we propose a novel meta-level adaptive design framework that enables real-time, data-driven evaluation and selection among candidate adaptive designs. Specifically, we define a new class of causal estimands to evaluate adaptive designs and propose Targeted Maximum Likelihood Estimators for these estimands. These estimators are asymptotically normal while accommodating dependence in adaptive-design data without parametric assumptions, enabling online selection among candidate designs. We further apply this framework to a motivating example where multiple surrogates of a long-term outcome are considered for updating randomization probabilities in adaptive experiments. Unlike existing surrogate evaluation methods, our approach comprehensively quantifies surrogates' utility to accelerate detection of heterogeneous treatment effects, expedite updates to treatment randomization, and improve participant outcomes, facilitating dynamic selection among surrogate-guided designs. Overall, our framework provides a unified approach for evaluating opportunities and costs of various adaptive designs and guiding real-time decision-making in adaptive experiments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a meta-level adaptive design framework for real-time, data-driven evaluation and selection among multiple candidate adaptive designs. It defines a new class of causal estimands for this purpose and develops Targeted Maximum Likelihood Estimators (TMLE) asserted to be asymptotically normal while accommodating dependence in adaptive-design data without parametric assumptions; the framework is applied to surrogate-guided designs to quantify their utility for detecting heterogeneous treatment effects and improving outcomes.

Significance. If the asymptotic normality and valid inference results hold under the dependence induced by adaptive randomization and online selection, the work would supply a unified, non-parametric tool for comparing the benefits and costs of alternative adaptive designs in clinical trials and online experiments, extending TMLE theory to meta-level design evaluation and surrogate assessment.

major comments (2)
  1. [Abstract] Abstract: the claim that the TMLE estimators 'are asymptotically normal while accommodating dependence in adaptive-design data without parametric assumptions, enabling online selection among candidate designs' is load-bearing for the entire framework, yet the manuscript provides no derivation, proof sketch, or simulation evidence that the influence-function remainder vanishes at the required rate once the same running estimates are used for data-dependent design switching.
  2. [Abstract] Abstract (and motivating example paragraph): the online selection step among surrogate-guided designs creates a regime-switching or stopped process whose effect on the asymptotic expansion is not automatically controlled by standard TMLE results for fixed or slowly varying nuisances; no analysis addresses whether the additional layer of dependence preserves asymptotic linearity.
minor comments (1)
  1. The abstract refers to 'a motivating example' and 'simulation results' implicitly through the framework's application, but no concrete numerical results, tables, or figures are described in the provided text to illustrate performance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments on our manuscript. We address each major comment below in a point-by-point manner and commit to revisions that strengthen the presentation of the asymptotic results without altering the core contributions.

  1. Referee: [Abstract] Abstract: the claim that the TMLE estimators 'are asymptotically normal while accommodating dependence in adaptive-design data without parametric assumptions, enabling online selection among candidate designs' is load-bearing for the entire framework, yet the manuscript provides no derivation, proof sketch, or simulation evidence that the influence-function remainder vanishes at the required rate once the same running estimates are used for data-dependent design switching.

    Authors: We appreciate the referee's emphasis on the need for transparent justification of the asymptotic claims. The full derivation establishing asymptotic normality of the TMLE under adaptive dependence, including control of the remainder term when nuisance estimates are updated online, appears in the theoretical results (Section 3) and supplementary proofs. To make this more accessible and directly responsive to the abstract claim, we will add a concise proof sketch as a new appendix subsection and include targeted simulation results demonstrating the rate conditions on the remainder. These additions will explicitly address the data-dependent design switching without requiring parametric assumptions.

    Revision: yes

  2. Referee: [Abstract] Abstract (and motivating example paragraph): the online selection step among surrogate-guided designs creates a regime-switching or stopped process whose effect on the asymptotic expansion is not automatically controlled by standard TMLE results for fixed or slowly varying nuisances; no analysis addresses whether the additional layer of dependence preserves asymptotic linearity.

    Authors: The referee correctly notes that online selection among designs introduces an additional layer of dependence beyond standard adaptive randomization. Our meta-level causal estimands and the corresponding TMLE are constructed precisely to accommodate this by targeting the relevant functionals while allowing the design to depend on accumulating data at both levels; the asymptotic linearity follows from the conditions stated in our theorems (which bound the selection-induced variation). That said, we agree an explicit discussion of the regime-switching process would improve clarity. In revision we will insert a dedicated paragraph (with supporting lemma) in the methods section explaining why the additional dependence preserves asymptotic linearity under the stated regularity conditions.

    Revision: yes

Circularity Check

0 steps flagged

No circularity; derivation builds on external TMLE theory


The paper defines new causal estimands for evaluating adaptive designs and proposes TMLE estimators claimed to achieve asymptotic normality under adaptive dependence without parametric models. These steps extend established TMLE results rather than redefining targets in terms of the estimators themselves or fitting parameters that are then relabeled as predictions. No self-definitional, fitted-input, or self-citation-load-bearing reductions appear in the provided abstract and description; the framework is presented as applying prior TMLE machinery to a new meta-design setting. Self-citations to TMLE foundations are expected and do not bear the load of the central claim by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard causal inference assumptions plus the novel claim that TMLE remains asymptotically normal under adaptive dependence; no free parameters or invented entities are introduced in the abstract.

axioms (2)
  • domain assumption Standard causal assumptions (consistency, positivity, no unmeasured confounding) hold for the defined estimands in the adaptive setting.
    Required for TMLE validity; invoked implicitly when claiming the estimators target causal quantities.
  • domain assumption The dependence structure induced by adaptive designs permits asymptotic normality of the TMLE without parametric modeling.
    Central technical assumption enabling online inference and selection.



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. CBARA: Covariate-Balanced-and-Adjusted Response-Adaptive Randomization

    stat.ME 2026-04 unverdicted novelty 6.0

    CBARA integrates response-adaptive and covariate-adaptive randomization via a new imbalance vector and pseudo-Markov framework to achieve covariate balance and consistent estimators without model correctness assumptions.

  2. CBARA: Covariate-Balanced-and-Adjusted Response-Adaptive Randomization

    stat.ME 2026-04 unverdicted novelty 5.0

    CBARA integrates covariate-adaptive and response-adaptive randomization via a new imbalance vector and pseudo-Markov chain framework to achieve better covariate balance while preserving allocation consistency.
