arXiv: 2605.21458 · v1 · pith:QHZ3URLRnew · submitted 2026-05-20 · 💻 cs.AI · cs.LG · stat.ME

Mind the Sim-to-Real Gap & Think Like a Scientist

Pith reviewed 2026-05-21 03:58 UTC · model grok-4.3

classification 💻 cs.AI · cs.LG · stat.ME
keywords sim-to-real gap · simulation lemma · sequential decision making · experimental design · policy evaluation · reinforcement learning · Fisher information

The pith

Randomization in real experiments identifies the calibration-deployment shift in simulator value error while a reachability gap persists under passive learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies when a planner with a pre-trained but confounded simulator should run real experiments to improve sequential decisions. Using an extended simulation lemma, it decomposes simulator value error into a shift component that randomization can identify and correct, and a parametric residual that further data cannot reduce. The value gap between the simulator policy and the true optimum further splits into local and reachability parts, with the latter bounded away from zero under purely passive learning. This guides the design of Fisher-SEP, an experimental policy that minimizes posterior predictive variance of a target policy's value, with reward-only and transition-only variants. Case studies show when front-loaded pilots amortize their costs and when designed exploration is required to reach distant states.

Core claim

An extended simulation lemma decomposes the simulator's value error into a calibration-deployment shift that randomization can identify and a parametric residual that no further interaction can reduce. The value gap between the simulator-optimal policy and the optimum splits into a local component on visited states and a reachability component on unvisited states that stays bounded away from zero at any horizon under purely passive learning. Fisher-SEP is proposed as a simulation-aided experimental policy that minimizes the posterior predictive variance of a target policy's value.
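
To make the variance-minimization objective concrete, here is a minimal sketch of a reward-only design rule. It is not the paper's Fisher-SEP algorithm. It assumes (none of this is taken from the paper) that the target policy's value is a weighted sum of unknown mean rewards, that each mean has an independent Gaussian posterior, and that each real trial observes one cell with known noise variance. The names `greedy_design`, `weights`, and `prior_var` are hypothetical.

```python
def value_variance(weights, post_var):
    """Posterior variance of V = sum_i w_i * r_i under independent Gaussian posteriors."""
    return sum(w * w * v for w, v in zip(weights, post_var))


def updated_var(prior_var, noise_var):
    """Posterior variance of a Gaussian mean after one more observation (known noise)."""
    return prior_var * noise_var / (prior_var + noise_var)


def greedy_design(weights, post_var, noise_var, budget):
    """Spend `budget` real trials one at a time on the cell that most reduces Var[V]."""
    post_var = list(post_var)
    plan = []
    for _ in range(budget):
        gains = [
            w * w * (v - updated_var(v, noise_var))
            for w, v in zip(weights, post_var)
        ]
        best = max(range(len(gains)), key=gains.__getitem__)
        post_var[best] = updated_var(post_var[best], noise_var)
        plan.append(best)
    return plan, post_var


# Hypothetical inputs: occupancy weights of the target policy and prior variances.
weights = [0.6, 0.3, 0.1]
prior_var = [0.5, 2.0, 4.0]

plan, post_var = greedy_design(weights, prior_var, noise_var=1.0, budget=5)
print("trial allocation:", plan)
print("Var[V] before:", value_variance(weights, prior_var))
print("Var[V] after: ", value_variance(weights, post_var))
```

In this toy, a cell's priority depends on both how much the target policy visits it and how uncertain it still is, so a rarely visited but very uncertain cell can outrank a frequently visited, well-estimated one.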

What carries the argument

The extended simulation lemma, which partitions simulator value error into a randomization-identifiable calibration-deployment shift and an irreducible parametric residual.
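
For orientation, the classical (non-extended) simulation lemma can be checked numerically. The sketch below covers only the textbook identity for a fixed policy in a discounted tabular problem: the value error equals the occupancy-weighted one-step model error, scaled by 1/(1 - gamma). It does not reproduce the paper's split into shift and residual terms, and the three-state chain and all numbers are hypothetical.

```python
GAMMA = 0.9


def evaluate(P, r, gamma=GAMMA, iters=2000):
    """Iterative policy evaluation for a Markov reward process (policy folded in)."""
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        v = [r[s] + gamma * sum(P[s][t] * v[t] for t in range(n)) for s in range(n)]
    return v


def occupancy(P, start, gamma=GAMMA, iters=2000):
    """Normalized discounted occupancy d(s) = (1 - gamma) * sum_t gamma^t Pr(s_t = s)."""
    n = len(P)
    dist = [1.0 if s == start else 0.0 for s in range(n)]
    d = [0.0] * n
    weight = 1.0 - gamma
    for _ in range(iters):
        d = [d[s] + weight * dist[s] for s in range(n)]
        dist = [sum(dist[s] * P[s][t] for s in range(n)) for t in range(n)]
        weight *= gamma
    return d


# Hypothetical "real" dynamics and a slightly wrong simulator, same fixed policy.
P_real = [[0.7, 0.3, 0.0], [0.1, 0.6, 0.3], [0.0, 0.2, 0.8]]
r_real = [0.0, 1.0, 2.0]
P_sim = [[0.8, 0.2, 0.0], [0.2, 0.6, 0.2], [0.0, 0.3, 0.7]]
r_sim = [0.0, 0.9, 2.2]

v_real = evaluate(P_real, r_real)
v_sim = evaluate(P_sim, r_sim)
d_real = occupancy(P_real, start=0)


def model_error(s):
    """One-step reward and transition error at state s, measured against v_sim."""
    reward_err = r_real[s] - r_sim[s]
    trans_err = sum((P_real[s][t] - P_sim[s][t]) * v_sim[t] for t in range(3))
    return reward_err + GAMMA * trans_err


lhs = v_real[0] - v_sim[0]
rhs = sum(d_real[s] * model_error(s) for s in range(3)) / (1.0 - GAMMA)
print(lhs, rhs)  # the two sides agree up to floating-point error
```

The occupancy is taken under the real dynamics, so in this identity only model errors on states the policy actually visits contribute to the value error.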

If this is right

  • In supply-chain problems with long horizons, front-loaded experimentation overtakes posterior updating once pilot costs are amortized.
  • In problems with separated regions like well- and poorly-surveilled corridors, only designed exploration reaches the poorly-surveilled states.
  • Reward-only and transition-only specializations of the experimental policy allow tailoring data collection to what is observed.
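
The amortization point in the first bullet can be illustrated with toy accounting. This is not the paper's supply-chain model. It assumes a pilot that costs a fixed amount per period for a fixed number of periods and then yields a constant per-period reward, compared against a passive strategy that earns a lower constant reward from the start. All names and numbers are hypothetical.

```python
import math


def cumulative_front_loaded(horizon, pilot_periods, pilot_cost, reward_after):
    """Total reward when the first `pilot_periods` periods are spent on a costly pilot."""
    return -pilot_cost * pilot_periods + reward_after * (horizon - pilot_periods)


def cumulative_passive(horizon, reward_passive):
    """Total reward when no pilot is run and a constant reward is earned each period."""
    return reward_passive * horizon


def break_even_horizon(pilot_periods, pilot_cost, reward_after, reward_passive):
    """Smallest horizon at which the front-loaded pilot is at least as good as passive."""
    if reward_after <= reward_passive:
        return None  # the pilot never pays for itself in this toy model
    ratio = (pilot_cost + reward_after) / (reward_after - reward_passive)
    return math.ceil(pilot_periods * ratio)


T = break_even_horizon(pilot_periods=10, pilot_cost=2.0,
                       reward_after=1.5, reward_passive=1.0)
print("break-even horizon:", T)
```

Under these assumptions the break-even horizon grows linearly with the pilot length and shrinks as the post-pilot advantage grows.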

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The decomposition suggests prioritizing early randomization experiments to calibrate simulators before committing to long deployment horizons.
  • Persistent reachability gaps imply that passive data collection alone will leave value estimates biased in problems with distant or low-probability states.
  • Variance-minimization objectives like Fisher-SEP could be adapted to set explicit budgets for real trials based on target precision.
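
As a rough illustration of the last bullet, a target precision can be turned into a trial budget in the simplest conjugate case. The sketch assumes a single scalar quantity with a Gaussian prior and Gaussian observation noise of known variance, which is a simplification and not the paper's setting. The function names are hypothetical.

```python
import math


def posterior_var(prior_var, noise_var, n):
    """Posterior variance of a Gaussian mean after n observations with known noise."""
    return 1.0 / (1.0 / prior_var + n / noise_var)


def trials_for_precision(prior_var, noise_var, target_var):
    """Smallest number of real trials whose posterior variance is at most target_var."""
    if target_var <= 0:
        raise ValueError("target_var must be positive")
    if prior_var <= target_var:
        return 0  # the prior is already precise enough
    return math.ceil(noise_var * (1.0 / target_var - 1.0 / prior_var))


n = trials_for_precision(prior_var=4.0, noise_var=1.0, target_var=0.1)
print("trials needed:", n)
```

Each trial adds 1/noise_var to the posterior precision, which in this model is the Fisher information of one observation, so the budget is the precision shortfall divided by the per-trial information.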

Load-bearing premise

That randomization in real experiments can identify and correct the calibration-deployment shift component of simulator error.

What would settle it

An experiment in which randomized real trials fail to reduce the identified shift component of value error or in which the reachability component of the value gap approaches zero under infinite passive observations.

Figures

Figures reproduced from arXiv: 2605.21458 by Alexander Volfovsky, Dominique Perrault-Joncas, Gabriel Levin-Konigsberg, Harsh Parikh.

Figure 1: Gap decomposition on the visitation sim…
Figure 2: HIV mobile-testing program (30 common-seed trials, …

Original abstract

Suppose a planner has a pre-trained simulator of a sequential decision problem and the option to run real experiments in the field. The simulator is cheap to query but inherits confounding and drift from its calibration data. Experimentation is unbiased but consumes one real unit per trial. We study when, and how, the planner should supplement the simulator with experiments. We give three results. First, an extended simulation lemma decomposes the simulator's value error into a calibration-deployment shift that randomization can identify and a parametric residual that no further interaction can reduce. Second, the value gap between the simulator-optimal policy and the optimum splits into a local component, on states the deployed policy already visits, and a reachability component, on states it does not. The reachability component stays bounded away from zero at any horizon under purely passive learning. Third, we propose Fisher-SEP, a simulation-aided experimental policy (SEP) that minimizes the posterior predictive variance of a target policy's value, with reward-only and transition-only specializations. Two case studies illustrate the regimes. In a vending-machine supply chain, front-loaded experimentation overtakes posterior updating once the horizon is long enough to amortize the pilot. In an HIV mobile-testing example with a corridor that separates a well-surveilled region from a poorly-surveilled one, only designed exploration reaches the poorly-surveilled region.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper studies when and how to supplement a pre-trained simulator (with inherited confounding and drift) with real experiments in sequential decision problems. It claims three results: (1) an extended simulation lemma decomposing simulator value error into a randomization-identifiable calibration-deployment shift and an irreducible parametric residual; (2) a decomposition of the value gap between simulator-optimal and optimal policies into local and reachability components, with the reachability component bounded away from zero under passive learning; (3) the Fisher-SEP policy that minimizes posterior predictive variance of a target policy's value (with reward-only and transition-only variants), illustrated in vending-machine supply-chain and HIV mobile-testing case studies.

Significance. If the decomposition in the extended simulation lemma holds, the work supplies a principled separation of simulator error sources that can guide the allocation of real experiments, with direct relevance to efficient policy learning under confounding. The reachability result and Fisher-SEP proposal highlight concrete regimes where passive learning fails and designed exploration or front-loaded pilots become necessary. The two case studies usefully illustrate the claimed regimes.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (extended simulation lemma): the decomposition of simulator value error into an identifiable calibration-deployment shift (via randomization) and an irreducible parametric residual is load-bearing for all subsequent claims on when to run real experiments. The argument implicitly requires that randomization in the real environment isolates the shift term without further modeling of state-dependent confounding or non-additive drift-policy interactions; if those conditions fail, the residual is no longer cleanly separable from what additional interaction can address.
  2. [§4] §4 (value-gap decomposition): the claim that the reachability component remains bounded away from zero at any horizon under purely passive learning is central to the argument for designed exploration. The bound appears to rely on the specific corridor structure of the HIV example; the general conditions under which the reachability term cannot be reduced by passive sampling should be stated explicitly, including any assumptions on the state space or transition structure.
minor comments (2)
  1. [Abstract] The acronym Fisher-SEP is introduced without expansion on first use; a parenthetical definition (e.g., Fisher-information Simulation-aided Experimental Policy) would improve readability.
  2. [§3] Notation for the calibration-deployment shift term is used before it is formally defined; a short notational table or inline definition at first appearance would help.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments. These help us clarify the assumptions underlying our decompositions and strengthen the generalizability of the results. We address each major comment below, indicating planned revisions where appropriate.

Point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (extended simulation lemma): the decomposition of simulator value error into an identifiable calibration-deployment shift (via randomization) and an irreducible parametric residual is load-bearing for all subsequent claims on when to run real experiments. The argument implicitly requires that randomization in the real environment isolates the shift term without further modeling of state-dependent confounding or non-additive drift-policy interactions; if those conditions fail, the residual is no longer cleanly separable from what additional interaction can address.

    Authors: The extended simulation lemma is derived under a model in which the simulator's inherited confounding and drift are captured as a calibration-deployment shift that can be isolated via randomization in the real environment, leaving an irreducible parametric residual. We agree that the clean separation assumes the absence of additional state-dependent confounding or non-additive drift-policy interactions beyond the modeled shift. Our framework targets regimes where this decomposition holds, consistent with standard sim-to-real assumptions. We will revise §3 to explicitly enumerate these modeling assumptions and discuss the conditions (including randomization requirements) under which the lemma applies, along with brief remarks on potential violations. revision: partial

  2. Referee: [§4] §4 (value-gap decomposition): the claim that the reachability component remains bounded away from zero at any horizon under purely passive learning is central to the argument for designed exploration. The bound appears to rely on the specific corridor structure of the HIV example; the general conditions under which the reachability term cannot be reduced by passive sampling should be stated explicitly, including any assumptions on the state space or transition structure.

    Authors: We appreciate this point. The reachability component is defined generally as the value difference arising from states not visited by the simulator-optimal policy. The result that this component is bounded away from zero under passive learning holds whenever the transition structure creates components unreachable with positive probability under passive sampling from the simulator policy. The HIV corridor serves as an illustration of such a structure, but the formal argument does not depend on it. We will revise §4 to state the general conditions explicitly, including assumptions on the state space (e.g., presence of separated or low-probability transition components) and transition kernel, and present the bound in a manner independent of the specific example. revision: yes
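
The structural condition described in this response can be illustrated with a toy corridor chain. This is a hypothetical construction, not the paper's HIV example: states 0 to 2 stand in for a well-surveilled region, state 3 for the corridor, and states 4 to 5 for a poorly-surveilled region. The far region is unreachable with probability one under the passive policy here, which is stronger than the low-probability case the response also mentions.

```python
import random

N_STATES = 6     # hypothetical layout: 0..2 local, 3 corridor, 4..5 far region
LOCAL_MAX = 2    # last state of the local (well-surveilled) region


def step(state, action, rng):
    """Action 0 wanders inside the local region; action 1 pushes one state right."""
    if action == 1:
        return min(state + 1, N_STATES - 1)
    if state > LOCAL_MAX:
        return state  # outside the local region, action 0 just stays put
    return rng.choice([max(state - 1, 0), min(state + 1, LOCAL_MAX)])


def visit_counts(policy, horizon, seed=0):
    """Roll out `policy` from state 0 and count how often each state is visited."""
    rng = random.Random(seed)
    counts = [0] * N_STATES
    state = 0
    for _ in range(horizon):
        counts[state] += 1
        state = step(state, policy(state), rng)
    return counts


# Passive learning: keep running the local policy. Designed exploration: push right.
passive = visit_counts(lambda s: 0, horizon=10_000)
designed = visit_counts(lambda s: 1, horizon=10_000)
print("passive visits: ", passive)
print("designed visits:", designed)
```

However long the passive rollout runs, the counts for states 3 to 5 stay at zero, so no amount of passively collected data can update estimates there. The designed policy reaches every state.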

Circularity Check

0 steps flagged

Extended simulation lemma and policy proposals derive from problem setup without reduction to fitted inputs or self-citations

Full rationale

The paper states three results beginning with an extended simulation lemma that decomposes simulator value error into a calibration-deployment shift identifiable via randomization and an irreducible parametric residual. This decomposition is presented as following directly from the sequential decision problem with a confounded simulator and unbiased real experiments. The subsequent split of the value gap into local and reachability components is likewise derived from visitation properties under passive learning, and Fisher-SEP is defined by minimizing posterior predictive variance of a target policy's value. No equations or steps reduce these quantities to parameters already fitted inside the simulator or to self-citations whose content is unverified; the derivations remain independent of the target claims and are self-contained against external benchmarks of the underlying MDP and identification assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the domain assumption that real experiments are unbiased and that randomization suffices to identify the calibration-deployment shift; no free parameters or new entities are introduced in the abstract.

axioms (2)
  • Domain assumption: Real experiments are unbiased while the simulator inherits confounding and drift from its calibration data.
    Stated directly in the opening problem setup of the abstract.
  • Domain assumption: Randomization in real experiments can identify the calibration-deployment shift component of simulator error.
    Invoked as part of the extended simulation lemma result.

pith-pipeline@v0.9.0 · 5790 in / 1444 out tokens · 40577 ms · 2026-05-21T03:58:53.293524+00:00 · methodology


