Optimal Stopping for Sequential Bayesian Experimental Design

Chen Cheng; Xun Huan

arxiv: 2509.21734 · v2 · pith:GUESTR4Pnew · submitted 2025-09-26 · 📊 stat.ME




keywords: stopping, design, sequential, bayesian, experimental, learning, optimal, because


abstract

Sequential Bayesian experimental design typically assumes that the number of experiments is fixed before data collection begins. In practical campaigns, however, experimentation may need to terminate early because additional measurements can provide diminishing information relative to their cost, raising the central decision question: when should one stop? Common threshold-based stopping rules are easy to implement but myopic, because they compare the current state with a fixed criterion without accounting for the expected value of future experiments. This work develops a Bayesian optimal stopping framework for sequential experimental design by formulating stopping and design as coupled decisions in a Markov decision process. We prove that, for any design policy, the optimal stopping rule terminates exactly when the immediate terminal reward exceeds the expected continuation value. We then derive a policy gradient method for learning value-based stopping and design policies. Naïve joint training can create a circular dependency that traps learning in early-stopping local optima. We address this difficulty with a curriculum learning strategy that gradually transitions from forced continuation to adaptive stopping during training. Numerical studies on a linear-Gaussian benchmark, a one-dimensional nonlinear test problem, and a contaminant source detection problem show that the proposed approach learns stable design-stopping policies and improves resource-aware performance, with the largest gains in settings with strong sequential dependence.
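The stopping rule stated in the abstract (stop when the immediate terminal reward exceeds the continuation value) can be illustrated on a toy problem. The sketch below is not the paper's policy gradient method. It is a minimal backward-induction example under assumptions chosen here for illustration: a scalar linear-Gaussian model with unit prior and noise variance, a terminal reward equal to information gain minus a per-experiment cost of 0.1, and a horizon of 10. Because the posterior variance in this model does not depend on the observed data, the transition is deterministic and the expected continuation value reduces to the next state's value. The names `terminal_reward` and `optimal_stopping` are hypothetical.

```python
import math

def terminal_reward(k, prior_var=1.0, noise_var=1.0, cost=0.1):
    """Information gain after k experiments minus total cost (assumed reward form)."""
    post_var = 1.0 / (1.0 / prior_var + k / noise_var)
    return 0.5 * math.log(prior_var / post_var) - cost * k

def optimal_stopping(horizon, **kw):
    """Backward induction over the experiment count k = horizon, ..., 0."""
    value = [0.0] * (horizon + 1)
    stop = [False] * (horizon + 1)
    value[horizon] = terminal_reward(horizon, **kw)
    stop[horizon] = True  # stopping is forced at the horizon
    for k in range(horizon - 1, -1, -1):
        stop_now = terminal_reward(k, **kw)
        # Deterministic transition in this toy model, so no expectation is needed.
        continue_value = value[k + 1]
        stop[k] = stop_now > continue_value
        value[k] = max(stop_now, continue_value)
    return value, stop

value, stop = optimal_stopping(horizon=10)
first_stop = stop.index(True)
print(first_stop, round(value[0], 4))
```

With these assumed settings the marginal information gain of a fifth experiment falls below its cost, so the rule first stops after four experiments even though ten are allowed.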



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Optimizing Social Utility in Sequential Experiments
cs.GT 2026-05 unverdicted novelty 6.0

A subsidy-based sequential RCT protocol modeled as a belief MDP increases social utility by more than 35% compared to standard trials.