Off-policy estimation with adaptively collected data: the power of online learning

Jeonghwan Lee, Cong Ma · DOI 10.52202/079017-4255

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Anytime-valid Optimal Policy Identification

stat.ME · 2026-06-16 · unverdicted · novelty 6.0

Constructs a time-indexed set S_t retaining the true optimal policy uniformly over time with high probability, enabling early stopping with sample complexity O((log |Π| + log log(1/Δ_min))/Δ_min²) when the optimum is unique.

citing papers explorer

Showing 1 of 1 citing paper.

Anytime-valid Optimal Policy Identification stat.ME · 2026-06-16 · unverdicted · none · ref 3
