Then $\mu_n^* = \sum_{i=0}^{n-1} (\nu_n^* K_n^{*i})(s, a)$, hence $\mu_n^* - n\hat{p}^{*(a)}_s = \sum_{i=0}^{n-1} \bigl[(\nu_n^* K_n^{*i})(s, a) - \hat{p}^{*(a)}_s\bigr]$

under $K_n^*$ · 2025

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Model-based Bootstrap of Controlled Markov Chains

stat.ML · 2026-05-12 · unverdicted · novelty 6.0

A model-based bootstrap achieves distributional consistency for transition estimators in controlled Markov chains with unknown policies and yields asymptotically valid confidence intervals for offline policy evaluation and optimal policy recovery.

Model-based Bootstrap of Controlled Markov Chains stat.ML · 2026-05-12 · unverdicted · none · ref 13