Robust Dynamic Programming
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
citing papers
- Revisiting Subgradient Dominance in Robust MDPs: Counterexamples, Hardness, and Sufficient Conditions
RMDPs lack subgradient dominance in general and admit suboptimal local minima; finding epsilon-optimal policies is NP-hard for finite transition uncertainty sets, but the dominance property holds when worst-case kernels or action-values are unique per policy.
- Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form
Presents the first algorithm to identify an ε-optimal policy in robust constrained MDPs via epigraph form and bisection search with Õ(ε^{-4}) robust policy evaluations.