Approximate Value Iteration for Risk-aware Markov Decision Processes

Huan Xu; Pengqian Yu; William B. Haskell

arxiv: 1701.01290 · v3 · pith:SVIESIISnew · submitted 2017-01-05 · 💻 cs.SY · math.OC

keywords: MDPs, risk-aware, algorithms, approach, approximate, decision, develop, dynamic

We consider large-scale Markov decision processes (MDPs) with a risk measure of variability in cost, under the risk-aware MDP paradigm. Previous studies showed that risk-aware MDPs, based on a minimax approach to handling risk, can be solved using dynamic programming for small- to medium-sized problems. However, due to the "curse of dimensionality", MDPs that model real-life problems are typically too large for such approaches. In this paper, we employ an approximate dynamic programming approach and develop a family of simulation-based algorithms to approximately solve large-scale risk-aware MDPs. In parallel, we develop a unified convergence analysis technique to derive sample complexity bounds for this new family of algorithms.
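
To make the idea of simulation-based risk-aware value iteration concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a hypothetical three-state, two-action toy MDP, uses conditional value-at-risk (CVaR) as the risk measure, and replaces the exact risk evaluation in the Bellman update with an empirical estimate from sampled next states. The names `sample_next`, `cost`, `empirical_cvar`, and `risk_aware_value_iteration`, as well as all numerical parameters, are illustrative assumptions. The sketch is tabular, so it only illustrates the sampling step, not the function approximation a large-scale problem would need.

```python
import math
import random

# Hypothetical toy problem (not from the paper): 3 states, 2 actions.
N_STATES = 3
N_ACTIONS = 2
STATE_COST = [0.0, 1.0, 5.0]   # cost of being in each state
ACTION_COST = [0.5, 0.0]       # action 0 is "cautious", action 1 is "risky"


def cost(state, action):
    """One-step cost c(s, a) of the toy model."""
    return STATE_COST[state] + ACTION_COST[action]


def sample_next(state, action, rng):
    """Generative model: draw one next state for (state, action)."""
    if action == 0:
        # Cautious: step toward the cheap state with prob. 0.6, else stay.
        return max(state - 1, 0) if rng.random() < 0.6 else state
    # Risky: jump to the cheap state with prob. 0.8, else to the costly one.
    return 0 if rng.random() < 0.8 else 2


def empirical_cvar(samples, alpha):
    """Empirical CVaR: mean of the worst (largest) (1 - alpha) share of costs."""
    ordered = sorted(samples, reverse=True)
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:k]) / k


def risk_aware_value_iteration(n_iters=50, n_samples=200,
                               gamma=0.9, alpha=0.8, seed=0):
    """Value iteration where the risk of the next-state value is estimated
    from n_samples simulated transitions instead of computed exactly."""
    rng = random.Random(seed)
    v = [0.0] * N_STATES
    for _ in range(n_iters):
        new_v = []
        for s in range(N_STATES):
            q_values = []
            for a in range(N_ACTIONS):
                next_values = [v[sample_next(s, a, rng)]
                               for _ in range(n_samples)]
                q_values.append(
                    cost(s, a) + gamma * empirical_cvar(next_values, alpha))
            new_v.append(min(q_values))  # minimize risk-adjusted cost
        v = new_v
    return v


if __name__ == "__main__":
    print(risk_aware_value_iteration())
```

Setting `alpha=0.0` makes `empirical_cvar` return the plain sample mean, so the same loop reduces to a sampled version of risk-neutral value iteration; larger `alpha` weights the worst outcomes more heavily.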
