On the Complexity of Value Iteration

Guillermo A. P\'erez; Mahsa Shirmohammadi; Nikhil Balaji; Petr Novotn\'y; Stefan Kiefer

arxiv: 1807.04920 · v3 · pith:463REEX2new · submitted 2018-07-13 · 💻 cs.FL · cs.AI· cs.CC

On the Complexity of Value Iteration

Nikhil Balaji , Stefan Kiefer , Petr Novotn\'y , Guillermo A. P\'erez , Mahsa Shirmohammadi This is my paper

classification 💻 cs.FL cs.AIcs.CC

keywords iterationvaluecomplexitygivenbinaryexp-completehorizonmdps

0 comments

read the original abstract

Value iteration is a fundamental algorithm for solving Markov Decision Processes (MDPs). It computes the maximal $n$-step payoff by iterating $n$ times a recurrence equation which is naturally associated to the MDP. At the same time, value iteration provides a policy for the MDP that is optimal on a given finite horizon $n$. In this paper, we settle the computational complexity of value iteration. We show that, given a horizon $n$ in binary and an MDP, computing an optimal policy is EXP-complete, thus resolving an open problem that goes back to the seminal 1987 paper on the complexity of MDPs by Papadimitriou and Tsitsiklis. As a stepping stone, we show that it is EXP-complete to compute the $n$-fold iteration (with $n$ in binary) of a function given by a straight-line program over the integers with $\max$ and $+$ as operators.

This paper has not been read by Pith yet.

On the Complexity of Value Iteration

discussion (0)