pith. sign in

arxiv: 1807.04920 · v3 · pith:463REEX2new · submitted 2018-07-13 · 💻 cs.FL · cs.AI· cs.CC

On the Complexity of Value Iteration

classification 💻 cs.FL cs.AIcs.CC
keywords iterationvaluecomplexitygivenbinaryexp-completehorizonmdps
0
0 comments X
read the original abstract

Value iteration is a fundamental algorithm for solving Markov Decision Processes (MDPs). It computes the maximal $n$-step payoff by iterating $n$ times a recurrence equation which is naturally associated to the MDP. At the same time, value iteration provides a policy for the MDP that is optimal on a given finite horizon $n$. In this paper, we settle the computational complexity of value iteration. We show that, given a horizon $n$ in binary and an MDP, computing an optimal policy is EXP-complete, thus resolving an open problem that goes back to the seminal 1987 paper on the complexity of MDPs by Papadimitriou and Tsitsiklis. As a stepping stone, we show that it is EXP-complete to compute the $n$-fold iteration (with $n$ in binary) of a function given by a straight-line program over the integers with $\max$ and $+$ as operators.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.