
arxiv: 1604.04343 · v2 · pith:W7GPBLGXnew · submitted 2016-04-15 · 🧮 math.OC · cs.SY

A Generalized Fundamental Matrix for Computing Fundamental Quantities of Markov Systems

keywords: fundamental, markov, matrix, vector, compute, performance, systems, column

As is well known, the fundamental matrix $(I - P + e \pi)^{-1}$ plays an important role in the performance analysis of Markov systems, where $P$ is the transition probability matrix, $e$ is the column vector of ones, and $\pi$ is the row vector of the steady-state distribution. It is used to compute the performance potential (relative value function) of Markov decision processes under the average criterion, for example $g=(I - P + e \pi)^{-1} f$, where $g$ is the column vector of performance potentials and $f$ is the column vector of reward functions. However, $\pi$ must be computed before $(I - P + e \pi)^{-1}$ can be computed. In this paper, we derive a generalized version of the fundamental matrix, $(I - P + e r)^{-1}$, where $r$ can be any given row vector satisfying $r e \neq 0$. With this generalized fundamental matrix, we can compute $g=(I - P + e r)^{-1} f$. The steady-state distribution is computed as $\pi = r(I - P + e r)^{-1}$. The Q-factors at every state-action pair can be computed in a similar way. These formulas may give some insight into how to efficiently compute or estimate the values of $g$, $\pi$, and the Q-factors, which are fundamental quantities for the performance optimization of Markov systems.
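The two formulas in the abstract can be checked numerically on a small chain. The sketch below is illustrative only and is not taken from the paper: the 3-state matrix `P`, the reward vector `f`, the choice `r = (1, 0, 0)`, and the helper name `generalized_fundamental_matrix` are all assumptions made for this example, and the chain is assumed to be irreducible so that $\pi$ is unique. It forms $(I - P + e r)^{-1}$ explicitly for readability; in practice one would more likely solve the linear systems than invert the matrix.

```python
import numpy as np


def generalized_fundamental_matrix(P, r):
    """Return (I - P + e r)^{-1} for a row vector r with r e != 0.

    Hypothetical helper for illustration; assumes P is the transition
    matrix of an irreducible chain so that the inverse exists.
    """
    n = P.shape[0]
    e = np.ones((n, 1))                           # column vector of ones
    r_row = np.asarray(r, dtype=float).reshape(1, n)
    if np.isclose((r_row @ e).item(), 0.0):
        raise ValueError("r must satisfy r e != 0")
    return np.linalg.inv(np.eye(n) - P + e @ r_row)


# Example data (assumed, not from the paper).
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.1, 0.4, 0.5]])
f = np.array([1.0, 2.0, 3.0])                     # reward vector
r = np.array([1.0, 0.0, 0.0])                     # any row vector with r e != 0

Z = generalized_fundamental_matrix(P, r)
pi = r @ Z                                        # pi = r (I - P + e r)^{-1}
g = Z @ f                                         # g  = (I - P + e r)^{-1} f
eta = pi @ f                                      # long-run average reward

# pi should be stationary, and g should satisfy the Poisson equation
# (I - P) g + eta e = f.
print("pi  =", pi)
print("g   =", g)
print("eta =", eta)
```

Note that no prior knowledge of $\pi$ is needed here: the same matrix `Z` yields both `pi` and `g`. The potentials `g` obtained this way satisfy $r g = \pi f$, so a different `r` shifts `g` by a constant, which is consistent with potentials being relative values.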
