A Generalized Fundamental Matrix for Computing Fundamental Quantities of Markov Systems

Li Xia; Peter W. Glynn


The fundamental matrix $(I - P + e \pi)^{-1}$ is well known to play an important role in the performance analysis of Markov systems, where $P$ is the transition probability matrix, $e$ is the column vector of ones, and $\pi$ is the row vector of the steady-state distribution. It is used to compute the performance potential (relative value function) of Markov decision processes under the average criterion, for example $g=(I - P + e \pi)^{-1} f$, where $g$ is the column vector of performance potentials and $f$ is the column vector of rewards. However, $\pi$ must be computed before $(I - P + e \pi)^{-1}$ can be formed. In this paper, we derive a generalized version of the fundamental matrix, $(I - P + e r)^{-1}$, where $r$ can be any given row vector satisfying $r e \neq 0$. With this generalized fundamental matrix, the performance potential can be computed as $g=(I - P + e r)^{-1} f$ without knowing $\pi$ in advance, and the steady-state distribution can be computed as $\pi = r(I - P + e r)^{-1}$. The Q-factors at every state-action pair can be computed in a similar way. These formulas may offer insight into how to efficiently compute or estimate $g$, $\pi$, and the Q-factors, which are fundamental quantities for the performance optimization of Markov systems.
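The two formulas can be checked numerically on a small example. The sketch below is illustrative only and is not taken from the paper: the 3-state transition matrix `P`, the reward vector `f`, the choice $r = (1, 0, 0)$, and the helper name `generalized_fundamental_matrix` are all assumptions, and the chain is assumed to be ergodic so that the inverse exists. The identity $\pi (I - P + e r) = r$, which follows from $\pi P = \pi$ and $\pi e = 1$, is what makes $\pi = r(I - P + e r)^{-1}$ work.

```python
import numpy as np

def generalized_fundamental_matrix(P, r):
    """Return (I - P + e r)^{-1} for a row vector r with r e != 0.

    Hypothetical helper for illustration; assumes P is the transition
    matrix of an ergodic chain so that the matrix is invertible.
    """
    n = P.shape[0]
    e = np.ones((n, 1))
    return np.linalg.inv(np.eye(n) - P + e @ r.reshape(1, n))

# Assumed example data: an ergodic 3-state chain and a reward vector.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4]])
f = np.array([1.0, 0.0, 2.0])
r = np.array([1.0, 0.0, 0.0])   # any row vector with r e != 0

Z = generalized_fundamental_matrix(P, r)
pi = r @ Z          # steady-state distribution, no prior knowledge of pi
g = Z @ f           # performance potential
eta = pi @ f        # long-run average reward

# pi is stationary and sums to one.
stationary_ok = np.allclose(pi @ P, pi) and np.isclose(pi.sum(), 1.0)

# g solves the Poisson equation (I - P) g = f - eta e.
poisson_ok = np.allclose(g - P @ g, f - eta)

# g differs from the classical potential only by a constant shift.
g_classic = np.linalg.inv(np.eye(3) - P + np.outer(np.ones(3), pi)) @ f
diff = g - g_classic
shift_ok = np.allclose(diff, diff[0])
```

In this sketch the potential obtained from $r$ and the one obtained from the classical matrix agree up to an additive constant, which is consistent with $g$ being a relative value function; the particular normalization here is $r g = \pi f$.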

