Finite-Horizon Markov Decision Processes with Sequentially-Observed Transitions

Behcet Acikmese; Mahmoud El Chamie

arxiv: 1507.01151 · v1 · pith:LKPEU4TWnew · submitted 2015-07-04 · 🧮 math.OC · cs.SY

keywords decision · model · policies · information · markov · mdps · processes · synthesize

Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (or minimize costs) in a given stochastic dynamical environment. In this paper, we extend this model by incorporating the additional information that the transitions due to actions can be observed sequentially. The proposed model benefits from this information and produces policies with better performance than those of standard MDPs. The paper also presents an efficient offline linear-programming-based algorithm to synthesize optimal policies for the extended model.
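
For orientation, the standard finite-horizon MDP that the abstract uses as its baseline can be solved by backward induction. The sketch below illustrates only that baseline; it is not the paper's extended model and not its linear-programming algorithm. The function name `backward_induction`, the array layout `P[a][s][s2]` and `R[s][a]`, and the two-state example are all assumptions made for illustration.

```python
def backward_induction(P, R, horizon):
    """Standard finite-horizon MDP solver (baseline only, hypothetical helper).

    P[a][s][s2] : probability of moving from state s to s2 under action a
    R[s][a]     : immediate reward for taking action a in state s
    Returns the stage-0 value function and a time-indexed policy,
    where policy[t][s] is the action chosen in state s at stage t.
    """
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states          # terminal values assumed to be zero
    policy = []
    for _ in range(horizon):      # sweep from the last stage back to stage 0
        new_V, decisions = [], []
        for s in range(n_states):
            # Expected return of each action: reward now plus expected future value
            q = [
                R[s][a] + sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            best = max(range(n_actions), key=lambda a: q[a])
            new_V.append(q[best])
            decisions.append(best)
        V = new_V
        policy.insert(0, decisions)  # earlier stages go to the front
    return V, policy


# Toy example (assumed data): action 0 stays put, action 1 switches state;
# being in state 1 pays reward 1, being in state 0 pays nothing.
P = [
    [[1, 0], [0, 1]],  # action 0: stay
    [[0, 1], [1, 0]],  # action 1: switch
]
R = [[0, 0], [1, 1]]
V, policy = backward_induction(P, R, horizon=2)
print(V)       # [1.0, 2.0]
print(policy)  # [[1, 0], [0, 0]]
```

In this toy instance the stage-0 policy switches out of state 0 and stays in state 1, which is what one would expect when only state 1 pays a reward. In the standard model the action is committed before its transition is realized; the abstract's extension concerns using sequentially observed transitions, which this sketch does not attempt to model.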
