A method for the online construction of the set of states of a Markov Decision Process using Answer Set Programming

Leonardo A. Ferreira; Paulo E. Santos; Ramon Lopez de Mantaras; Reinaldo A. C. Bianchi

arxiv: 1706.01417 · v1 · pith:QXLP7LLAnew · submitted 2017-06-05 · 💻 cs.AI



keywords: domain, oasp, agent, answer, capable, decision, domains, markov


Non-stationary domains, which change in unpredicted ways, are a challenge for agents searching for optimal policies in sequential decision-making problems. This paper presents a combination of Markov Decision Processes (MDP) with Answer Set Programming (ASP), named Online ASP for MDP (oASP(MDP)), which is a method capable of constructing the set of domain states while the agent interacts with a changing environment. oASP(MDP) updates previously obtained policies, learnt by means of Reinforcement Learning (RL), using rules that represent the domain changes observed by the agent. These rules represent a set of domain constraints that are processed as ASP programs, reducing the search space. Results show that oASP(MDP) is capable of finding solutions for problems in non-stationary domains without interfering with the action-value function approximation process.
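The idea in the abstract (building the state set during interaction and using observed constraints to shrink the search space, while leaving the value update itself untouched) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the ASP program is replaced here by a plain set of forbidden (state, action) pairs, the learner is ordinary tabular Q-learning, and the grid world, the class `OnlineStateQLearner`, the functions `step` and `run`, and all parameter values are hypothetical names chosen only for illustration.

```python
import random
from collections import defaultdict

# Hypothetical action set for a small grid world (not from the paper).
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


class OnlineStateQLearner:
    """Tabular Q-learning whose state set and constraints grow online.

    Assumption: a Python set of forbidden (state, action) pairs stands in
    for the ASP constraints described in the abstract.
    """

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.states = set()           # states discovered so far
        self.forbidden = set()        # (state, action) pairs observed to do nothing
        self.q = defaultdict(float)   # Q[(state, action)], 0.0 by default

    def allowed(self, state):
        # Constraints only prune the candidate actions.
        return [a for a in ACTIONS if (state, a) not in self.forbidden]

    def choose(self, state):
        self.states.add(state)
        options = self.allowed(state) or list(ACTIONS)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(options)
        return max(options, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s_next):
        self.states.add(s_next)
        if s_next == s:
            # Simplifying assumption: an action with no effect becomes a constraint.
            self.forbidden.add((s, a))
        candidates = self.allowed(s_next) or list(ACTIONS)
        best_next = max(self.q[(s_next, b)] for b in candidates)
        # Standard Q-learning update, unchanged by the constraint handling.
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])


def step(state, action, size=3, walls=frozenset({(1, 0)}), goal=(2, 0)):
    """Hypothetical deterministic grid: blocked moves leave the agent in place."""
    dx, dy = ACTIONS[action]
    nxt = (state[0] + dx, state[1] + dy)
    if not (0 <= nxt[0] < size and 0 <= nxt[1] < size) or nxt in walls:
        nxt = state
    reward = 1.0 if nxt == goal else -0.01
    return nxt, reward, nxt == goal


def run(episodes=200, max_steps=50):
    agent = OnlineStateQLearner()
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(max_steps):
            a = agent.choose(s)
            s_next, r, done = step(s, a)
            agent.update(s, a, r, s_next)
            s = s_next
            if done:
                break
    return agent
```

Calling `run()` returns an agent whose `states` set contains only cells it actually visited and whose `forbidden` set records the moves it observed to be blocked. In a genuinely non-stationary domain these constraints would also need to be revised when the environment changes, which this sketch does not attempt.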
