
arxiv: 1802.07668 · v1 · pith:MBMUUXSXnew · submitted 2018-02-21 · 🧮 math.OC

A model for system uncertainty in reinforcement learning

classification 🧮 math.OC
keywords: model, framework, learning, reinforcement, considered, control, measure, provides

This work provides a rigorous framework for studying continuous-time control problems in uncertain environments. The framework models uncertainty in the state dynamics as a measure on the space of functions, and this measure changes over time as agents learn their environment. The model can be seen as a variant of either Bayesian reinforcement learning or adaptive control. We study necessary conditions for locally optimal trajectories within this model, in particular deriving an appropriate dynamic programming principle and Hamilton-Jacobi equations. The model provides one possible framework for studying the tradeoff between exploration and exploitation in reinforcement learning.
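
As a rough illustration of a measure over dynamics that changes as an agent learns, the sketch below keeps a discrete probability measure on a small, finite family of candidate drift functions and reweights it by Bayes' rule after each observed state increment. This is not the paper's construction: the finite family, the Gaussian noise model, and the names `update_measure`, `candidates`, and `sigma` are all assumptions made only for this example.

```python
import math

def update_measure(weights, candidates, x, dx, dt, sigma=0.1):
    """One Bayes step on a discrete measure over candidate drift functions.

    Assumed (hypothetical) observation model: dx ~ Normal(f(x) * dt, sigma**2 * dt).
    """
    log_post = []
    for w, f in zip(weights, candidates):
        resid = dx - f(x) * dt
        log_lik = -0.5 * resid ** 2 / (sigma ** 2 * dt)
        log_prior = math.log(w) if w > 0 else float("-inf")
        log_post.append(log_prior + log_lik)
    # Normalise in log space to avoid underflow.
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Three hypothetical candidate dynamics; the first one generates the data.
candidates = [lambda x: -x, lambda x: 0.0, lambda x: x]
weights = [1 / 3, 1 / 3, 1 / 3]

x, dt = 1.0, 0.01
for _ in range(200):
    dx = candidates[0](x) * dt  # noise-free increment, kept deterministic on purpose
    weights = update_measure(weights, candidates, x, dx, dt)
    x += dx
```

After the loop, nearly all of the mass in `weights` sits on the first candidate. In a control setting the agent's choice of action would also influence which increments it observes, which is where the exploration versus exploitation tradeoff mentioned in the abstract enters.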
