Model-Based Regularization for Deep Reinforcement Learning with Transcoder Networks

Felix Leibfried; Peter Vrancx

arxiv: 1809.01906 · v2 · pith:CS3DV2TCnew · submitted 2018-09-06 · 💻 cs.LG · stat.ML



keywords: deep · learning · objective · model · model-based · regularization · reinforcement · transcoder


This paper proposes a new optimization objective for value-based deep reinforcement learning. We extend conventional Deep Q-Networks (DQNs) by adding a model-learning component, yielding a transcoder network. The prediction errors of the model are included in the basic DQN loss as additional regularizers. This augmented objective leads to a richer training signal that provides feedback at every time step. Moreover, because learning an environment model shares a common structure with the RL problem, we hypothesize that the resulting objective improves both sample efficiency and performance. We empirically confirm this hypothesis on 20 games from the Atari benchmark, attaining superior results over vanilla DQN without model-based regularization.
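To make the idea of the augmented objective concrete, the following is a minimal sketch of a per-transition loss that adds model prediction errors to a one-step DQN loss. It is not the authors' implementation. The abstract does not specify what the model predicts or how the terms are weighted, so the sketch assumes the model predicts the reward and the next observation, that all errors are squared errors, and that the regularizers enter with weights `lam_reward` and `lam_obs`. The function name `regularized_dqn_loss` and all parameter names are hypothetical.

```python
from typing import Sequence


def squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def regularized_dqn_loss(
    q_taken: float,                  # Q(s, a) for the action actually taken
    reward: float,                   # observed reward r
    max_next_q: float,               # max over a' of the target network's Q(s', a')
    done: bool,                      # True if s' is terminal
    pred_reward: float,              # model's reward prediction (assumed output)
    pred_next_obs: Sequence[float],  # model's next-observation prediction (assumed output)
    next_obs: Sequence[float],       # observed next observation
    gamma: float = 0.99,             # discount factor (illustrative value)
    lam_reward: float = 1.0,         # hypothetical weight of the reward regularizer
    lam_obs: float = 1.0,            # hypothetical weight of the observation regularizer
) -> float:
    """One-step DQN loss plus model prediction errors as regularizers."""
    # Standard one-step temporal-difference target; no bootstrap at terminal states.
    td_target = reward + (0.0 if done else gamma * max_next_q)
    td_loss = (q_taken - td_target) ** 2

    # Model prediction errors. Unlike a sparse reward signal, these terms are
    # nonzero whenever the model mispredicts, so they give feedback on every step.
    reward_loss = (pred_reward - reward) ** 2
    obs_loss = squared_error(pred_next_obs, next_obs)

    return td_loss + lam_reward * reward_loss + lam_obs * obs_loss
```

Setting both weights to zero recovers the plain squared TD error, which is one way to see the extra terms as regularizers on top of the basic DQN loss rather than as a separate objective.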
