From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction

Eduard Hovy; Qizhe Xie; Zihang Dai

arxiv: 1804.10974 · v1 · pith:Y5QLVAJLnew · submitted 2018-04-29 · cs.CL · cs.LG · stat.ML


classification: cs.CL, cs.LG, stat.ML

keywords: RAML, algorithms, assignment, credit, entropy, prediction, sequence, actor-critic


In this work, we study the credit assignment problem in reward-augmented maximum likelihood (RAML) learning and establish a theoretical equivalence between the token-level counterpart of RAML and entropy-regularized reinforcement learning. Inspired by this connection, we propose two sequence prediction algorithms: one extends RAML with fine-grained credit assignment, and the other improves Actor-Critic with systematic entropy regularization. On two benchmark datasets, we show that the proposed algorithms outperform RAML and Actor-Critic, respectively, providing new alternatives for sequence prediction.
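As background for the connection the abstract mentions, the following is a minimal sketch of the well-known sequence-level version of that link, not the token-level result or either algorithm proposed in the paper. It assumes a small finite candidate set with given rewards. Under that assumption, the RAML target distribution q(y) ∝ exp(R(y)/τ) is also the maximizer of the entropy-regularized objective E_p[R] + τ·H(p). All function names and the toy rewards are hypothetical and chosen for illustration only.

```python
import math


def exponentiated_payoff(rewards, tau):
    """Return q with q[i] proportional to exp(rewards[i] / tau).

    This is the RAML-style target distribution over a finite candidate
    set (an assumption of this sketch; real output spaces are far larger).
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    m = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - m) / tau) for r in rewards]
    z = sum(weights)
    return [w / z for w in weights]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def raml_loss(log_probs, rewards, tau):
    """Cross-entropy between the payoff distribution and the model.

    log_probs[i] is the model log-probability of candidate i.
    """
    q = exponentiated_payoff(rewards, tau)
    return -sum(qi * lp for qi, lp in zip(q, log_probs))


def entropy_regularized_objective(probs, rewards, tau):
    """Expected reward plus tau times the entropy of the policy."""
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    return expected_reward + tau * entropy(probs)


if __name__ == "__main__":
    rewards = [1.0, 0.5, 0.0]  # toy rewards, hypothetical
    tau = 0.5
    q = exponentiated_payoff(rewards, tau)
    uniform = [1.0 / len(rewards)] * len(rewards)
    print("payoff distribution:", q)
    print("objective at q:      ", entropy_regularized_objective(q, rewards, tau))
    print("objective at uniform:", entropy_regularized_objective(uniform, rewards, tau))
```

In this toy setting, the objective evaluated at q equals τ·log Σ exp(R/τ), which is its maximum value, so a model that matches the RAML target also solves the one-step entropy-regularized problem.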
