arxiv: 1811.00696 · v1 · pith:L444FFWMnew · submitted 2018-11-02 · 💻 cs.CL

Sequence Generation with Guider Network

Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Liqun Chen, Dinghan Shen, Guoyin Wang, Lawrence Carin

classification 💻 cs.CL

keywords sequence · generation · guider · network · sequence-generation · approach · assist · attention


Sequence generation with reinforcement learning (RL) has received significant attention recently. However, a challenge with such methods is the sparse-reward problem in the RL training process, in which a scalar guiding signal is often only available after an entire sequence has been generated. This type of sparse reward tends to ignore the global structural information of a sequence, causing generation of sequences that are semantically inconsistent. In this paper, we present a model-based RL approach to overcome this issue. Specifically, we propose a novel guider network to model the sequence-generation environment, which can assist next-word prediction and provide intermediate rewards for generator optimization. Extensive experiments show that the proposed method leads to improved performance for both unconditional and conditional sequence-generation tasks.
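The sparse-reward problem described in the abstract can be illustrated with a small sketch. This is not the paper's guider network: the function `returns_to_go` and the reward values below are hypothetical, chosen only to contrast a single end-of-sequence reward with per-step intermediate rewards in a policy-gradient setting.

```python
from typing import List


def returns_to_go(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Return the discounted sum of current and future rewards at each step."""
    out = [0.0] * len(rewards)
    running = 0.0
    # Accumulate from the last token backwards.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        out[t] = running
    return out


# Sparse setting: a scalar reward arrives only after the whole sequence.
sparse_rewards = [0.0, 0.0, 0.0, 1.0]

# Dense setting: hypothetical intermediate rewards at every step
# (illustrative numbers, not values from the paper).
dense_rewards = [0.5, -0.25, 0.25, 0.5]

sparse_returns = returns_to_go(sparse_rewards)
dense_returns = returns_to_go(dense_rewards)

print(sparse_returns)  # [1.0, 1.0, 1.0, 1.0]
print(dense_returns)   # [1.0, 0.5, 0.75, 0.5]
```

With the sparse reward and no discounting, every token receives the same return, so the training signal cannot distinguish which tokens helped. With intermediate rewards the return varies by position, which is the kind of per-step signal the abstract says the guider network is meant to supply.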
