arxiv: 1606.01541 · v4 · pith:ZKNN3NXXnew · submitted 2016-06-05 · 💻 cs.CL

Deep Reinforcement Learning for Dialogue Generation

classification 💻 cs.CL
keywords dialogue, learning, model, conversational, dialogues, future, reinforcement, agents

Recent neural models of dialogue generation offer great promise for generating responses for conversational agents, but tend to be shortsighted, predicting utterances one at a time while ignoring their influence on future outcomes. Modeling the future direction of a dialogue is crucial to generating coherent, interesting dialogues, a need which led traditional NLP models of dialogue to draw on reinforcement learning. In this paper, we show how to integrate these goals, applying deep reinforcement learning to model future reward in chatbot dialogue. The model simulates dialogues between two virtual agents, using policy gradient methods to reward sequences that display three useful conversational properties: informativity (non-repetitive turns), coherence, and ease of answering (related to forward-looking function). We evaluate our model on diversity, length as well as with human judges, showing that the proposed algorithm generates more interactive responses and manages to foster a more sustained conversation in dialogue simulation. This work marks a first step towards learning a neural conversational model based on the long-term success of dialogues.
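
The policy-gradient idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not give the reward formulas or how the three properties are combined, so the equal-weight sum in `combined_reward`, the fixed candidate set, the numeric scores, and every function name below are assumptions made for illustration. The update rule is plain REINFORCE on a softmax policy, standing in for whichever policy gradient variant the authors use.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_reward(ease, informativity, coherence, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of three per-response scores.

    The equal weights are a placeholder assumption, not values from the paper.
    """
    w_ease, w_info, w_coh = weights
    return w_ease * ease + w_info * informativity + w_coh * coherence


def policy_gradient_update(logits, action, reward, lr=0.1):
    """One REINFORCE step for a softmax policy over a fixed candidate set.

    For a softmax policy, d log pi(action) / d logit_k = 1[k == action] - pi(k),
    so each logit moves by lr * reward * that gradient.
    """
    probs = softmax(logits)
    return [
        logit + lr * reward * ((1.0 if k == action else 0.0) - probs[k])
        for k, logit in enumerate(logits)
    ]


def train(rewards, steps=2000, lr=0.1, seed=0):
    """Sample a candidate, observe its reward, update the policy; repeat."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(logits)), weights=probs)[0]
        logits = policy_gradient_update(logits, action, rewards[action], lr)
    return softmax(logits)


if __name__ == "__main__":
    # Hypothetical scores for three candidate responses:
    # (ease of answering, informativity, coherence), each in [0, 1].
    candidates = [(0.1, 0.1, 0.1), (0.9, 0.8, 0.9), (0.4, 0.5, 0.3)]
    rewards = [combined_reward(*scores) for scores in candidates]
    print(train(rewards))
```

In this sketch each candidate has a fixed scalar reward. The abstract describes rewarding whole simulated dialogues between two agents, so a fuller version would score sequences of turns rather than single responses.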

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Ranking sentences from product description & bullets for better search

    cs.IR 2019-07 unverdicted novelty 4.0

    Two RL-based extractive summarization models rank sentences from product fields by leveraging titles and click-through logs to improve search relevance.

  2. Deep Reinforcement Learning for Personalized Search Story Recommendation

    cs.LG 2019-07 unverdicted novelty 3.0

    A deep RL architecture using imitation learning and reinforcement learning is proposed to model immediate and future values of search story recommendations in a Markov decision process framework.