pith. machine review for the scientific record. sign in

arxiv: 1611.09823 · v3 · submitted 2016-11-29 · 💻 cs.AI · cs.CL

Recognition: unknown

Dialogue Learning With Human-In-The-Loop

Authors on Pith no claims yet
classification 💻 cs.AI cs.CL
keywords learningabilitydialogueagentsapproachaspectaspectsbuild
0
0 comments X
read the original abstract

An important aspect of developing conversational agents is to give a bot the ability to improve through communicating with humans and to learn from the mistakes that it makes. Most research has focused on learning from fixed training sets of labeled data rather than interacting with a dialogue partner in an online fashion. In this paper we explore this direction in a reinforcement learning setting where the bot improves its question-answering ability from feedback a teacher gives following its generated responses. We build a simulator that tests various aspects of such learning in a synthetic environment, and introduce models that work in this regime. Finally, real experiments with Mechanical Turk validate the approach.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Fine-Tuning Language Models from Human Preferences

    cs.CL 2019-09 unverdicted novelty 7.0

    Language models fine-tuned via RL on 5k-60k human preference comparisons produce stylistically better text continuations and human-preferred summaries that sometimes copy input sentences.