A User Simulator for Task-Completion Dialogues

Xiujun Li, Zachary C Lipton, Bhuwan Dhingra, Lihong Li, Jianfeng Gao, Yun-Nung Chen · 2016 · cs.LG · arXiv 1612.05688

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

Despite widespread interest in reinforcement learning for task-oriented dialogue systems, several obstacles can frustrate research and development progress. First, reinforcement learners typically require interaction with the environment, so conventional dialogue corpora cannot be used directly. Second, each task presents specific challenges, requiring a separate corpus of task-specific annotated data. Third, collecting and annotating human-machine or human-human conversations for task-oriented dialogues requires extensive domain knowledge. Because building an appropriate dataset can be both financially costly and time-consuming, one popular approach is to build a user simulator based upon a corpus of example dialogues. Then, one can train reinforcement learning agents in an online fashion as they interact with the simulator. Dialogue agents trained on these simulators can serve as an effective starting point. Once agents master the simulator, they may be deployed in a real environment to interact with humans, and continue to be trained online. To ease empirical algorithmic comparisons in dialogues, this paper introduces a new, publicly available simulation framework, where our simulator, designed for the movie-booking domain, leverages both rules and collected data. The simulator supports two tasks: movie ticket booking and movie seeking. Finally, we demonstrate several agents and detail the procedure to add and test your own agent in the proposed framework.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Reinforcing Human Behavior Simulation via Verbal Feedback

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

DITTO uses RL with verbal feedback to train LLMs for human behavior simulation, reporting 36% average gains over base models and outperforming GPT-5.4 on 6 of 10 SOUL benchmark tasks.

Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts

cs.CL · 2026-04-03 · unverdicted · novelty 5.0

Mainstream conversational models show escalating affective misalignments and ethical guidance failures during staged emotional trajectories, organized into a taxonomy of interactional breakdowns.

Survey on reinforcement learning for language processing

cs.CL · 2021-04-12 · unverdicted · novelty 2.0

This survey reviews reinforcement learning applications to natural language processing problems, especially conversational systems, including problem descriptions, suitability of RL, advantages, limitations, and promising directions.

citing papers explorer

Showing 3 of 3 citing papers.

Reinforcing Human Behavior Simulation via Verbal Feedback · cs.LG · 2026-05-19 · unverdicted · none · ref 89 · internal anchor
DITTO uses RL with verbal feedback to train LLMs for human behavior simulation, reporting 36% average gains over base models and outperforming GPT-5.4 on 6 of 10 SOUL benchmark tasks.

Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts · cs.CL · 2026-04-03 · unverdicted · none · ref 18
Mainstream conversational models show escalating affective misalignments and ethical guidance failures during staged emotional trajectories, organized into a taxonomy of interactional breakdowns.

Survey on reinforcement learning for language processing · cs.CL · 2021-04-12 · unverdicted · none · ref 78 · internal anchor
This survey reviews reinforcement learning applications to natural language processing problems, especially conversational systems, including problem descriptions, suitability of RL, advantages, limitations, and promising directions.
