A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies

Jost Schatzmann, Karl Weilhammer, Matt Stuttle, Steve Young · 2006 · DOI 10.1017/s0269888906000944

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Reinforcing Human Behavior Simulation via Verbal Feedback

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

DITTO uses RL with verbal feedback to train LLMs for human behavior simulation, reporting 36% average gains over base models and outperforming GPT-5.4 on 6 of 10 SOUL benchmark tasks.

citing papers explorer

Showing 1 of 1 citing paper.

Reinforcing Human Behavior Simulation via Verbal Feedback

cs.LG · 2026-05-19 · unverdicted · none · ref 30
