DIAL: Direct Iterative Adversarial Learning for Realistic Multi-Turn Dialogue Simulation

· 2025 · cs.CL · arXiv 2512.20773

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

open full Pith review browse 1 citing papers arXiv PDF

abstract

Realistic user simulation is crucial for training and evaluating multi-turn dialogue systems, yet creating simulators that accurately replicate human behavior remains a significant challenge. An effective simulator must expose the failure modes of the systems under evaluation. This work introduces Direct Iterative Adversarial Learning (DIAL), an adversarial framework that iteratively enhances user simulator realism through a competitive dynamic between a generator (user simulator) and a discriminator. When applied to mental health support, a domain characterized by diverse failure types and a critical dependence on realistic user behavior for failure detection, DIAL restores lexical diversity diminished by supervised fine-tuning and drastically reduces discriminator accuracy. The resulting simulator exhibits a strong correlation between simulated and real failure occurrence rates while maintaining low distributional divergence of failure modes. These findings indicate that DIAL is a promising method for developing realistic user simulators in multi-turn dialogue, facilitating reliable and cost-effective system evaluation prior to deployment.

representative citing papers

PARALLAX: Separating Genuine Hallucination Detection from Benchmark Construction Artifacts

cs.CL · 2026-05-16 · unverdicted · novelty 6.0

Benchmark construction artifacts in hallucination detection corpora allow naive text-similarity baselines to achieve near-perfect scores, and controlled evaluations show most methods perform near chance except SAPLMA and the new DRIFT probe.

citing papers explorer

Showing 1 of 1 citing paper.

PARALLAX: Separating Genuine Hallucination Detection from Benchmark Construction Artifacts cs.CL · 2026-05-16 · unverdicted · none · ref 7 · internal anchor
Benchmark construction artifacts in hallucination detection corpora allow naive text-similarity baselines to achieve near-perfect scores, and controlled evaluations show most methods perform near chance except SAPLMA and the new DRIFT probe.

DIAL: Direct Iterative Adversarial Learning for Realistic Multi-Turn Dialogue Simulation

fields

years

verdicts

representative citing papers

citing papers explorer