Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language Models
Abstract
Recent advances in large language models (LLMs) have substantially improved single-turn task performance, yet real-world applications increasingly demand sophisticated multi-turn interactions. This survey provides a comprehensive review of recent progress in evaluating and enhancing multi-turn LLM interactions. Centered on a task-oriented taxonomy, spanning instruction following in domains such as mathematics and coding as well as conversational engagement in role-playing, healthcare, education, and adversarial jailbreak settings, we systematically examine the challenges of maintaining context, coherence, fairness, and responsiveness across prolonged dialogues. We organize existing benchmarks and datasets into coherent categories reflecting the evolving landscape of multi-turn dialogue evaluation, and review a broad spectrum of enhancement methodologies, including model-centric strategies (in-context learning, supervised fine-tuning, reinforcement learning, and architectural innovations), external integration approaches (memory augmentation, retrieval-based methods, and knowledge graphs), and agent-based techniques for collaborative interaction. Finally, we identify open challenges and promising directions for future research to further improve the robustness and effectiveness of multi-turn LLM interactions.
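As a rough illustration of the external-integration category the abstract lists (memory augmentation and retrieval-based methods), the Python sketch below stores past turns and retrieves only the most relevant ones before each new prompt. It is not taken from the survey: the TurnMemory class, the word-overlap retriever, and the generate stand-in are illustrative assumptions for any LLM backend.

from collections import Counter

def overlap(a: str, b: str) -> int:
    """Crude lexical similarity: count of shared words (placeholder for embedding similarity)."""
    wa, wb = Counter(a.lower().split()), Counter(b.lower().split())
    return sum((wa & wb).values())

class TurnMemory:
    """Stores past dialogue turns and retrieves the top-k most relevant ones."""
    def __init__(self, top_k: int = 2):
        self.turns = []          # each entry is one "User/Assistant" exchange
        self.top_k = top_k

    def add(self, user: str, assistant: str) -> None:
        self.turns.append(f"User: {user}\nAssistant: {assistant}")

    def retrieve(self, query: str) -> list:
        ranked = sorted(self.turns, key=lambda t: overlap(t, query), reverse=True)
        return ranked[:self.top_k]

def generate(prompt: str) -> str:
    # Stand-in for an actual LLM call (API or local model); hypothetical.
    return f"[model response to: {prompt[-60:]}]"

def chat_turn(memory: TurnMemory, user_msg: str) -> str:
    # Retrieve relevant past exchanges instead of replaying the full history.
    context = "\n\n".join(memory.retrieve(user_msg))
    prompt = f"{context}\n\nUser: {user_msg}\nAssistant:"
    reply = generate(prompt)
    memory.add(user_msg, reply)
    return reply

memory = TurnMemory()
for message in ["Explain attention sinks.", "How do they reduce forgetting?", "Summarize both answers."]:
    print(chat_turn(memory, message))

In practice the word-overlap score would be replaced by an embedding retriever and generate by a real model call; the point is only that retrieval bounds the context passed to each turn instead of letting the full history grow.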
Forward citations
Cited by 6 Pith papers
- SOMA: Efficient Multi-turn LLM Serving via Small Language Model
  SOMA estimates a local response manifold from early turns and adapts a small surrogate model via divergence-maximizing prompts and localized LoRA fine-tuning for efficient multi-turn serving.
- MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks
  MT-JailBench is a modular benchmark that standardizes evaluation of multi-turn jailbreaks to identify key success drivers and enable stronger combined attacks.
- From History to State: Constant-Context Skill Learning for LLM Agents
  Constant-context skill learning trains reusable task-family modules for LLM agents using a deterministic state block for progress tracking and subgoal rewards, achieving 89.6% unseen success on ALFWorld, 76.8% on WebS...
- SinkTrack: Attention Sink based Context Anchoring for Large Language Models
  SinkTrack uses the attention sink at the BOS token to anchor LLMs to the initial context, reducing hallucination and forgetting with reported gains on benchmarks like SQuAD2.0 and M3CoT.
- AFRILANGTUTOR: Advancing Language Tutoring and Culture Education in Low-Resource Languages with Large Language Models
  New dictionary-derived datasets enable fine-tuned LLMs to act as language tutors for ten low-resource African languages, with SFT plus DPO yielding 1.8-15.5% gains on LLM-as-judge metrics.
- Token Statistics Reveal Conversational Drift in Multi-turn LLM Interaction
  Bipredictability from token statistics monitors structural consistency in multi-turn LLM interactions, showing 85% alignment with structure but only 44% with semantics and 100% sensitivity to tested drifts across 4574 turns.