Decoupling Strategy and Generation in Negotiation Dialogues

He He , Derek Chen , Anusha Balakrishnan , Percy Liang

Authors on Pith no claims yet

classification 💻 cs.CL

keywords strategynegotiationgenerationlearningagentsapproachproposereinforcement

read the original abstract

We consider negotiation settings in which two agents use natural language to bargain on goods. Agents need to decide on both high-level strategy (e.g., proposing \$50) and the execution of that strategy (e.g., generating "The bike is brand new. Selling for just \$50."). Recent work on negotiation trains neural models, but their end-to-end nature makes it hard to control their strategy, and reinforcement learning tends to lead to degenerate solutions. In this paper, we propose a modular approach based on coarse di- alogue acts (e.g., propose(price=50)) that decouples strategy and generation. We show that we can flexibly set the strategy using supervised learning, reinforcement learning, or domain-specific knowledge without degeneracy, while our retrieval-based generation can maintain context-awareness and produce diverse utterances. We test our approach on the recently proposed DEALORNODEAL game, and we also collect a richer dataset based on real items on Craigslist. Human evaluation shows that our systems achieve higher task success rate and more human-like negotiation behavior than previous approaches.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Benchmark for Multi-Party Negotiation Games from Real Negotiation Data
cs.MA 2026-03 accept novelty 7.0

A new benchmark for sequential multi-party negotiations from climate data shows no solver dominates and performance depends on game structure.
$\tau^2$-Bench: Evaluating Conversational Agents in a Dual-Control Environment
cs.AI 2025-06 unverdicted novelty 7.0

τ²-bench provides a Dec-POMDP-based telecom domain with compositional task generation and a tool-constrained user simulator to measure agent performance drops in dual-control versus single-control settings.
$\tau$-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
cs.AI 2024-06 unverdicted novelty 7.0

τ-bench shows state-of-the-art agents like GPT-4o succeed on under 50% of tool-using, rule-following tasks and are inconsistent across repeated trials.
SOMA: Efficient Multi-turn LLM Serving via Small Language Model
cs.CL 2026-05 unverdicted novelty 6.0

SOMA estimates a local response manifold from early turns and adapts a small surrogate model via divergence-maximizing prompts and localized LoRA fine-tuning for efficient multi-turn serving.