arXiv preprint

Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback · 2023 · arXiv 2305.10142

15 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems

cs.AI · 2026-05-14 · unverdicted · novelty 7.0 · 2 refs

A survey that unifies prior work on multi-agent LLM systems via the LIFE framework, mapping dependencies across collaboration, failure attribution, and autonomous self-evolution while identifying cross-stage challenges.

The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment

cs.CL · 2026-05-08 · unverdicted · novelty 7.0

An AI-agent social platform generated mostly neutral content whose use in fine-tuning reduced model truthfulness comparably to human Reddit data, suggesting limited unique harm but flagging tail risks like secret leaks.

METRO: Towards Strategy Induction from Expert Dialogue Transcripts for Non-collaborative Dialogues

cs.CL · 2026-04-13 · unverdicted · novelty 7.0

METRO induces both short-term actions and long-term planning from expert transcripts into a Strategy Forest, outperforming prior methods by 9-10% on two non-collaborative dialogue benchmarks.

Stay Focused: Problem Drift in Multi-Agent Debate

cs.CL · 2025-02-26 · unverdicted · novelty 7.0

The paper defines and measures 'problem drift' in multi-agent LLM debates across tasks and proposes DRIFTJudge and DRIFTPolicy as baselines to detect and reduce it.

Anchor-and-Resume Concession Under Dynamic Pricing for LLM-Augmented Freight Negotiation

cs.MA · 2026-04-22 · unverdicted · novelty 6.0

Anchor-and-resume with spread-derived beta allows adaptive monotonic concessions in freight negotiations, achieving LLM-like performance with lower cost and higher transparency.

Understanding the Mechanism of Altruism in Large Language Models

econ.GN · 2026-04-21 · unverdicted · novelty 6.0

A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.

Towards Proactive Information Probing: Customer Service Chatbots Harvesting Value from Conversation

cs.AI · 2026-04-13 · unverdicted · novelty 6.0

PROCHATIP is a chatbot system that learns to probe conversations for target business information at the right times, outperforming baselines in both data collection and user experience.

Scheming Ability in LLM-to-LLM Strategic Interactions

cs.CL · 2025-10-11 · conditional · novelty 6.0

Frontier LLMs exhibit high scheming propensity in Cheap Talk signaling and Peer Evaluation games, achieving 95-100% success rates when choosing to deceive and 100% deception choice in one setup even without prompting.

Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

cs.AI · 2024-08-01 · conditional · novelty 6.0

Empirical analysis shows scaling inference compute via strategies like tree search can be more efficient than scaling model parameters, with 7B models plus novel search outperforming 34B models.

Preference Estimation via Opponent Modeling in Multi-Agent Negotiation

cs.CL · 2026-04-17 · unverdicted · novelty 5.0

A method combining LLM-extracted qualitative cues with Bayesian belief tracking improves full agreement rates and preference estimation accuracy in multi-agent negotiations.

MAC: Masked Agent Collaboration Boosts Large Language Model Medical Decision-Making

cs.AI · 2025-07-25 · unverdicted · novelty 5.0

MAC framework selects Pareto-optimal LLM agents and masks low cross-consistency outputs for adaptive collaboration in medical decision-making.

TrustLLM: Trustworthiness in Large Language Models

cs.CL · 2024-01-10 · unverdicted · novelty 5.0

TrustLLM defines eight trustworthiness principles, creates a six-dimension benchmark, and evaluates 16 LLMs showing proprietary models generally lead but some open-source ones are close while over-calibration can hurt utility.

Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate

cs.CL · 2023-05-30 · conditional · novelty 5.0

Multi-agent debate with tit-for-tat arguments and a judge LLM improves reasoning by preventing LLMs from locking into incorrect initial solutions.

PRISMA: Preference-Reinforced Self-Training Approach for Interpretable Emotionally Intelligent Negotiation Dialogues

cs.CL · 2026-04-20 · unverdicted · novelty 4.0

PRISMA augments self-training with direct preference optimization and an emotion-aware negotiation strategy chain-of-thought to produce more interpretable and effective negotiation dialogues on two new datasets.

The Rise and Potential of Large Language Model Based Agents: A Survey

cs.AI · 2023-09-14 · accept · novelty 4.0

The paper surveys the origins, frameworks, applications, and open challenges of AI agents built on large language models.

citing papers explorer

Showing 15 of 15 citing papers.

Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems

cs.AI · 2026-05-14 · unverdicted · none · ref 285 · 2 links

The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment

cs.CL · 2026-05-08 · unverdicted · none · ref 53

METRO: Towards Strategy Induction from Expert Dialogue Transcripts for Non-collaborative Dialogues

cs.CL · 2026-04-13 · unverdicted · none · ref 18

Stay Focused: Problem Drift in Multi-Agent Debate

cs.CL · 2025-02-26 · unverdicted · none · ref 1

Anchor-and-Resume Concession Under Dynamic Pricing for LLM-Augmented Freight Negotiation

cs.MA · 2026-04-22 · unverdicted · none · ref 6

Understanding the Mechanism of Altruism in Large Language Models

econ.GN · 2026-04-21 · unverdicted · none · ref 261

Towards Proactive Information Probing: Customer Service Chatbots Harvesting Value from Conversation

cs.AI · 2026-04-13 · unverdicted · none · ref 3

Scheming Ability in LLM-to-LLM Strategic Interactions

cs.CL · 2025-10-11 · conditional · none · ref 20

Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

cs.AI · 2024-08-01 · conditional · none · ref 285

Preference Estimation via Opponent Modeling in Multi-Agent Negotiation

cs.CL · 2026-04-17 · unverdicted · none · ref 2

MAC: Masked Agent Collaboration Boosts Large Language Model Medical Decision-Making

cs.AI · 2025-07-25 · unverdicted · none · ref 35

TrustLLM: Trustworthiness in Large Language Models

cs.CL · 2024-01-10 · unverdicted · none · ref 44

Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate

cs.CL · 2023-05-30 · conditional · none · ref 54

PRISMA: Preference-Reinforced Self-Training Approach for Interpretable Emotionally Intelligent Negotiation Dialogues

cs.CL · 2026-04-20 · unverdicted · none · ref 4

The Rise and Potential of Large Language Model Based Agents: A Survey

cs.AI · 2023-09-14 · accept · none · ref 130
