From Debate to Decision: Conformal Social Choice for Safe Multi-Agent Deliberation

2026 · cs.AI · arXiv 2604.07667

3 Pith papers cite this work. Polarity classification is still indexing.


abstract

Multi-agent debate improves LLM reasoning, yet agreement among agents is not evidence of correctness. When agents converge on a wrong answer through social reinforcement, consensus-based stopping commits that error to an automated action with no recourse. We introduce Conformal Social Choice, a post-hoc decision layer that converts debate outputs into calibrated act-versus-escalate decisions. Verbalized probability distributions from heterogeneous agents are aggregated via a linear opinion pool and calibrated with split conformal prediction, yielding prediction sets with a marginal coverage guarantee: the correct answer is included with probability ${\geq}\,1{-}\alpha$, without assumptions on individual model calibration. A hierarchical action policy maps singleton sets to autonomous action and larger sets to human escalation. On eight MMLU-Pro domains with three agents (Claude Haiku, DeepSeek-R1, Qwen-3 32B), coverage stays within 1--2 points of the target. The key finding is not that debate becomes more accurate, but that the conformal layer makes its failures actionable: 81.9% of wrong-consensus cases are intercepted at $\alpha{=}0.05$. Because the layer refuses to act on cases where debate is confidently wrong, the remaining conformal singletons reach 90.0--96.8% accuracy (up to 22.1pp above consensus stopping) -- a selection effect, not a reasoning improvement. This safety comes at the cost of automation, but the operating point is user-adjustable via $\alpha$.
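The pipeline described in the abstract (pool, calibrate, then act or escalate) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the pool weights or the nonconformity score, so the sketch assumes equal weights and the common score of one minus the pooled probability of the true label. All function names are hypothetical.

```python
import math

def linear_opinion_pool(agent_probs, weights=None):
    """Weighted average of per-agent probability vectors over the same options."""
    k = len(agent_probs)
    weights = weights or [1.0 / k] * k  # assumption: equal weights
    n_opts = len(agent_probs[0])
    return [sum(w * p[j] for w, p in zip(weights, agent_probs))
            for j in range(n_opts)]

def conformal_threshold(cal_scores, alpha):
    """Split-conformal quantile: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: every option is kept.
        return float("inf")
    return sorted(cal_scores)[rank - 1]

def prediction_set(pooled, q_hat):
    """Keep every option whose score (1 - pooled prob, assumed) is <= q_hat."""
    return [j for j, p in enumerate(pooled) if 1.0 - p <= q_hat]

def decide(pred_set):
    """Singleton set -> act autonomously; anything else -> escalate to a human."""
    if len(pred_set) == 1:
        return ("act", pred_set[0])
    return ("escalate", pred_set)

# Hypothetical verbalized distributions from three agents over three options.
agents = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.9, 0.05, 0.05]]
pooled = linear_opinion_pool(agents)

# Hypothetical calibration scores: 1 - pooled probability of the true answer
# on held-out questions with known labels.
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
q_hat = conformal_threshold(cal_scores, alpha=0.25)

decision = decide(prediction_set(pooled, q_hat))
```

Lowering `alpha` raises the threshold, which enlarges the prediction sets and so trades automation for coverage, matching the user-adjustable operating point the abstract mentions. In this sketch an empty prediction set is also escalated, which is an assumption rather than something the abstract states.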

representative citing papers

Cherry-pick Override: Unsafe Directional Commitment in LLM Judges under Mixed Evidence

cs.SE · 2026-06-05 · unverdicted · novelty 7.0

The paper defines Cherry-pick Override (CCO) as unauthorized directional commitment by LLM judges under mixed evidence and quantifies its prevalence (>84% on AVeriTeC conflicting subset) while testing intervention ladders and a two-channel reference probe.

Governance Gaps in Agent Interoperability Protocols: What MCP, A2A, and ACP Cannot Express

cs.MA · 2026-06-30 · unverdicted · novelty 5.0

Gap analysis of MCP, A2A, ACP, ANP, and ERC-8004 shows none support the full set of membership, deliberation, voting, dissent, escalation, and audit primitives required for governed agent communities.

When Does Delegation Beat Majority? A Delegation-Based Aggregator for Multi-Sample LLM Inference

cs.AI · 2026-06-06 · unverdicted · novelty 5.0

PPV delegation using letter entropy and per-question embedding cosine beats majority voting by 1.5 pp overall on MMLU-Pro in an unsupervised setting.

