pith. machine review for the scientific record.

arxiv: 2510.07517 · v5 · submitted 2025-10-08 · 💻 cs.AI · cs.MA

Recognition: unknown

When Identity Skews Debate: Anonymization for Bias-Reduced Multi-Agent Reasoning

Authors on Pith: no claims yet
classification: 💻 cs.AI, cs.MA

keywords: identity bias, debate, agents, peer, self-bias, sycophancy
original abstract

Multi-agent debate (MAD) aims to improve large language model (LLM) reasoning by letting multiple agents exchange answers and then aggregate their opinions. Yet recent studies reveal that agents are not neutral: they are prone to identity-driven sycophancy and self-bias, uncritically adopting a peer's view or stubbornly adhering to their own prior output, undermining the reliability of debate. In this work, we present the first principled framework that jointly quantifies and mitigates both forms of identity bias, sycophancy and self-bias, in MAD. First, we formalize the debate dynamics as an identity-weighted Bayesian update process. Second, we propose response anonymization: by removing identity markers from prompts, agents cannot distinguish "self" from "peer", which forces equal weights on agent identity, thereby reducing bias and improving trustworthiness. Third, we define the Identity Bias Coefficient (IBC), a principled bias metric that measures an agent's tendency to follow its peer versus itself. Empirical studies across multiple models and benchmarks confirm that identity bias is widespread, with sycophancy far more common than self-bias. Our findings highlight the need to ensure that MAD systems reason based on content rather than identity. Code is released at https://github.com/deeplearning-wisc/MAD-identity-bias.
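The two mechanisms the abstract names, response anonymization and the Identity Bias Coefficient, can be sketched in a few lines. This is a hypothetical reading, not the paper's released code: the exact IBC formula, the identity-marker format (`Agent A` / `Agent B` tags), and the neutral replacement labels are all assumptions here; a minimal sketch would be:

```python
def identity_bias_coefficient(outcomes):
    """Score an agent's identity bias from disagreement rounds.

    outcomes: list of 'peer' (agent adopted the peer's answer) or
    'self' (agent kept its own answer), one entry per debate round
    in which the agent and its peer disagreed.

    Returns a value in [-1, 1] under this sketch's convention:
    +1 = always follows the peer (pure sycophancy),
    -1 = always keeps its own answer (pure self-bias),
     0 = identity-neutral (switches half the time).
    """
    if not outcomes:
        return 0.0
    follow_peer = sum(1 for o in outcomes if o == "peer")
    return 2 * follow_peer / len(outcomes) - 1


def anonymize(prompt, self_tag="Agent A", peer_tag="Agent B"):
    """Strip identity markers so the agent cannot tell self from peer.

    The tag names are illustrative; a real pipeline would replace
    whatever speaker labels its debate prompts actually use.
    """
    return (prompt
            .replace(self_tag, "Response 1")
            .replace(peer_tag, "Response 2"))
```

With this convention, an agent that switched to its peer in 3 of 4 disagreement rounds scores `identity_bias_coefficient(["peer", "peer", "self", "peer"]) == 0.5`, a sycophancy-leaning value, matching the paper's finding that sycophancy is the more common failure mode.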

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Peer Identity Bias in Multi-Agent LLM Evaluation: An Empirical Study Using the TRUST Democratic Discourse Analysis Pipeline

    cs.CY 2026-04 unverdicted novelty 7.0

    Single-channel anonymization hides identity bias via cancellation effects, but full-pipeline anonymization reveals that homogeneous ensembles amplify sycophancy while heterogeneous ones reduce it, with one model showi...

  2. Epistemic Blinding: An Inference-Time Protocol for Auditing Prior Contamination in LLM-Assisted Analysis

    cs.AI 2026-04 unverdicted novelty 7.0

    Epistemic blinding is an inference-time protocol that anonymizes entity identifiers to measure and audit how much LLM outputs in agentic systems draw from parametric knowledge versus provided data.

  3. The Reasoning Trap: An Information-Theoretic Bound on Closed-System Multi-Step LLM Reasoning

    cs.CL 2026-05 unverdicted novelty 6.0

    Closed-system multi-step LLM reasoning is subject to an information-theoretic bound where mutual information with evidence decreases, preserving accuracy while eroding faithfulness, with EGSR recovering it on SciFact ...

  4. Fairness in Multi-Agent Systems for Software Engineering: An SDLC-Oriented Rapid Review

    cs.SE 2026-04 unverdicted novelty 2.0

    A rapid review of fairness in LLM-enabled multi-agent systems for the software development lifecycle concludes that the field lacks standardized evaluations, broad coverage, and effective governance, leaving it unprep...