Silencer: From Discovery to Mitigation of Self-Bias in LLM-as-Benchmark-Generator

Yuan, P · 2025 · arXiv 2505.20738

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Evaluating Commercial AI Chatbots as News Intermediaries

cs.CL · 2026-05-21 · conditional · novelty 7.0

Commercial AI chatbots reach over 90% multiple-choice accuracy on recent news facts but lose 11-17% in free response and drop to 19-70% on subtle false-premise questions, with retrieval failures causing most errors and clear Anglophone bias.

When Identity Skews Debate: Anonymization for Bias-Reduced Multi-Agent Reasoning

cs.AI · 2025-10-08 · unverdicted · novelty 7.0

Anonymization in multi-agent debate reduces identity bias by equalizing self and peer weights in a Bayesian update model, quantified by the Identity Bias Coefficient.

citing papers explorer

Showing 2 of 2 citing papers.

Evaluating Commercial AI Chatbots as News Intermediaries · cs.CL · 2026-05-21 · conditional · none · ref 68
When Identity Skews Debate: Anonymization for Bias-Reduced Multi-Agent Reasoning · cs.AI · 2025-10-08 · unverdicted · none · ref 15
