LLMs exhibit confirmation bias in an interactive rule-discovery task; prompting interventions reduce this bias and raise rule-discovery rates from 42% to 56%.
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CL (1)
years: 2026 (1)
verdicts: CONDITIONAL (1)
representative citing papers
- Failing to Falsify: Evaluating and Mitigating Confirmation Bias in Language Models
LLMs exhibit confirmation bias in an interactive rule-discovery task; prompting interventions reduce this bias and raise rule-discovery rates from 42% to 56%.