Cognitive bias in high-stakes decision-making with LLMs
5 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 5
representative citing papers
Dynamically adjusting beta via LLM-as-judge downweights biased comparisons to learn more rational reward models from flawed human preferences.
Numeric anchors embedded in images systematically bias VLM quality judgments more than severe visual degradation, with layer-wise probing showing that anchor-saturated layers are suboptimal for quality prediction.
LLMs exhibit persistent inertia in value orientations, with harm avoidance and fairness remaining skewed across persona prompts.
A literature review that categorizes bias in LLMs, surveys evaluation and mitigation techniques, and discusses ethical implications.
citing papers explorer
- AMEL: Accumulated Message Effects on LLM Judgments
  LLMs exhibit an accumulated message effect where conversation history polarity biases subsequent judgments, stronger for high-entropy items, independent of context length, and with a negativity bias.
- Mitigating Cognitive Bias in RLHF by Altering Rationality
  Dynamically adjusting beta via LLM-as-judge downweights biased comparisons to learn more rational reward models from flawed human preferences.
- Don't Look at the Numbers: Visual Anchoring Bias and Layer-wise Representation in VLMs
  Numeric anchors embedded in images systematically bias VLM quality judgments more than severe visual degradation, with layer-wise probing showing that anchor-saturated layers are suboptimal for quality prediction.