Simon Malberg, Roman Poletukhin, Carolin M · 2024 · arXiv 2410.15413

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Mitigating Cognitive Bias in RLHF by Altering Rationality

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

Dynamically adjusting beta via LLM-as-judge downweights biased comparisons to learn more rational reward models from flawed human preferences.

Fragile Preferences: A Deep Dive Into Order Effects in Large Language Models

cs.AI · 2025-06-17 · unverdicted · novelty 6.0

LLMs exhibit quality-dependent order biases and name biases in pairwise comparisons that can cause selection of inferior options, demonstrated across resume and color tasks with a new classification of preferences as robust, fragile, or indifferent.

citing papers explorer

Mitigating Cognitive Bias in RLHF by Altering Rationality

cs.AI · 2026-05-07 · unverdicted · none · ref 14

Fragile Preferences: A Deep Dive Into Order Effects in Large Language Models

cs.AI · 2025-06-17 · unverdicted · none · ref 58
