Multidimensional IRT for forced choice tests. Heliyon, 10(9):e20915, 2024.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CL (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Beacon: Single-Turn Diagnosis and Mitigation of Latent Sycophancy in Large Language Models
Beacon is a new single-turn benchmark that measures latent sycophancy in LLMs, showing it decomposes into linguistic and affective sub-biases that scale with model capacity and can be modulated by prompt and activation interventions.