Agreeableness in AI personas reliably predicts sycophantic behavior in 9 of 13 tested language models.
Role-playing evaluation for large language models, 2025
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 1
citation-polarity summary
fields
cs.CL 2
verdicts
UNVERDICTED 2
roles
background 1
polarities
unclear 1
representative citing papers
LLM moral robustness under persona role-play is largely determined by model family with Claude models most consistent, while susceptibility shows little family dependence.
citing papers explorer
- Too Nice to Tell the Truth: Quantifying Agreeableness-Driven Sycophancy in Role-Playing Language Models
  Agreeableness in AI personas reliably predicts sycophantic behavior in 9 of 13 tested language models.
- Moral Susceptibility and Robustness under Persona Role-Play in Large Language Models
  LLM moral robustness under persona role-play is largely determined by model family with Claude models most consistent, while susceptibility shows little family dependence.