Towards understanding sycophancy in language models

· 2024

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Explanation Fairness in Large Language Models: An Empirical Analysis of Disparities in How LLMs Justify Decisions Across Demographic Groups

cs.CL · 2026-05-09 · conditional · novelty 6.0

LLMs produce explanations with significant disparities in verbosity, sentiment, hedging, faithfulness, and lexical complexity across demographic groups, varying by model and only partially mitigated by prompting.

citing papers explorer

Showing 1 of 1 citing paper.

Explanation Fairness in Large Language Models: An Empirical Analysis of Disparities in How LLMs Justify Decisions Across Demographic Groups

cs.CL · 2026-05-09 · conditional · none · ref 15
