LLMs produce explanations with significant disparities in verbosity, sentiment, hedging, faithfulness, and lexical complexity across demographic groups, varying by model and only partially mitigated by prompting.
Towards under- standing sycophancy in language models
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
Explanation Fairness in Large Language Models: An Empirical Analysis of Disparities in How LLMs Justify Decisions Across Demographic Groups
LLMs produce explanations with significant disparities in verbosity, sentiment, hedging, faithfulness, and lexical complexity across demographic groups, varying by model and only partially mitigated by prompting.