LLMs exhibit identity-dependent hedging on human rights questions, with group identity as the strongest predictor among tested factors, and group steering mitigates the disparity.
In Proceedings of the AAAI Conference on Artificial Intelligence , Vol
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CY 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Hedging and Non-Affirmation: Quantifying LLM Alignment on Questions of Human Rights
LLMs exhibit identity-dependent hedging on human rights questions, with group identity as the strongest predictor among tested factors, and group steering mitigates the disparity.