LLMs exhibit identity-dependent hedging on human rights questions, with group identity as the strongest predictor among tested factors, and group steering mitigates the disparity.
Title resolution pending
3 Pith papers cite this work. Polarity classification is still being indexed.
Verdicts: UNVERDICTED 3
Roles: background 1
Polarities: unclear 1
citing papers explorer
- Hedging and Non-Affirmation: Quantifying LLM Alignment on Questions of Human Rights
  LLMs exhibit identity-dependent hedging on human rights questions, with group identity as the strongest predictor among tested factors, and group steering mitigates the disparity.
- Mechanism Plausibility in Generative Agent-Based Modeling
  Introduces the Mechanism Plausibility Scale, a four-level framework separating generative sufficiency from mechanistic plausibility in LLM-based agent-based models.
- "I Don't Know" -- Towards Appropriate Trust with Certainty-Aware Retrieval Augmented Generation
  CERTA adds relevance-based certainty estimation to RAG so LLMs can better signal uncertainty on non-objective questions, reducing overconfidence.