AgentFairBench is a multi-domain benchmark for demographic disparity in LLM agent actions, with a pilot showing no significant effect for Claude Haiku 4.5 after arity-matched noise correction.
Measuring Gender and Racial Biases in Large Language Models: Intersectional Evidence from Automated Resume Evaluation. PNAS Nexus, 4(3), 2025.
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts: UNVERDICTED 6
roles: background 1
polarities: support 1
representative citing papers
Persona prefixes reduce brand recommendation Jaccard similarity by 0.12-0.20, with mid-market brands swapping up to 75% of recommendations while category leaders remain ~80% consistent across OpenAI and Anthropic models.
Demographic bias in LLM dispatch decisions appears mainly in ambiguous-severity incidents, varies by language and demographic axis with religious appearance showing the largest effects, and does not transfer consistently across English and Mandarin.
LLMs produce lower-fidelity summaries of identical public comments when attributed to lower-status occupations like street vendors versus financial analysts, with inconsistent race effects and no gender effects.
Controlled prompt interventions reveal strong affiliation bias in LLM peer reviews favoring top-ranked institutions, plus effects from seniority and publication history.
Weird generalization in fine-tuned models is brittle, appearing only in specific cases and disappearing under prompt-based interventions that make the undesired behavior expected.
citing papers explorer
- AgentFairBench: Do LLM Agents Discriminate When They Act?
AgentFairBench is a multi-domain benchmark for demographic disparity in LLM agent actions, with a pilot showing no significant effect for Claude Haiku 4.5 after arity-matched noise correction.
- Persona Conditioning of Brand Recommendations in Retrieval-Augmented Commercial Chat: A Prominence-Stratified Cross-Provider Audit
Persona prefixes reduce brand recommendation Jaccard similarity by 0.12-0.20, with mid-market brands swapping up to 75% of recommendations while category leaders remain ~80% consistent across OpenAI and Anthropic models.