Large Language Models Discriminate Against Speakers of German Dialects
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- Side-by-side Comparison Amplifies Dialect Bias in Language Models
  Side-by-side comparison of intent-equivalent SAE and AAVE tweets significantly exacerbates covert dialect bias in LMs compared to isolated evaluation, with explicit dialect labels worsening the effect further.
- Challenges and Recommendations for LLMs-as-a-Judge in Multilingual Settings and Low-Resource Languages
  Meta-analysis of 33 ACL papers shows inconsistent LLM-as-a-Judge results, overtrust, and single-model reliance in multilingual/low-resource settings, with recommendations for better practice.
- Topics as Proxies for Sociodemographics: How Conversational Context Affects LLM Answers
  LLMs show minimal sociodemographic disparities in advice because they infer user demographics poorly from history; conversation topics are the main predictor and act as proxies for groups.