Incorporating Dialectal Variability for Socially Equitable Language Identification
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CL (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- Cultural Benchmarking of LLMs in Standard and Dialectal Arabic Dialogues
ArabCulture-Dialogue dataset shows LLMs perform worse on dialectal Arabic than Modern Standard Arabic across cultural reasoning, translation, and generation tasks.
- Lost in the Tower of Babel: The Adverse Effects of Incidental Multilingualism in LLMs
Incidental multilingualism from uneven web training makes LLMs unequal, brittle, and opaque across languages.
- Language, Place, and Social Media: Geographic Dialect Alignment in New Zealand
New Zealand Reddit users link language to place and form contiguous speech communities with complex geographic alignment; Word2Vec embeddings reveal semantic variations and shifts in NZ English on a 4.26 billion word corpus.