A fine-grained comparison of pragmatic language understanding in humans and language models
4 Pith papers cite this work.
fields: cs.CL (4)
years: 2026 (4)
representative citing papers
LLMs recover only ~20% of explicit pragmatic shifts under implicit cultural cues across five languages, responding mainly to linguistic structure rather than cultural associations, as shown by Hindi-Urdu controls.
citing papers explorer
- LLMs Infer Cultural Context but Fail to Apply It When Responding
LLMs infer cultural context from cues but fail to apply it for adapted responses unless prompted sequentially, shown via the CAPRI dataset on units, time, and quantity expressions.
- Unveiling the Limits of Large Language Models in Inferring Pragmatic Meaning from Non-Verbal Responses
LLMs struggle to infer pragmatic meaning from non-verbal responses alone, showing accuracy drops of up to 60 percentage points versus verbal responses, though in-context learning improves results.
- How Hypocritical Is Your LLM judge? Listener-Speaker Asymmetries in the Pragmatic Competence of Large Language Models
LLMs perform substantially better as pragmatic listeners judging language than as speakers generating it, revealing weak alignment between the two roles.