Do LLMs Use Cultural Knowledge Without Being Told? A Multilingual Evaluation of Implicit Pragmatic Adaptation

Christian Grimme; Janina Lütke Stockdiek; Lennart Schäpermeier; Marie Griesbach; Mehwish Nasim; Neel Ganapathi Sabhahit; Pranav Bhandari; Sanjeevan Selvaganapathy; Usman Naseem

arXiv: 2604.17718 · v1 · submitted 2026-04-20 · cs.CL · cs.SI


Pith reviewed 2026-05-10 04:45 UTC · model grok-4.3


keywords: LLMs, cultural pragmatics, implicit adaptation, multilingual evaluation, pragmatic context sensitivity, PCS, cultural knowledge, pragmatic features


The pith

LLMs recover only about one-fifth of the pragmatic shifts they show under explicit cultural instructions when culture is only implied by context.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether large language models adapt their speaking style to cultural norms when those norms are suggested only by the conversational situation, rather than stated directly. It runs 60 scenarios in five languages under three prompt conditions and scores the outputs on twelve pragmatic features such as authority deference and group framing. The key metric, Pragmatic Context Sensitivity, measures how much of the shift produced by an explicit cultural prompt reappears when the model sees only an implicit cue. Results show an average recovery of roughly one-fifth across models, with authority cues transferring better than group-framing cues and some hedging behaviors actively suppressed. This matters because everyday language use relies heavily on implied context, so limited implicit adaptation limits how well current models fit diverse cultural settings without extra guidance.

Core claim

Across four deployed LLMs and five languages, the primary stable-only PCS mean is 0.196, meaning the models recover only about one-fifth of the pragmatic shift they can produce when given explicit cultural instructions. Transfer is strongest for authority-related cues and weakest for individual-versus-group framing. Uncertainty-related behaviour is mixed, with hedging density showing negative explicit gaps in all languages. Hindi and Urdu, which share grammar but index distinct cultures, produce no reliable baseline difference, indicating that models respond primarily to linguistic structure rather than cultural associations carried by the language.

What carries the argument

Pragmatic Context Sensitivity (PCS), defined as the fraction of the explicit cultural prompt shift (neutral baseline to explicit instruction) that reappears under implicit situational cueing.
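
The definition above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the feature names and scores are hypothetical, and the `min_gap` guard is an assumed stand-in for whatever filter the paper uses to exclude cells where the explicit shift is too small for the ratio to be meaningful.

```python
from statistics import mean

def pcs(score_a: float, score_b: float, score_c: float, min_gap: float = 1e-9):
    """Fraction of the explicit shift (A -> B) that reappears under implicit cueing (A -> C).

    Returns None when the explicit gap is too small for the ratio to be meaningful.
    """
    explicit_gap = score_b - score_a
    implicit_gap = score_c - score_a
    if abs(explicit_gap) < min_gap:
        return None
    return implicit_gap / explicit_gap

# Hypothetical mean feature scores under prompts (A, B, C); not taken from the paper.
cells = {
    "authority_deference": (0.20, 0.70, 0.35),
    "group_framing": (0.40, 0.90, 0.45),
    "hedging_density": (0.50, 0.50, 0.48),  # zero explicit gap, so PCS is undefined
}

values = {name: pcs(*scores) for name, scores in cells.items()}
defined = [v for v in values.values() if v is not None]
mean_pcs = mean(defined)  # average over cells with a usable explicit gap
```

A PCS of 1.0 would mean the implicit cue reproduces the full explicit shift; values near 0 mean the model behaves as it does under the neutral baseline.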

If this is right

Models adapt more readily to authority cues than to group-framing cues when culture is only implied.
Alignment training may suppress certain uncertainty expressions, such as hedging, across all tested languages.
Responses track linguistic structure more closely than the cultural community indexed by a language.
Cultural pragmatics in LLMs is limited by an explicit-versus-implicit deployment gap rather than by missing factual knowledge alone.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Prompt engineering or fine-tuning that targets implicit cues could raise cultural appropriateness without needing explicit instructions every time.
Benchmarks that isolate grammar-matched languages could help separate linguistic from cultural effects in future model evaluations.
Low implicit adaptation suggests current models may require user-supplied context or post-processing when deployed in settings where cultural norms are rarely stated outright.

Load-bearing premise

The twelve pragmatic features validly capture culturally relevant differences and the implicit prompts contain no explicit cultural information that leaks into the measured shift.

What would settle it

Running the same scenarios with a fresh set of purely implicit prompts that produce PCS values near 1.0 across models, or showing that the twelve features do not distinguish known cultural differences in human responses.

Figures

Figures reproduced from arXiv: 2604.17718 by Christian Grimme, Janina Lütke Stockdiek, Lennart Schäpermeier, Marie Griesbach, Mehwish Nasim, Neel Ganapathi Sabhahit, Pranav Bhandari, Sanjeevan Selvaganapathy, Usman Naseem.

**Figure 1.** Three-prompt design used throughout the paper. Prompt A is the neutral baseline, Prompt B adds an explicit cultural instruction, and Prompt C adds only implicit situational cueing. PCS asks how much of the Prompt A→B shift is recovered in Prompt A→C. An answer can be factually correct yet still sound socially wrong: too direct with a superior, too individualistic in a family decision, or too casual in a r…

**Figure 2.** One illustrative cell from the released results: …

**Figure 3.** Mean stable-only PCS by language and prag…

**Figure 5.** Language Default Index (LDI) heatmap across…

**Figure 6.** Hindi-Urdu baseline comparison across all 12…

Original abstract

Many benchmarks show that large language models can answer direct questions about culture. We study a different question: do they also change how they speak when culture is only implied by the situation? We evaluate 60 culturally grounded conversational scenarios across five languages in three conditions: a neutral baseline (Prompt A), an explicit cultural instruction (Prompt B), and implicit situational cueing (Prompt C). We score responses on 12 pragmatic features covering deference to authority, individual-versus-group framing, and uncertainty management. We define Pragmatic Context Sensitivity (PCS) as the fraction of the Prompt A->B shift that reappears under Prompt A->C. Across four deployed LLMs and five languages (English, German, Hindi, Nepali, Urdu), the primary stable-only PCS mean is 0.196 (SD = 0.113), indicating that the models recover only about one-fifth of the pragmatic shift they can produce when instructed explicitly. Transfer is strongest for authority-related cues (0.299) and weakest for individual-versus-group framing (0.120). Uncertainty-related behaviour is mixed: hedging density exhibits negative explicit gaps in all five languages, suggesting that alignment training actively suppresses the target behaviour. Because Hindi and Urdu share core grammar yet index distinct cultural communities, we use them as a natural control; a paired analysis finds no reliable baseline difference (t = 0.96, p = 0.339, dz = 0.06), suggesting that models respond primarily to linguistic structure rather than to the cultural associations a language carries. We argue that multilingual cultural pragmatics is an explicit-versus-implicit deployment problem, not only a factual knowledge problem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

LLMs recover only about one-fifth of explicit pragmatic cultural shifts when culture is merely implied, and the Hindi-Urdu pair gives a clean separation of language structure from cultural indexing.


The paper's central result is that, across four models and five languages, the implicit condition recovers a mean of 0.196 of the shift produced by explicit cultural instructions. That single ratio is the main deliverable, and it serves as a within-subfield benchmark for how much cultural-pragmatic adaptation occurs without explicit instruction. The PCS definition itself is straightforward: divide the A-to-C difference by the A-to-B difference, with no fitted parameters.

The Hindi-Urdu control is the clearest methodological contribution. The languages share core grammar yet point to distinct communities, and the paired test on the neutral baseline shows no reliable difference, which supports the claim that models are responding to linguistic form more than to the cultural associations carried by the language. The three-condition design is simple to follow, the authority-versus-group breakdown adds granularity, and the hedging suppression result flags a possible alignment side effect worth tracking. The numbers are reported with SDs and a t-test, which is better than many abstracts in this area.

The soft spot is the 12 pragmatic features. No inter-annotator agreement, no human calibration, and no sample prompts or rubric appear in the provided text. If the features are scored with a systematic bias that is consistent across conditions, the ratio can stabilize mechanically even if the underlying model behavior differs. The stress-test concern about shared priors in numerator and denominator is therefore live until the methods section shows otherwise.

This paper is for people working on multilingual cultural competence and implicit adaptation in deployed systems. The design is original enough and the control is sharp enough that it deserves referee time, though the authors will need to supply the scoring details and prompts before the 0.196 figure can be treated as fully reproducible.
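
The scoring-bias concern in the note can be made concrete. The following is a minimal sketch with hypothetical scores and a hypothetical linear distortion, not anything drawn from the paper: a bias of the form scale × score + offset applied identically in all three conditions leaves PCS unchanged, so a stable ratio by itself says nothing about whether the features are valid, whereas a bias that differs by condition does move the ratio.

```python
def pcs(a: float, b: float, c: float) -> float:
    """PCS = (C - A) / (B - A) for one feature's mean scores under prompts A, B, C."""
    return (c - a) / (b - a)

def biased(score: float, scale: float, offset: float) -> float:
    """A hypothetical scorer that distorts every score the same way in all conditions."""
    return scale * score + offset

true_scores = (0.2, 0.7, 0.3)  # illustrative (A, B, C) values, not from the paper
observed = tuple(biased(s, scale=1.5, offset=0.1) for s in true_scores)

pcs_true = pcs(*true_scores)
pcs_observed = pcs(*observed)  # same ratio: the uniform distortion cancels

# A bias that differs by condition (here, over-scoring only the
# explicit-prompt outputs) does not cancel and lowers the ratio.
pcs_uneven = pcs(true_scores[0], true_scores[1] + 0.25, true_scores[2])
```

This is why a uniformly applied rubric is necessary but not sufficient: it rules out the second case, but validity of the features still has to be established separately.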

Referee Report

2 major / 3 minor

Summary. The paper evaluates whether LLMs implicitly adapt their language use to cultural contexts in conversational scenarios without explicit instructions. Using 60 scenarios in five languages (English, German, Hindi, Nepali, Urdu) and four LLMs, responses are generated under neutral (A), explicit (B), and implicit (C) conditions. Responses are scored on 12 pragmatic features, and Pragmatic Context Sensitivity (PCS) is defined as the ratio of the A-to-C shift to the A-to-B shift. The key finding is a mean PCS of 0.196 (SD = 0.113) for stable features, with stronger transfer for authority cues and weaker for group framing. A control comparing Hindi and Urdu shows no significant difference, suggesting models are sensitive to language structure rather than associated cultures.

Significance. The results, if robust, indicate that LLMs recover only a small fraction of culturally appropriate pragmatic shifts when culture is implied rather than stated, pointing to an explicit-versus-implicit deployment gap in current models. This is significant for understanding the limits of cultural knowledge in LLMs beyond factual recall. The multilingual design and the Hindi-Urdu natural control provide a strong test of whether effects are cultural or linguistic. The negative explicit gaps in hedging behavior across languages are an interesting secondary finding that may reflect alignment effects. The concrete statistics and paired t-test add credibility to the empirical contribution.

major comments (2)

Abstract and Methods: The PCS metric (mean 0.196) is load-bearing for the central claim of limited implicit adaptation, but the abstract and methods do not provide the scoring rubric, inter-annotator agreement, or human calibration for the 12 pragmatic features. Without this, it is unclear if the features validly index cultural differences or if scoring biases (e.g., consistent over-detection of deference) affect both shifts equally, making the ratio potentially artifactual as noted in the stress-test concern, which does land here.
Results (Hindi-Urdu control): The paired t-test (t = 0.96, p = 0.339, dz = 0.06) is used to argue no reliable baseline difference, but the manuscript does not specify the number of observations or how features were aggregated for this test. This is important to evaluate the power of the control and whether it adequately rules out cultural associations.
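
The reported statistics already constrain the sample size: for a paired t-test, Cohen's dz equals t/√n, so n ≈ (t/dz)². A rough back-calculation follows. It assumes the reported dz was computed with this standard formula, and because both values are rounded to two decimals it yields a range rather than an exact count.

```python
def implied_pairs(t: float, dz: float) -> float:
    """Number of pairs implied by a paired t statistic and Cohen's dz (dz = t / sqrt(n))."""
    return (t / dz) ** 2

t_reported, dz_reported = 0.96, 0.06

point_estimate = implied_pairs(t_reported, dz_reported)

# Both statistics are rounded: t lies in [0.955, 0.965) and dz in [0.055, 0.065).
# Propagating the rounding gives a plausible range for the number of pairs.
n_low = implied_pairs(0.955, 0.065)
n_high = implied_pairs(0.965, 0.055)
```

The point estimate is about 256 pairs, with a rounding range of roughly 216 to 308. This does not replace the authors stating n and the aggregation procedure, but it suggests the test was run on a few hundred paired observations.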

minor comments (3)

Abstract: The term 'stable-only PCS' is used without definition in the abstract; clarify what 'stable' refers to (perhaps features with positive explicit gaps).
Methods: The prompt templates for A, B, and C are not shown; including them would aid reproducibility.
Discussion: The claim that 'alignment training actively suppresses the target behaviour' for hedging is interpretive; support with more evidence or tone it down.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback, which identifies key areas where additional methodological detail will strengthen the paper. We address each major comment below and have revised the manuscript to improve transparency on scoring and statistical procedures.


Referee: Abstract and Methods: The PCS metric (mean 0.196) is load-bearing for the central claim of limited implicit adaptation, but the abstract and methods do not provide the scoring rubric, inter-annotator agreement, or human calibration for the 12 pragmatic features. Without this, it is unclear if the features validly index cultural differences or if scoring biases (e.g., consistent over-detection of deference) affect both shifts equally, making the ratio potentially artifactual as noted in the stress-test concern, which does land here.

Authors: We agree that greater detail on the scoring process is warranted to support the validity of the PCS metric. The methods section currently describes the 12 pragmatic features at a high level, but we will expand it in the revision to include the complete scoring rubric for each feature. We will also add inter-annotator agreement statistics and a description of how the features were calibrated against established work in cultural pragmatics. These additions will show that the same rubric was applied uniformly across conditions, reducing the likelihood that differential scoring biases artifactually inflate or deflate the A-to-C versus A-to-B ratio. revision: yes
Referee: Results (Hindi-Urdu control): The paired t-test (t = 0.96, p = 0.339, dz = 0.06) is used to argue no reliable baseline difference, but the manuscript does not specify the number of observations or how features were aggregated for this test. This is important to evaluate the power of the control and whether it adequately rules out cultural associations.

Authors: We appreciate the referee noting this gap in reporting. In the revised results section we will explicitly state the number of observations used for the paired t-test and clarify the aggregation procedure (i.e., whether feature scores were averaged per response or analyzed individually before pairing). This information will allow readers to assess the statistical power of the control and evaluate whether the null result adequately supports the interpretation that models respond primarily to linguistic structure rather than cultural associations. revision: yes

Circularity Check

0 steps flagged

No circularity: PCS is a direct empirical ratio from observed shifts


The paper defines Pragmatic Context Sensitivity (PCS) as the fraction of the A-to-B explicit shift recovered in the A-to-C implicit condition, computed directly from scored differences on 12 predefined pragmatic features across LLM responses. This is a straightforward measurement and averaging operation with no fitted parameters, self-referential equations, load-bearing self-citations, or imported uniqueness claims. The central numerical result (mean PCS 0.196) follows immediately from the condition-wise feature scores without reducing to its inputs by construction. The Hindi-Urdu control and language-specific analyses are likewise independent empirical comparisons. No steps match any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that the chosen pragmatic features are culturally diagnostic and that the implicit prompts isolate situational cues without cultural leakage; no free parameters or invented entities are introduced.

axioms (1)

Domain assumption: The 12 pragmatic features validly measure culturally relevant differences in responses.
Scoring depends on these features being appropriate proxies for deference, framing, and uncertainty management.



