Which Institutional Frameworks Do Chatbots Assume? Auditing Jurisdictional Defaults in Multilingual LLMs

Harini Suresh; Zhizhi Wang

arxiv: 2606.00333 · v1 · pith:4M2FOOG7new · submitted 2026-05-29 · 💻 cs.CL

Pith reviewed 2026-06-28 22:05 UTC · model grok-4.3

keywords multilingual LLMs · jurisdictional defaults · legal-administrative prompts · institutional frameworks · language-based bias · AI auditing · underspecified queries

The pith

Multilingual LLMs default to U.S. legal frameworks on English prompts and China frameworks on Chinese prompts when jurisdiction is omitted.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether large language models treat the language of a query as a cue for which country's rules to apply when answering questions about taxes, labor, healthcare, and similar topics without naming any jurisdiction. Across seven models, English prompts more often trigger answers framed under U.S. rules while Chinese prompts more often trigger answers framed under China's rules, with the effect strengthening when the prompt demands a single answer. In that single-answer setting, pooled results show 74.5 percent of English-input responses adopt a U.S. framework and 53.3 percent of Chinese-input responses adopt a China framework. This pattern holds in every model tested and creates the risk that users receive fluent but jurisdictionally mismatched advice simply because they wrote in their preferred language rather than the language tied to the relevant rules.

Core claim

When jurisdiction is omitted from prompts about taxes, labor protections, healthcare, education, pensions, and administrative procedures, seven LLMs consistently supply answers assuming the legal-administrative framework associated with the input language: China-specific answers rise with Mandarin Chinese input, while U.S.-specific, comparative, or generic answers rise with English input. This directional pattern appears across all models and system-prompt conditions. Prompts requiring a single answer increase jurisdiction selection further: pooled across models, 74.5 percent of English-input responses adopt a U.S. framework and 53.3 percent of Chinese-input responses adopt a China framework.

What carries the argument

Input language functioning as an implicit default selector among institutional frameworks in underspecified legal-administrative queries.

If this is right

Users whose comfortable language differs from the relevant jurisdiction face elevated risk of receiving answers under the wrong set of rules.
LLM interfaces should request location information or explicitly state the jurisdictional scope when it is absent from the prompt.
The observed pattern appears in all seven models, whether they were developed in the United States or in China.
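
The interface recommendation above can be illustrated with a minimal sketch. It assumes the interface has an optional, user-supplied location field; the function name `jurisdiction_preamble` and the wording of its messages are hypothetical and do not come from the paper. The point is only that the input language is never consulted.

```python
from typing import Optional


def jurisdiction_preamble(user_location: Optional[str]) -> str:
    """Return the text an interface could show before a legal-administrative answer.

    Hypothetical helper: the decision depends only on an explicit location,
    never on the language the user happened to write in.
    """
    if user_location is None or not user_location.strip():
        # No location known: ask instead of guessing from the input language.
        return "Which country or region's rules should this answer follow?"
    # Location known: state the jurisdictional scope of the answer explicitly.
    return f"Scope: this answer assumes the rules of {user_location.strip()}."


print(jurisdiction_preamble(None))
print(jurisdiction_preamble("Germany"))
```

A real deployment would also need to handle locations mentioned inside the prompt itself, which this sketch deliberately leaves out.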

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same language cue may shape model outputs on non-legal topics such as political or cultural advice.
Testing additional languages could reveal whether the effect is specific to English-Chinese pairs or generalizes to other language pairs.
Explicit jurisdiction statements in prompts may serve as an override that reduces reliance on language defaults.

Load-bearing premise

The 60 prompts adequately represent real underspecified legal-administrative queries and the manual annotations of the 2,520 responses correctly and consistently identify the assumed jurisdiction without substantial bias from prompt choice or annotator judgment.

What would settle it

Running the same 60 prompts with an added explicit statement of a different jurisdiction and checking whether the language-based pattern disappears or reverses.

Figures

Figures reproduced from arXiv: 2606.00333 by Harini Suresh, Zhizhi Wang.

**Figure 1.** Sample prompts. … short noun phrase without a country qualifier. A bilingual researcher manually produced the parallel English and Chinese versions and checked each paired prompt for semantic equivalence.

**Figure 2.** Framework distribution. … all seven models, 74.5% of responses to English input adopt a U.S. framework, while 53.3% of responses to Chinese input adopt a China framework. This condition is therefore best read as a commitment test rather than as a mitigation: when models are required to choose one framework, their choices follow the same language-conditioned direction observed in the less restrictive conditi…

**Figure 3.** How system prompts affect whether a model commits to one jurisdiction.

**Figure 4.** Language gaps. … cell percentage, allowing direct comparison of U.S., China, other-single-jurisdiction, multiple-framework, and generic responses across the full design. Section 4.4 (Robustness and Statistical Analysis): Because responses are nested within prompts and models, we avoid treating the pooled 2,520 responses as independent observations. We therefore report robustness checks that respect the paired English–…

**Figure 5.** Full framework distribution. … assumed in the response.

Original abstract

LLMs increasingly answer questions about taxes, labor protections, healthcare, education, pensions, and administrative procedures, where usefulness often depends on the applicable jurisdiction. Multilingual users may write in their most comfortable language rather than one associated with the country or region whose rules apply. We ask whether deployed LLMs use input language as a default jurisdictional signal when prompts omit any country or region. Prior multilingual audits show that prompt language can shift cultural, political, or normative outputs; we examine which legal-administrative framework models supply when jurisdiction is underspecified. We evaluate seven LLMs developed in the United States or China on 60 underspecified legal-administrative prompts in English and Mandarin Chinese under three system-prompt conditions, yielding 2,520 manually annotated responses. Across models and conditions, Chinese input more often produces China-specific answers, while English input more often produces U.S.-specific, comparative, or generic answers. Prompts requiring a single answer further increase jurisdiction selection: pooled across models, 74.5% of English-input responses adopt a U.S. framework, while 53.3% of Chinese-input responses adopt a China framework. This directional pattern appears in all seven models. We describe this deployment-level pattern as institutional-framework misselection risk: a fluent answer may rely on a legal-administrative context the user did not intend, especially when their preferred language differs from the relevant jurisdiction. LLM interfaces should not route institutional advice by input language alone; when location is absent, they should request it or state the jurisdictional scope of the answer.
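
The response count in the abstract is consistent with a full factorial design. A small sketch of that arithmetic, assuming exactly one response per model, prompt, language, and system-prompt condition (the labels below are placeholders, not names used in the paper):

```python
from itertools import product

# Factor sizes are taken from the abstract; the labels are placeholders.
models = [f"model_{i}" for i in range(1, 8)]          # seven LLMs
prompts = [f"prompt_{i:02d}" for i in range(1, 61)]   # 60 underspecified prompts
languages = ["en", "zh"]                              # English and Mandarin Chinese
conditions = ["condition_1", "condition_2", "condition_3"]  # three system-prompt conditions

# Assumption: one annotated response per cell of the design.
cells = list(product(models, prompts, languages, conditions))
total_responses = len(cells)
responses_per_language = total_responses // len(languages)

print(total_responses)         # 2520
print(responses_per_language)  # 1260
```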

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper documents a consistent language-to-jurisdiction pattern in legal prompts across seven models, but the manual annotation step lacks the usual reliability checks.

The main thing to know is that English prompts led models to assume U.S. legal frameworks more often while Chinese prompts led to China-specific ones, with the gap widening when the prompt asked for a single answer. For those single-answer prompts, pooled across models, the figures are 74.5 percent U.S. for English inputs and 53.3 percent China for Chinese inputs, and the direction held in all seven models they ran.

They took 60 underspecified legal-administrative prompts, ran them in English and Mandarin on models from both the United States and China, varied the system prompt, and hand-labeled the jurisdictional assumptions in the 2,520 outputs. That produces a quantitative mapping that earlier multilingual audits had not reported for this exact setting. The consistency across models and conditions is the clearest result.

The work is a straightforward audit with no fitted parameters or circular claims. It flags a deployment issue that matters for users whose first language does not match the jurisdiction they actually need.

The soft spot is the annotation process. The abstract gives no inter-annotator agreement numbers, no coding protocol, and no examples of how ambiguous or multi-jurisdiction answers were handled. The prompts are presented as representative without a sampling description. If either the prompt set or the labeling scheme is narrow, the reported percentages could move. That is the part that needs the most detail in a full version.

This is useful for anyone working on multilingual LLM interfaces or safety evaluations. It shows a practical risk that current systems do not flag. The core observation is solid enough from the numbers given that it deserves a serious referee, even though the methods will need tightening for reproducibility.

Referee Report

2 major / 2 minor

Summary. The paper audits whether seven U.S.- and China-developed LLMs default to U.S.- or China-specific legal-administrative frameworks when answering 60 underspecified prompts (taxes, labor, healthcare, etc.) in English versus Mandarin Chinese under three system-prompt conditions. Manual annotation of the resulting 2,520 responses reveals a consistent directional pattern: English inputs more often elicit U.S.-specific, comparative, or generic answers, while Chinese inputs more often elicit China-specific answers. Single-answer prompts amplify jurisdiction selection: pooled across models, 74.5% of English-input responses adopt a U.S. framework and 53.3% of Chinese-input responses adopt a China framework. The authors label this “institutional-framework misselection risk” and recommend that interfaces request location or state scope rather than routing by input language alone.

Significance. If the annotation scheme is reliable, the result supplies a concrete, falsifiable empirical baseline on language-as-jurisdiction proxy in deployed multilingual models. The scale (seven models, 2,520 responses, three conditions) and the fact that the directional pattern holds in every model are strengths; the finding directly informs interface design for cross-lingual users of legal-administrative advice.

major comments (2)

[Abstract / Evaluation Setup] The central pooled percentages (74.5% English-input U.S. framework, 53.3% Chinese-input China framework) rest entirely on human classification of 2,520 model outputs. The manuscript provides no annotation protocol, no inter-annotator agreement statistics, and no description of how ambiguous or multi-jurisdictional answers were coded (see the abstract and the paragraph describing the 2,520 responses). This directly affects the load-bearing claim that the directional pattern is robust.
[Prompt Construction] The 60 prompts are presented as representative of underspecified legal-administrative queries, yet no sampling frame, coverage argument, or sensitivity analysis is supplied. If the prompt set is idiosyncratic, the reported jurisdiction-selection rates could shift substantially.
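
For the agreement statistics requested in the first comment, one standard choice for two annotators is Cohen's κ, defined as (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The sketch below is illustrative only: the label strings and the toy annotations are hypothetical, and the paper's own procedure is not described in the material above. With more than two annotators, a pairwise average or a multi-rater statistic would be needed instead.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy annotations (not data from the paper).
annotator_1 = ["us", "us", "china", "generic", "us", "china"]
annotator_2 = ["us", "china", "china", "generic", "us", "china"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.739
```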

minor comments (2)

[Abstract] The abstract omits prompt examples and annotation guidelines; providing at least a few representative prompts and the exact coding rubric would allow readers to assess classification decisions.
[Results] A table or figure that breaks the 74.5% / 53.3% figures down by model and condition would make the “appears in all seven models” claim easier to verify at a glance.
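
A per-model, per-condition breakdown of the kind requested in the [Results] comment could be tabulated along these lines. This is a sketch under stated assumptions: the record layout `(model, condition, language, label)`, the condition name `single_answer`, the label strings, and the toy rows are all hypothetical stand-ins for the paper's annotated data.

```python
from collections import Counter


def share_by_cell(records, language, framework):
    """Share of responses labelled `framework`, per (model, condition) cell,
    restricted to one input language."""
    totals, hits = Counter(), Counter()
    for model, condition, lang, label in records:
        if lang != language:
            continue
        cell = (model, condition)
        totals[cell] += 1
        if label == framework:
            hits[cell] += 1
    return {cell: hits[cell] / totals[cell] for cell in totals}


# Hypothetical toy rows (not data from the paper).
toy_records = [
    ("model_a", "single_answer", "en", "us"),
    ("model_a", "single_answer", "en", "us"),
    ("model_a", "single_answer", "en", "generic"),
    ("model_a", "single_answer", "en", "china"),
    ("model_b", "single_answer", "en", "us"),
    ("model_b", "single_answer", "en", "multiple"),
    ("model_a", "single_answer", "zh", "china"),
]
us_share_en = share_by_cell(toy_records, language="en", framework="us")
for cell, share in sorted(us_share_en.items()):
    print(cell, f"{share:.1%}")
```

Reading the same table for Chinese input and the China label would show whether the direction holds in every model rather than only in the pooled figure.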

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The two major comments identify areas where additional methodological transparency will strengthen the paper. We address each point below and commit to revisions that directly respond to the concerns while preserving the core empirical contribution.

Referee: [Abstract / Evaluation Setup] The central pooled percentages (74.5% English-input U.S. framework, 53.3% Chinese-input China framework) rest entirely on human classification of 2,520 model outputs. The manuscript provides no annotation protocol, no inter-annotator agreement statistics, and no description of how ambiguous or multi-jurisdictional answers were coded (see the abstract and the paragraph describing the 2,520 responses). This directly affects the load-bearing claim that the directional pattern is robust.

Authors: We agree that the submitted manuscript omitted a full description of the annotation protocol. In the revision we will add a dedicated subsection (and appendix) that (i) reproduces the complete annotation guidelines used by the three annotators, (ii) specifies the coding rules for ambiguous, multi-jurisdictional, or refusal responses (treated as 'comparative' or 'generic' when no single jurisdiction was selected), and (iii) reports inter-annotator agreement on a 10% stratified sample (Cohen's κ). These additions will make the 74.5% / 53.3% figures directly reproducible and will address the robustness concern. revision: yes
Referee: [Prompt Construction] The 60 prompts are presented as representative of underspecified legal-administrative queries, yet no sampling frame, coverage argument, or sensitivity analysis is supplied. If the prompt set is idiosyncratic, the reported jurisdiction-selection rates could shift substantially.

Authors: The 60 prompts were constructed to span six high-stakes domains (tax, labor, healthcare, education, pensions, administrative procedures) with balanced single-answer versus open-ended formats. While a formal probabilistic sampling frame from a larger population of queries was not feasible, the set was iteratively refined for domain coverage and underspecification. In revision we will (i) add an explicit coverage argument mapping each prompt to the six domains, (ii) include a sensitivity analysis that recomputes the pooled percentages after successively dropping each domain, and (iii) report whether the directional English→U.S. / Chinese→China pattern remains stable across all leave-one-domain-out subsets. These steps will quantify the dependence on the particular prompt collection. revision: yes
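
The leave-one-domain-out sensitivity analysis proposed in the second response could be sketched as follows. The record layout `(domain, language, label)`, the label strings, and the toy rows are hypothetical; nothing here reproduces the paper's data or results.

```python
def leave_one_domain_out(records, language, framework):
    """For each domain, recompute the pooled share of `framework` labels
    among `language` responses with that domain's prompts removed."""
    domains = sorted({domain for domain, _, _ in records})
    shares = {}
    for dropped in domains:
        kept = [
            label
            for domain, lang, label in records
            if domain != dropped and lang == language
        ]
        # Skip the division when nothing remains after dropping a domain.
        shares[dropped] = (
            sum(label == framework for label in kept) / len(kept) if kept else None
        )
    return shares


# Hypothetical toy rows (not data from the paper).
toy_records = [
    ("tax", "en", "us"),
    ("tax", "en", "us"),
    ("labor", "en", "us"),
    ("labor", "en", "generic"),
    ("healthcare", "en", "china"),
    ("healthcare", "en", "us"),
]
sensitivity = leave_one_domain_out(toy_records, language="en", framework="us")
for dropped, share in sensitivity.items():
    print(f"without {dropped}: {share:.0%}")
```

If the recomputed shares stay close to the full-sample figure for every dropped domain, no single domain is driving the pooled percentage.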

Circularity Check

0 steps flagged

No circularity: purely empirical counts from model outputs and manual annotation

The paper reports an audit consisting of 60 prompts run on seven LLMs under controlled conditions, followed by manual annotation of 2,520 responses into jurisdictional categories and simple percentage tabulation. No equations, parameters, derivations, or uniqueness theorems appear. The central 74.5%/53.3% figures are direct empirical tallies, not outputs of any fitted model or self-referential construction. No self-citations are invoked to justify load-bearing premises, ansatzes, or uniqueness results. The methodology is externally replicable via the prompt set and annotation criteria; therefore the findings do not reduce to their inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The study is an empirical audit with no fitted parameters or new entities; it rests on two domain assumptions about prompt design and response classification.

axioms (2)

domain assumption: The 60 prompts omit any explicit country or region and are therefore jurisdiction-underspecified.
Invoked in the description of the prompt set used for all conditions.
domain assumption: Human annotators can reliably classify model responses as adopting a U.S. framework, China framework, comparative, or generic.
Required for the reported percentages and the claim that the directional pattern holds across models.

pith-pipeline@v0.9.1-grok · 5812 in / 1345 out tokens · 34056 ms · 2026-06-28T22:05:21.997969+00:00 · methodology



work page doi:10.18653/v1/2024.acl-long.853 2024

[19] [19]

Multilingual != Multicultural: Evaluating Gaps Between Multilingual Capabilities and Cultural Alignment in

Rystr. Multilingual != Multicultural: Evaluating Gaps Between Multilingual Capabilities and Cultural Alignment in. Proceedings of Interdisciplinary Workshop on Observations of Misunderstood, Misguided and Malicious Use of Language Models , pages =. 2025 , address =

2025

[20] [20]

2026 , doi =

Smirnov, Oleg , title =. 2026 , doi =

2026

[21] [21]

, title =

Varshney, Kush R. , title =. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society , volume =. 2024 , doi =

2024

[22] [22]

Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society , volume =

Vida, Karina and Damken, Fabian and Lauscher, Anne , title =. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society , volume =. 2024 , doi =

2024

[23] [23]

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , month = jul, year =

Do Prompt-Based Models Really Understand the Meaning of Their Prompts? , author =. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , month = jul, year =. doi:10.18653/v1/2022.naacl-main.167 , pages =

work page doi:10.18653/v1/2022.naacl-main.167 2022

[24] [24]

, title =

Wang, Wenxuan and Jiao, Wenxiang and Huang, Jingyuan and Dai, Ruyi and Huang, Jen-tse and Tu, Zhaopeng and Lyu, Michael R. , title =. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages =. 2024 , address =. doi:10.18653/v1/2024.acl-long.345 , url =

work page doi:10.18653/v1/2024.acl-long.345 2024

[25] [25]

2026 , note =

Why Using. 2026 , note =

2026

[26] [26]

2024 , doi =

Zhong, Qishuai and Yun, Yike and Sun, Aixin , title =. 2024 , doi =

2024

[27] [27]

, title =

Zhou, Naitian and Bamman, David and Bleaman, Isaac L. , title =. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , year =. doi:10.18653/v1/2025.acl-long.1256 , url =

work page doi:10.18653/v1/2025.acl-long.1256 2025

[28] [28]

The Twelfth International Conference on Learning Representations , year =

Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I Learned to Start Worrying about Prompt Formatting , author =. The Twelfth International Conference on Learning Representations , year =

[29] [29]

Durmus, Esin and Nyugen, Karina and Liao, Thomas I. and Schiefer, Nicholas and Askell, Amanda and Bakhtin, Anton and Chen, Carol and Hatfield-Dodds, Zac and Hernandez, Danny and Joseph, Nicholas and Lovitt, Liane and McCandlish, Sam and Sikder, Orowa and Tamkin, Alex and Thamkul, Janel and Kaplan, Jared and Clark, Jack and Ganguli, Deep , title =. First C...

2024

[30] [30]

and Kizilcec, Ren

Tao, Yan and Viberg, Olga and Baker, Ryan S. and Kizilcec, Ren. Cultural Bias and Cultural Alignment of Large Language Models , journal =. 2024 , doi =

2024

[31] [31]

Grattafiori, Aaron and Dubey, Abhimanyu and Jauhri, Abhinav and. The. 2024 , eprint =

2024

[32] [32]

Penedo, Guilherme and Kydl. The. Advances in Neural Information Processing Systems 37 (. 2024 , url =

2024

[33] [33]

2024 , eprint =

Yang, An and Yang, Baosong and Hui, Binyuan and Zheng, Bo and Yu, Bowen and Zhou, Chang and Li, Chengpeng and Li, Chengyuan and Liu, Dayiheng and Huang, Fei and Dong, Guanting and Wei, Haoran and Lin, Huan and Tang, Jialong and Wang, Jialin and Yang, Jian and Tu, Jianhong and Zhang, Jianwei and Ma, Jianxin and Yang, Jianxin and Xu, Jin and Zhou, Jingren a...

2024

[34] [34]

Computational Linguistics , volume =

Pawar, Siddhesh and Park, Junyeong and Jin, Jiho and Arora, Arnav and Myung, Junho and Yadav, Srishti and Haznitrama, Faiz Ghifari and Song, Inhwa and Oh, Alice and Augenstein, Isabelle , title =. Computational Linguistics , volume =. 2025 , doi =. 2411.00860 , archivePrefix =

arXiv 2025

[35] [35]

2007 , publisher =

Fricker, Miranda , title =. 2007 , publisher =

2007

[36] [36]

and Boyd, Danah and Friedler, Sorelle A

Selbst, Andrew D. and Boyd, Danah and Friedler, Sorelle A. and Venkatasubramanian, Suresh and Vertesi, Janet , title =. Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*) , pages =. 2019 , publisher =

2019

[37] [37]

, title =

Herd, Pamela and Moynihan, Donald P. , title =. 2018 , publisher =

2018

[38] [38]

Biased Tales: Cultural and Topic Bias in Generating Children's Stories , booktitle =

Rooein, Donya and Zouhar, Vil\'. Biased Tales: Cultural and Topic Bias in Generating Children's Stories , booktitle =. 2025 , address =. doi:10.18653/v1/2025.emnlp-main.3 , url =

work page doi:10.18653/v1/2025.emnlp-main.3 2025

[39] [39]

Findings of the Association for Computational Linguistics: EMNLP 2024 , pages =

Bhatt, Shaily and Diaz, Fernando , title =. Findings of the Association for Computational Linguistics: EMNLP 2024 , pages =. 2024 , address =. doi:10.18653/v1/2024.findings-emnlp.942 , url =

work page doi:10.18653/v1/2024.findings-emnlp.942 2024

[40] [40]

Proceedings of the Third Workshop on Narrative Understanding , pages =

Lucy, Li and Bamman, David , title =. Proceedings of the Third Workshop on Narrative Understanding , pages =. 2021 , address =. doi:10.18653/v1/2021.nuse-1.5 , url =

work page doi:10.18653/v1/2021.nuse-1.5 2021

[41] [41]

Findings of the Association for Computational Linguistics: EMNLP 2023 , pages =

Wan, Yixin and Pu, George and Sun, Jiao and Garimella, Aparna and Chang, Kai-Wei and Peng, Nanyun , title =. Findings of the Association for Computational Linguistics: EMNLP 2023 , pages =. 2023 , address =. doi:10.18653/v1/2023.findings-emnlp.243 , url =

work page doi:10.18653/v1/2023.findings-emnlp.243 2023

[42] [42]

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages =

Toro Isaza, Paulina and Xu, Guangxuan and Oloko, Toye and Hou, Yufang and Peng, Nanyun and Wang, Dakuo , title =. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages =. 2023 , address =. doi:10.18653/v1/2023.acl-long.359 , url =

work page doi:10.18653/v1/2023.acl-long.359 2023

[43] [43]

Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society , volume =

Arzaghi, Mina and Carichon, Florian and Farnadi, Golnoosh , title =. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society , volume =. 2024 , doi =

2024

[44] [44]

Transactions of the Association for Computational Linguistics , volume =

Malaviya, Chaitanya and Chang, Joseph Chee and Roth, Dan and Iyyer, Mohit and Yatskar, Mark and Lo, Kyle , title =. Transactions of the Association for Computational Linguistics , volume =. 2025 , address =. doi:10.1162/tacl.a.24 , url =

work page doi:10.1162/tacl.a.24 2025

[45] [45]

2024 , url =

Niklaus, Joel and Matoshi, Veton and St. 2024 , url =

2024

[46] [46]

1948 , note =

Universal Declaration of Human Rights , howpublished =. 1948 , note =

1948

[47] [47]

Decent Work , howpublished =

[48] [48]

2026 , note =

Worldwide Governance Indicators , howpublished =. 2026 , note =

2026