pith. machine review for the scientific record.

arxiv: 2604.22749 · v1 · submitted 2026-04-24 · 💻 cs.CL


Representational Harms in LLM-Generated Narratives Against Global Majority Nationalities


Pith reviewed 2026-05-08 11:40 UTC · model grok-4.3

classification 💻 cs.CL
keywords representational harms · LLM bias · national identity · Global Majority · narrative generation · stereotypes · erasure · subordinated portrayals

The pith

Minoritized national identities appear in subordinated portrayals over fifty times more often than in dominant ones when LLMs generate narratives.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests how large language models create stories when given open-ended prompts that include different national identities. It finds that identities from the Global Majority show up less often in neutral or powerful positions and far more often in roles that suggest subordination, limitation, or harm. This imbalance stays in place even when prompts replace any US references with other nationalities, so the pattern is not just the model copying the user's cue. If the measurements hold, then models already in use for writing, simulation, or decision support can quietly reinforce one-sided views of most of the world's populations. The authors therefore argue that evaluation of these harms must start from Global Majority perspectives rather than default US-centric assumptions.

Core claim

Large language models produce persistent representational harms tied to national origin in open-ended story generation. Minoritized national identities are simultaneously underrepresented in power-neutral stories and overrepresented in subordinated character portrayals, which occur more than fifty times more frequently than dominant portrayals. The degree of harm increases when US nationality cues appear in the input, yet the same directional bias remains when those cues are replaced by non-US identities, showing the effect is not reducible to sycophancy.

What carries the argument

Systematic prompting of LLMs to generate narratives followed by human annotation that distinguishes power-neutral, subordinated, and dominant character portrayals by national identity.
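The page summarizes this machinery without code. As a rough sketch of the measurement chain, the following assumes hypothetical `generate_story` and `human_label` stand-ins (the paper's actual prompts, models, and rubric are not reproduced here) and computes the subordinated-to-dominant rate ratio of the kind behind the fifty-fold figure.

```python
from collections import Counter

# Placeholder nationalities and prompt template -- illustrative only,
# not the paper's stimulus set.
NATIONALITIES = ["Nigerian", "Filipino", "Bolivian", "American"]
PROMPT = "Write a short story about a {nationality} person."

def generate_story(prompt: str) -> str:
    """Stand-in for an LLM generation call."""
    raise NotImplementedError

def human_label(story: str) -> str:
    """Stand-in for an annotator's judgment; returns one of
    'power-neutral', 'subordinated', or 'dominant'."""
    raise NotImplementedError

def portrayal_counts(n_per_nationality: int = 100) -> dict:
    """Generate stories per nationality and tally portrayal labels."""
    counts = {nat: Counter() for nat in NATIONALITIES}
    for nat in NATIONALITIES:
        for _ in range(n_per_nationality):
            story = generate_story(PROMPT.format(nationality=nat))
            counts[nat][human_label(story)] += 1
    return counts

def subordination_ratio(c: Counter) -> float:
    """Rate ratio of subordinated to dominant portrayals for one group."""
    return c["subordinated"] / max(c["dominant"], 1)
```

The load-bearing step is `human_label`: everything downstream is counting, which is why the annotation rubric draws scrutiny in the referee report below.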

If this is right

  • LLMs deployed in enterprise, government, or simulation tasks can embed these national-origin disparities without any explicit user request for bias.
  • Removing US references from prompts does not eliminate the pattern, so simple prompt adjustments are unlikely to remove the harms.
  • Evaluation and mitigation methods must be redesigned to center Global Majority perspectives rather than relying on US-default assumptions.
  • Continued uncritical use of current models risks amplifying one-dimensional portrayals in high-stakes contexts such as asylum interviews or content generation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The underlying training distributions or alignment processes appear to embed uneven global representations that surface most clearly in unconstrained creative tasks.
  • Extending the same measurement approach to other generative settings, such as dialogue or planning, could reveal whether the disparity is specific to narrative or more general.
  • Organizations adopting these models for international applications would benefit from routine testing against balanced sets of national-identity prompts before deployment.

Load-bearing premise

The specific prompts, generation settings, and annotation rules used to label stereotypes, erasure, and subordinated portrayals measure actual representational harms without the researchers' cultural framing or prompt artifacts shaping the results.

What would settle it

A larger replication with varied neutral prompts, multiple models, and independent annotators that finds no reliable difference in subordinated portrayal rates across national identity groups would falsify the central disparity claim.
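One concrete shape for such a replication is a permutation test on the annotation labels: pool two nationality groups' labels, shuffle group membership, and ask how often chance alone reproduces the observed gap in subordinated-portrayal rates. A minimal sketch with toy inputs rather than real annotations:

```python
import random

def subordinated_gap(a: list, b: list) -> float:
    """Difference in subordinated-portrayal rates between two label lists."""
    rate = lambda xs: xs.count("subordinated") / len(xs)
    return rate(a) - rate(b)

def permutation_test(a: list, b: list, n_perm: int = 10_000, seed: int = 0):
    """Two-sided permutation test for the rate gap between groups a and b."""
    rng = random.Random(seed)
    observed = subordinated_gap(a, b)
    pooled = a + b
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(subordinated_gap(pooled[:len(a)], pooled[len(a):])) >= abs(observed):
            extreme += 1
    return observed, extreme / n_perm  # observed gap, two-sided p-value
```

A replication that repeatedly failed to reject the null here, across models and prompt variants, is what the falsification condition above amounts to in practice.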

Figures

Figures reproduced from arXiv: 2604.22749 by Evan Shieh, Harini Suresh, Ilana Nguyen, Thema Monroe-White.

Figure 1: Overall design of Study 1 and 2; see Study Design section for detail.
Figure 2: Relative frequency (%) of non-US country mentions that refer to a dominant character (left) or a subordinated … [caption truncated]
Figure 3: Two-dimensional t-SNE visualization of country distributions across all 150 possible story characters (using Hellinger … [caption truncated]
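Figure 3 rests on two standard tools: Hellinger distance between per-country distributions, then a t-SNE projection of the resulting distance matrix. As a reference point (the numbers below are invented, not the paper's data), the distance step looks like this; a full reproduction would compute it pairwise over all countries and feed the matrix to a t-SNE implementation that accepts precomputed distances.

```python
import numpy as np

def hellinger(p: np.ndarray, q: np.ndarray) -> float:
    """Hellinger distance between discrete probability distributions:
    H(P, Q) = (1 / sqrt(2)) * ||sqrt(P) - sqrt(Q)||_2, bounded in [0, 1]."""
    return float(np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2))

# Toy per-country distributions over portrayal labels
# (power-neutral, subordinated, dominant) -- illustrative only.
country_a = np.array([0.70, 0.25, 0.05])
country_b = np.array([0.85, 0.05, 0.10])
print(hellinger(country_a, country_b))  # ~0.21
```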
Original abstract

Large language models (LLMs) are increasingly used for text generation tasks from everyday use to high-stakes enterprise and government applications, including simulated interviews with asylum seekers. While many works highlight the new potential applications of LLMs, there are risks of LLMs encoding and perpetuating harmful biases about non-dominant communities across the globe. To better evaluate and mitigate such harms, more research examining how LLMs portray diverse individuals is needed. In this work, we study how national origin identities are portrayed by widely-adopted LLMs in response to open-ended narrative generation prompts. Our findings demonstrate the presence of persistent representational harms by national origin, including harmful stereotypes, erasure, and one-dimensional portrayals of Global Majority identities. Minoritized national identities are simultaneously underrepresented in power-neutral stories and overrepresented in subordinated character portrayals, which are over fifty times more likely to appear than dominant portrayals. The degree of harm is amplified when US nationality cues (e.g., "American") are present in input prompts. Notably, we find that the harms we identify cannot be explained away via sycophancy, as US-centric biases persist even when replacing US nationality cues with non-US national identities in the prompts. Based on our findings, we call for further exploration of cultural harms in LLMs through methodologies that center Global Majority perspectives and challenge the uncritical adoption of US-based LLMs for the classification, surveillance, and misrepresentation of the majority of our planet.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper examines representational harms in open-ended narrative generation by LLMs, focusing on national-origin identities. It reports that minoritized (Global Majority) nationalities are underrepresented in power-neutral stories and overrepresented in subordinated, stereotyped, or erased character portrayals, with subordinated portrayals appearing over fifty times more often than dominant ones. Harms are amplified by US nationality cues in prompts but persist even when such cues are replaced by non-US identities. The work concludes by advocating for evaluation methodologies that center Global Majority perspectives rather than relying on US-centric LLMs for classification or representation tasks.

Significance. If the quantitative claims hold under rigorous validation, the results would provide concrete evidence of nationality-based representational biases in widely deployed LLMs, with direct relevance to high-stakes applications such as simulated interviews. The empirical focus on open-ended generation and the explicit comparison of US vs. non-US cues offers a useful lens beyond standard stereotype benchmarks. The call for culturally centered methods is a constructive contribution, though its impact depends on the transparency and reproducibility of the underlying measurements.

major comments (2)
  1. [Abstract / Methods] The headline claim that subordinated portrayals of minoritized national identities are 'over fifty times more likely' than dominant ones is presented without any reported sample size, number of generations per nationality, prompt templates, inter-annotator agreement statistics, blinding procedures, or statistical tests. This absence prevents assessment of whether the 50x ratio is supported by the data or sensitive to annotation choices. (A sketch of one such agreement statistic follows this report.)
  2. [Methods / Results] Annotation and coding scheme: The paper correctly notes that US-centric framing can distort classification of harms, yet the operational definitions of 'power-neutral,' 'subordinated character portrayals,' 'stereotypes,' and 'erasure' are applied via human annotation with no excerpted rubric, examples of coded stories, or discussion of how annotator cultural backgrounds were accounted for. Because these categories are central to the 50x ratio, any implicit framing in the coding scheme directly affects the validity of the central empirical result.
minor comments (2)
  1. [Abstract] The abstract refers to 'widely-adopted LLMs' but does not name the specific models, temperatures, or decoding settings used; these details should be stated explicitly in the methods.
  2. [Results] The manuscript would benefit from a table or appendix listing the exact nationalities tested, the number of stories generated per identity, and the distribution of annotation labels to allow readers to evaluate the scale and balance of the study.
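On the inter-annotator agreement gap flagged in major comment 1, the statistic a reader would expect is standard. A minimal sketch of Cohen's kappa over the paper's three portrayal labels, using toy annotations rather than the study's data:

```python
from collections import Counter

def cohens_kappa(labels_1: list, labels_2: list) -> float:
    """Cohen's kappa: two-annotator agreement corrected for chance.
    Assumes some disagreement is possible (expected agreement < 1)."""
    assert len(labels_1) == len(labels_2)
    n = len(labels_1)
    observed = sum(a == b for a, b in zip(labels_1, labels_2)) / n
    freq_1, freq_2 = Counter(labels_1), Counter(labels_2)
    expected = sum(freq_1[c] * freq_2[c] for c in freq_1) / (n * n)
    return (observed - expected) / (1 - expected)

ann_1 = ["subordinated", "power-neutral", "dominant", "subordinated"]
ann_2 = ["subordinated", "power-neutral", "subordinated", "subordinated"]
print(cohens_kappa(ann_1, ann_2))  # ~0.56
```

Values near 1 would indicate the rubric is applied consistently across annotators; the referee's point is that no such number is reported.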

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful and constructive comments. The feedback identifies key opportunities to enhance the transparency of our methods and the presentation of our empirical results. We respond to each major comment below and will incorporate revisions to address the concerns.

Point-by-point responses
  1. Referee: [Abstract / Methods] The headline claim that subordinated portrayals of minoritized national identities are 'over fifty times more likely' than dominant ones is presented without any reported sample size, number of generations per nationality, prompt templates, inter-annotator agreement statistics, blinding procedures, or statistical tests. This absence prevents assessment of whether the 50x ratio is supported by the data or sensitive to annotation choices.

    Authors: We agree that the abstract would benefit from including high-level methodological details to support the central claim. The full Methods section already reports the sample size, generations per nationality, prompt templates, inter-annotator agreement, blinding procedures, and statistical tests used to validate the 50x ratio. In the revised manuscript we will expand the abstract with a concise summary of the study scale and statistical support. This change improves accessibility without altering the reported findings or their interpretation. (revision: yes)

  2. Referee: [Methods / Results] Annotation and coding scheme: The paper correctly notes that US-centric framing can distort classification of harms, yet the operational definitions of 'power-neutral,' 'subordinated character portrayals,' 'stereotypes,' and 'erasure' are applied via human annotation with no excerpted rubric, examples of coded stories, or discussion of how annotator cultural backgrounds were accounted for. Because these categories are central to the 50x ratio, any implicit framing in the coding scheme directly affects the validity of the central empirical result.

    Authors: We acknowledge that greater transparency around the annotation process is needed. Although operational definitions appear in the Methods, we will revise the manuscript to include an excerpted coding rubric, multiple examples of coded stories, and an explicit discussion of annotator cultural backgrounds and the steps taken to incorporate diverse perspectives. These materials will be added to the Methods section and/or an appendix, directly supporting evaluation of the 50x ratio and related results. (revision: yes)

Circularity Check

0 steps flagged

No circularity: empirical counts from LLM outputs and human annotations form an independent measurement chain.

full rationale

The paper performs a direct empirical study: it generates open-ended narratives from LLMs using fixed prompts, then applies human annotation to label categories such as stereotypes, erasure, and subordinated vs. dominant portrayals. The headline quantitative result (minoritized identities overrepresented in subordinated portrayals by a factor of fifty) is obtained by counting these labels across the generated corpus. No equations, fitted parameters, or first-principles derivations are present; therefore no step reduces by construction to its own inputs. Self-citations, if any, support background claims rather than load-bearing uniqueness theorems or ansatzes. The analysis remains self-contained as observational data collection and aggregation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the validity of the prompting strategy and the human coding scheme for harms; no free parameters, mathematical axioms, or new postulated entities are introduced.

axioms (1)
  • Domain assumption: Open-ended narrative prompts reliably surface underlying model representations of national identities.
    The study treats generated stories as diagnostic of model biases without additional validation that the prompt format itself does not induce the observed patterns; a sketch of one such check follows.
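One way to pressure-test this assumption, not reported in the summary above: hold nationality fixed while varying only the prompt template, then compare how much the subordinated-portrayal rate moves with template versus with nationality. A minimal sketch with invented rates:

```python
from statistics import mean, pstdev

# Invented subordinated-portrayal rates, indexed by prompt template and
# nationality; real values would come from annotated generations.
rates = {
    "story":  {"Nigerian": 0.31, "Bolivian": 0.28, "American": 0.02},
    "diary":  {"Nigerian": 0.29, "Bolivian": 0.30, "American": 0.03},
    "script": {"Nigerian": 0.33, "Bolivian": 0.27, "American": 0.02},
}
templates = list(rates)
nationalities = list(rates["story"])

# Prompt-format effect: spread across templates, nationality held fixed.
format_effect = mean(pstdev([rates[t][n] for t in templates])
                     for n in nationalities)

# Nationality effect: spread across nationalities, template held fixed.
nationality_effect = mean(pstdev([rates[t][n] for n in nationalities])
                          for t in templates)

# If the disparity reflects the model rather than the prompt format,
# the nationality effect should dwarf the format effect.
print(format_effect, nationality_effect)
```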

pith-pipeline@v0.9.0 · 5565 in / 1306 out tokens · 58117 ms · 2026-05-08T11:40:33.007732+00:00 · methodology

