
arXiv: 2404.07475 · v2 · submitted 2024-04-11 · cs.CL · cs.AI · cs.CY · cs.LG

Laissez-Faire Harms: Algorithmic Biases in Generative Language Models

Pith reviewed 2026-05-24 01:58 UTC · model grok-4.3

keywords: generative language models · algorithmic bias · open-ended prompts · intersectional identities · subordination · stereotyping · harms of omission · laissez-faire setting

The pith

Generative language models portray minoritized identities in subordinated roles hundreds to thousands of times more often than in representative or empowering ones, even when prompts leave identity unspecified.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests generative language models on open-ended prompts that do not mention race, gender, or sexual orientation. It documents that outputs from five widely used models systematically omit, subordinate, or stereotype people with intersectional minoritized identities at far higher rates than they produce representative or empowering portrayals. These patterns match known psychological harms such as stereotype threat. The work treats this laissez-faire prompting as a closer stand-in for ordinary consumer use than explicit identity prompts.

Core claim

In the laissez-faire setting of open-ended prompts, synthetically generated texts from ChatGPT3.5, ChatGPT4, Claude2.0, Llama2, and PaLM2 perpetuate harms of omission, subordination, and stereotyping for minoritized individuals with intersectional race, gender, and/or sexual orientation identities, making such individuals hundreds to thousands of times more likely to encounter LM-generated outputs that portray their identities in a subordinated manner compared to representative or empowering portrayals, while also reproducing stereotypes known to trigger psychological harms.

What carries the argument

The laissez-faire prompting method, in which identity classifications are left unspecified, paired with systematic coding of generated texts into the three harm categories of omission, subordination, and stereotyping.

If this is right

  • Minoritized consumers face elevated exposure to subordinating content in routine interactions with language models.
  • Stereotypes such as the perpetual foreigner appear in outputs and can activate stereotype threat that impairs performance and self-perception.
  • Existing bias-mitigation techniques focused on explicit identity prompts leave open-ended use cases largely unaddressed.
  • Consumer protections and targeted AI education programs become necessary to offset the documented harms.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Mitigation efforts may need to target the base training data or decoding procedures rather than prompt engineering alone.
  • Widespread deployment of these models could widen existing disparities in how different groups experience everyday information tools.
  • Regulators could treat generative outputs similarly to other media that carry documented group-based harms.
  • Testing the same models on prompts drawn from actual user logs would provide a stronger check on ecological validity.

Load-bearing premise

That open-ended prompts form a representative sample of everyday consumer use, and that analysts can categorize harms objectively without introducing their own bias.

What would settle it

A replication that finds the rate of subordinated portrayals for minoritized identities in model outputs is statistically indistinguishable from the rate for non-minoritized identities or from rates in comparable human-written texts.
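
As a rough illustration of what "statistically indistinguishable" could mean here, the sketch below runs a pooled two-proportion z-test on the share of subordinated portrayals in two groups. This is not the paper's method: the function name, the choice of test, and the counts are all hypothetical, and a real replication would also need to handle multiple comparisons and the coding reliability questions raised in the referee report.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided pooled z-test for equality of two proportions.

    hits_* is the number of portrayals coded as subordinated,
    n_* is the total number of coded portrayals for that group.
    Returns (z statistic, two-sided p-value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("group sizes must be positive")
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both groups are all hits or all misses: no detectable difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only (not taken from the paper).
z, p = two_proportion_z_test(hits_a=90, n_a=100, hits_b=10, n_b=100)
print(f"z = {z:.2f}, p = {p:.3g}")
```

A large p-value would be consistent with the "indistinguishable" outcome described above; a small one would not.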

Figures

Figures reproduced from arXiv: 2404.07475 by Cassidy Sugimoto, Evan Shieh, Faye-Marie Vassel, Thema Monroe-White.

Figure 1: Likelihoods by Race, Sexual Orientation, and Gender.

Figure 2: Overall Subordination Ratios by Gender and Race. 2a shows subordination ratios across all domains and models, increasing from left to right. Ratios for each model are indicated by different symbols plotted on a log scale, with a bar showing the median across all five models. Redder colors represent greater degrees of statistical confidence (p-values for the ratio distribution), compared against the null hy…

Figure 3: Subordination Ratios by Name and Racial Likelihoods. 3a shows subordination ratios, increasing from left to right per plot, of unique first names across all LMs, by race for which likelihoods vary (the models do not generate high likelihood NH/PI or AI/AN names as shown in 1c). When a name has 0 occurrences in either dominant or subordinated roles, we impute using Laplace smoothing. 3b plots overall subord…
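
The Figure 3 caption mentions imputing zero counts with Laplace smoothing before forming a subordination ratio. A minimal sketch of that idea follows. It assumes the ratio is the number of subordinated-role appearances divided by the number of dominant-role appearances, and that add-one smoothing is applied to both counts only when either is zero; neither detail is confirmed by the text above, and the function name and `alpha` parameter are hypothetical.

```python
def subordination_ratio(subordinated, dominant, alpha=1.0):
    """Ratio of subordinated to dominant role counts for one name or group.

    Assumption: when either count is zero, add `alpha` to both counts
    (Laplace / add-one smoothing) so the ratio stays finite and non-zero.
    """
    if subordinated < 0 or dominant < 0:
        raise ValueError("counts must be non-negative")
    if subordinated == 0 or dominant == 0:
        subordinated += alpha
        dominant += alpha
    return subordinated / dominant


# Hypothetical counts, for illustration only.
print(subordination_ratio(30, 10))  # no smoothing needed -> 3.0
print(subordination_ratio(99, 0))   # smoothed to 100 / 1 -> 100.0
```

Note how a single zero in the denominator is enough to produce a ratio in the hundreds under this scheme, which is one reason the smoothing choice matters when reading ratios plotted on a log scale.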
Abstract

The rapid deployment of generative language models (LMs) has raised concerns about social biases affecting the well-being of diverse consumers. The extant literature on generative LMs has primarily examined bias via explicit identity prompting. However, prior research on bias in earlier language-based technology platforms, including search engines, has shown that discrimination can occur even when identity terms are not specified explicitly. Studies of bias in LM responses to open-ended prompts (where identity classifications are left unspecified) are lacking and have not yet been grounded in end-consumer harms. Here, we advance studies of generative LM bias by considering a broader set of natural use cases via open-ended prompting. In this "laissez-faire" setting, we find that synthetically generated texts from five of the most pervasive LMs (ChatGPT3.5, ChatGPT4, Claude2.0, Llama2, and PaLM2) perpetuate harms of omission, subordination, and stereotyping for minoritized individuals with intersectional race, gender, and/or sexual orientation identities (AI/AN, Asian, Black, Latine, MENA, NH/PI, Female, Non-binary, Queer). We find widespread evidence of bias to an extent that such individuals are hundreds to thousands of times more likely to encounter LM-generated outputs that portray their identities in a subordinated manner compared to representative or empowering portrayals. We also document a prevalence of stereotypes (e.g. perpetual foreigner) in LM-generated outputs that are known to trigger psychological harms that disproportionately affect minoritized individuals. These include stereotype threat, which leads to impaired cognitive performance and increased negative self-perception. Our findings highlight the urgent need to protect consumers from discriminatory harms caused by language models and invest in critical AI education programs tailored towards empowering diverse consumers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that in a 'laissez-faire' setting of open-ended prompts without explicit identity terms, five major generative LMs (ChatGPT-3.5/4, Claude 2.0, Llama 2, PaLM 2) produce outputs that disproportionately inflict harms of omission, subordination, and stereotyping on minoritized intersectional identities (AI/AN, Asian, Black, Latine, MENA, NH/PI, Female, Non-binary, Queer). It quantifies that such individuals are 'hundreds to thousands of times more likely' to encounter subordinated portrayals than representative or empowering ones, and documents stereotypes (e.g., perpetual foreigner) known to trigger psychological harms such as stereotype threat.

Significance. If the measurement pipeline were shown to be reliable, the work would usefully extend bias research from explicit-identity prompting to naturalistic consumer use cases and would strengthen calls for consumer protections and targeted AI education. The emphasis on end-consumer harms and intersectionality is a constructive framing.

major comments (2)
  1. [Methods] Methods section: no inter-rater reliability statistics, coding manual, sample sizes for prompts/outputs, or controls for prompt sensitivity are reported. These details are load-bearing for the central 'hundreds to thousands of times more likely' ratio, which is produced by human classification of free-form text into omission/subordination/stereotyping categories.
  2. [Results] Results / Abstract: the extreme likelihood ratios rest on interpretive coding of LM outputs without reported blinding, replication, or sensitivity analysis; the ratios are therefore sensitive to how 'subordinated manner' is operationalized and to individual annotator thresholds.
minor comments (2)
  1. [Abstract] Abstract: the list of identities mixes racial/ethnic, gender, and sexual-orientation categories without explicit justification for the intersectional sampling frame.
  2. [Introduction] Notation: 'laissez-faire' is used as a technical term for open-ended prompting; a brief definition or contrast with prior explicit-prompting literature would aid clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which identify key areas for improving methodological transparency. We address each major comment below and will revise the manuscript to incorporate additional details where the original submission was incomplete.

Point-by-point responses
  1. Referee: [Methods] Methods section: no inter-rater reliability statistics, coding manual, sample sizes for prompts/outputs, or controls for prompt sensitivity are reported. These details are load-bearing for the central 'hundreds to thousands of times more likely' ratio, which is produced by human classification of free-form text into omission/subordination/stereotyping categories.

    Authors: We acknowledge that the Methods section does not report inter-rater reliability statistics, a full coding manual, exact sample sizes, or explicit controls for prompt sensitivity. These omissions limit the ability to fully assess the reliability of the human-coded ratios. In the revised manuscript, we will add these elements, including IRR metrics such as Cohen's kappa or Fleiss' kappa, the complete coding manual with category definitions and examples, precise counts of prompts and outputs per model and identity category, and any sensitivity checks on prompt variations. This will directly support the validity of the reported likelihood ratios. revision: yes

  2. Referee: [Results] Results / Abstract: the extreme likelihood ratios rest on interpretive coding of LM outputs without reported blinding, replication, or sensitivity analysis; the ratios are therefore sensitive to how 'subordinated manner' is operationalized and to individual annotator thresholds.

    Authors: We agree that the lack of reported blinding, replication procedures, and sensitivity analysis in the Results section leaves the ratios vulnerable to variations in operationalization and annotator judgment. The revised version will include details on the annotation protocol (including blinding where applicable), replication across multiple annotators, and sensitivity analyses that test alternative definitions of 'subordinated,' 'representative,' and 'empowering' categories. We will also note any limitations arising from annotator thresholds. These additions will be made to strengthen the presentation of the findings. revision: yes
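
The inter-rater reliability statistic named in the first response, Cohen's kappa, can be sketched in a few lines for the two-annotator case. The label names and the example annotations are hypothetical, and nothing here reflects how the authors actually coded their data; it only shows what the promised statistic measures, namely agreement corrected for the agreement expected by chance.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations, for illustration only.
annotator_1 = ["subordinated", "subordinated", "dominant", "dominant"]
annotator_2 = ["subordinated", "dominant", "dominant", "dominant"]
print(cohens_kappa(annotator_1, annotator_2))  # 0.5
```

With more than two annotators, Fleiss' kappa (also mentioned in the response) would be the usual alternative.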

Circularity Check

0 steps flagged

No circularity: empirical classification of model outputs with no derivations or self-referential reductions

Full rationale

The paper reports direct empirical observations of LM outputs under open-ended prompts, followed by human categorization into harms of omission, subordination, and stereotyping. No equations, fitted parameters, predictions derived from inputs, or mathematical derivations appear in the provided text. Central claims rest on counted frequencies in classified outputs rather than any self-definitional, fitted-input, or self-citation load-bearing steps that reduce results to inputs by construction. Self-citations, if present, are not invoked to justify uniqueness theorems or ansatzes that close a circular loop. The analysis is therefore self-contained against external benchmarks of model behavior.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim depends on unstated choices in prompt selection and harm categorization that function as domain assumptions; no free parameters or invented entities are visible in the abstract.

axioms (2)
  • domain assumption: Open-ended prompts without identity terms represent natural consumer use cases.
    The laissez-faire framing treats these prompts as proxies for real-world interactions where users do not specify identities.
  • domain assumption: Harms of omission, subordination, and stereotyping can be reliably identified and quantified from model text.
    The paper's ratio claims rest on this measurement step being reproducible and unbiased.



