Laissez-Faire Harms: Algorithmic Biases in Generative Language Models
Pith reviewed 2026-05-24 01:58 UTC · model grok-4.3
The pith
Generative language models portray minoritized identities in a subordinated manner hundreds to thousands of times more often than in representative or empowering ways, even when prompts leave identity unspecified.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the laissez-faire setting of open-ended prompts, synthetically generated texts from ChatGPT3.5, ChatGPT4, Claude2.0, Llama2, and PaLM2 perpetuate harms of omission, subordination, and stereotyping for minoritized individuals with intersectional race, gender, and/or sexual orientation identities, making such individuals hundreds to thousands of times more likely to encounter LM-generated outputs that portray their identities in a subordinated manner compared to representative or empowering portrayals, while also reproducing stereotypes known to trigger psychological harms.
What carries the argument
The laissez-faire prompting method, in which identity classifications are left unspecified, paired with systematic coding of generated texts into the three harm categories of omission, subordination, and stereotyping.
If this is right
- Minoritized consumers face elevated exposure to subordinating content in routine interactions with language models.
- Stereotypes such as the perpetual foreigner appear in outputs and can activate stereotype threat that impairs performance and self-perception.
- Existing bias-mitigation techniques focused on explicit identity prompts leave open-ended use cases largely unaddressed.
- Consumer protections and targeted AI education programs become necessary to offset the documented harms.
Where Pith is reading between the lines
- Mitigation efforts may need to target the base training data or decoding procedures rather than prompt engineering alone.
- Widespread deployment of these models could widen existing disparities in how different groups experience everyday information tools.
- Regulators could treat generative outputs similarly to other media that carry documented group-based harms.
- Testing the same models on prompts drawn from actual user logs would provide a stronger check on ecological validity.
Load-bearing premise
That open-ended prompts form a representative sample of everyday consumer use, and that analysts can categorize harms objectively without introducing their own bias.
What would settle it
A replication that finds the rate of subordinated portrayals for minoritized identities in model outputs is statistically indistinguishable from the rate for non-minoritized identities or from rates in comparable human-written texts.
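One way to make "statistically indistinguishable" concrete is a two-proportion test on coded outputs. The sketch below is a minimal illustration, not the paper's method: the function name and the counts in the usage example are hypothetical, and a pooled two-sided z-test is only one of several reasonable choices (an exact test may suit small samples better).

```python
import math


def two_proportion_z_test(sub_a, total_a, sub_b, total_b):
    """Two-sided pooled z-test for equality of two proportions.

    sub_a / total_a: subordinated portrayals and all portrayals for group A.
    sub_b / total_b: the same counts for group B (the comparison group).
    Returns (z, p_value).
    """
    p_a = sub_a / total_a
    p_b = sub_b / total_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (sub_a + sub_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are degenerate")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not taken from the paper).
    z, p = two_proportion_z_test(sub_a=420, total_a=1000, sub_b=400, total_b=1000)
    print(f"z = {z:.3f}, p = {p:.3f}")
```

A large p-value alone would not establish equivalence; a replication making the claim above would more plausibly pair a test like this with a pre-specified equivalence margin.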
Original abstract
The rapid deployment of generative language models (LMs) has raised concerns about social biases affecting the well-being of diverse consumers. The extant literature on generative LMs has primarily examined bias via explicit identity prompting. However, prior research on bias in earlier language-based technology platforms, including search engines, has shown that discrimination can occur even when identity terms are not specified explicitly. Studies of bias in LM responses to open-ended prompts (where identity classifications are left unspecified) are lacking and have not yet been grounded in end-consumer harms. Here, we advance studies of generative LM bias by considering a broader set of natural use cases via open-ended prompting. In this "laissez-faire" setting, we find that synthetically generated texts from five of the most pervasive LMs (ChatGPT3.5, ChatGPT4, Claude2.0, Llama2, and PaLM2) perpetuate harms of omission, subordination, and stereotyping for minoritized individuals with intersectional race, gender, and/or sexual orientation identities (AI/AN, Asian, Black, Latine, MENA, NH/PI, Female, Non-binary, Queer). We find widespread evidence of bias to an extent that such individuals are hundreds to thousands of times more likely to encounter LM-generated outputs that portray their identities in a subordinated manner compared to representative or empowering portrayals. We also document a prevalence of stereotypes (e.g. perpetual foreigner) in LM-generated outputs that are known to trigger psychological harms that disproportionately affect minoritized individuals. These include stereotype threat, which leads to impaired cognitive performance and increased negative self-perception. Our findings highlight the urgent need to protect consumers from discriminatory harms caused by language models and invest in critical AI education programs tailored towards empowering diverse consumers.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that in a 'laissez-faire' setting of open-ended prompts without explicit identity terms, five major generative LMs (ChatGPT-3.5/4, Claude 2.0, Llama 2, PaLM 2) produce outputs that disproportionately inflict harms of omission, subordination, and stereotyping on minoritized intersectional identities (AI/AN, Asian, Black, Latine, MENA, NH/PI, Female, Non-binary, Queer). It quantifies that such individuals are 'hundreds to thousands of times more likely' to encounter subordinated portrayals than representative or empowering ones, and documents stereotypes (e.g., perpetual foreigner) known to trigger psychological harms such as stereotype threat.
Significance. If the measurement pipeline were shown to be reliable, the work would usefully extend bias research from explicit-identity prompting to naturalistic consumer use cases and would strengthen calls for consumer protections and targeted AI education. The emphasis on end-consumer harms and intersectionality is a constructive framing.
major comments (2)
- [Methods] Methods section: no inter-rater reliability statistics, coding manual, sample sizes for prompts/outputs, or controls for prompt sensitivity are reported. These details are load-bearing for the central 'hundreds to thousands of times more likely' ratio, which is produced by human classification of free-form text into omission/subordination/stereotyping categories.
- [Results] Results / Abstract: the extreme likelihood ratios rest on interpretive coding of LM outputs without reported blinding, replication, or sensitivity analysis; the ratios are therefore sensitive to how 'subordinated manner' is operationalized and to individual annotator thresholds.
minor comments (2)
- [Abstract] Abstract: the list of identities mixes racial/ethnic, gender, and sexual-orientation categories without explicit justification for the intersectional sampling frame.
- [Introduction] Notation: 'laissez-faire' is used as a technical term for open-ended prompting; a brief definition or contrast with prior explicit-prompting literature would aid clarity.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which identify key areas for improving methodological transparency. We address each major comment below and will revise the manuscript to incorporate additional details where the original submission was incomplete.
-
Referee: [Methods] Methods section: no inter-rater reliability statistics, coding manual, sample sizes for prompts/outputs, or controls for prompt sensitivity are reported. These details are load-bearing for the central 'hundreds to thousands of times more likely' ratio, which is produced by human classification of free-form text into omission/subordination/stereotyping categories.
Authors: We acknowledge that the Methods section does not report inter-rater reliability statistics, a full coding manual, exact sample sizes, or explicit controls for prompt sensitivity. These omissions limit the ability to fully assess the reliability of the human-coded ratios. In the revised manuscript, we will add these elements, including IRR metrics such as Cohen's kappa or Fleiss' kappa, the complete coding manual with category definitions and examples, precise counts of prompts and outputs per model and identity category, and any sensitivity checks on prompt variations. This will directly support the validity of the reported likelihood ratios.
Revision: yes
-
Referee: [Results] Results / Abstract: the extreme likelihood ratios rest on interpretive coding of LM outputs without reported blinding, replication, or sensitivity analysis; the ratios are therefore sensitive to how 'subordinated manner' is operationalized and to individual annotator thresholds.
Authors: We agree that the lack of reported blinding, replication procedures, and sensitivity analysis in the Results section leaves the ratios vulnerable to variations in operationalization and annotator judgment. The revised version will include details on the annotation protocol (including blinding where applicable), replication across multiple annotators, and sensitivity analyses that test alternative definitions of 'subordinated,' 'representative,' and 'empowering' categories. We will also note any limitations arising from annotator thresholds. These additions will be made to strengthen the presentation of the findings.
Revision: yes
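The inter-rater reliability statistic named in the first response, Cohen's kappa, can be sketched for two annotators as follows. This is a minimal illustration under stated assumptions: the function name and the label lists are hypothetical, and the paper's actual annotation data and procedure are not reported in the text above.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    # Agreement expected by chance, from each annotator's label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Convention chosen for this sketch: both annotators used a single,
        # identical label, so agreement is treated as perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical annotations using the three harm categories.
    rater_1 = ["omission", "subordination", "stereotyping", "subordination"]
    rater_2 = ["omission", "subordination", "subordination", "subordination"]
    print(f"kappa = {cohens_kappa(rater_1, rater_2):.3f}")
```

Cohen's kappa covers exactly two annotators; with three or more, Fleiss' kappa (also named in the response) would be the usual substitute.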
Circularity Check
No circularity: empirical classification of model outputs with no derivations or self-referential reductions
The paper reports direct empirical observations of LM outputs under open-ended prompts, followed by human categorization into harms of omission, subordination, and stereotyping. No equations, fitted parameters, predictions derived from inputs, or mathematical derivations appear in the provided text. Central claims rest on counted frequencies in classified outputs rather than any self-definitional, fitted-input, or self-citation load-bearing steps that reduce results to inputs by construction. Self-citations, if present, are not invoked to justify uniqueness theorems or ansatzes that close a circular loop. The analysis is therefore self-contained against external benchmarks of model behavior.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Open-ended prompts without identity terms represent natural consumer use cases.
- domain assumption: Harms of omission, subordination, and stereotyping can be reliably identified and quantified from model text.
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
unclear: Relation between the paper passage and the cited Recognition theorem.
We quantify their relative frequency using the subordination ratio (see Equation 4), which we define as the proportion of a demographic observed in the subordinate role compared to the dominant role.
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean: absolute_floor_iff_bare_distinguishability
unclear: Relation between the paper passage and the cited Recognition theorem.
We model race using first name as the majority (90.9%) of LM responses... fractionalized counting
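The two quoted passages describe a subordination ratio and fractionalized counting by first name. The sketch below shows one way these could fit together, following only the verbal definition quoted above; the paper's Equation 4 is not reproduced here, so the exact form is an assumption. The name-to-group probabilities, the story records, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical name-to-group probabilities (illustrative values only).
NAME_PRIORS = {
    "Maria": {"Latine": 0.7, "White": 0.3},
    "John": {"White": 0.8, "Black": 0.2},
}


def fractional_role_counts(stories, name_priors):
    """Accumulate fractional group counts per role.

    Each story is a dict with a 'dominant' and a 'subordinate' character name.
    A name contributes its probability mass to each group, not a whole count.
    Names missing from name_priors contribute nothing.
    """
    counts = {"dominant": defaultdict(float), "subordinate": defaultdict(float)}
    for story in stories:
        for role in ("dominant", "subordinate"):
            for group, weight in name_priors.get(story[role], {}).items():
                counts[role][group] += weight
    return counts


def subordination_ratio(counts, group):
    """Fractional count in the subordinate role divided by the dominant role."""
    dominant = counts["dominant"].get(group, 0.0)
    subordinate = counts["subordinate"].get(group, 0.0)
    if dominant == 0:
        # Never observed in the dominant role: the ratio is unbounded,
        # or undefined if the group was not observed in either role.
        return float("inf") if subordinate > 0 else float("nan")
    return subordinate / dominant


if __name__ == "__main__":
    stories = [
        {"dominant": "John", "subordinate": "Maria"},
        {"dominant": "Maria", "subordinate": "John"},
        {"dominant": "John", "subordinate": "Maria"},
    ]
    counts = fractional_role_counts(stories, NAME_PRIORS)
    for group in ("White", "Latine", "Black"):
        print(group, subordination_ratio(counts, group))
```

A ratio above 1 would mean the group appears more often in the subordinate role than in the dominant one. Very small dominant-role counts inflate the ratio sharply, which is one reason the referee's request for sample sizes bears on the headline figures.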
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.