Recognition: 2 theorem links · Lean Theorem
LLM Harms: A Taxonomy and Discussion
Pith reviewed 2026-05-17 00:23 UTC · model grok-4.3
The pith
A taxonomy of LLM harms across five lifecycle stages supports mitigation strategies and a dynamic auditing system for responsible development.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper proposes a taxonomy of five categories of harm that arise before, during, and after the development of AI applications, although the abstract names only four: pre-development, direct output, misuse and malicious application, and downstream application. By defining the risks in the current landscape, the taxonomy aims to support accountability, transparency, and the handling of bias when LLMs are adapted for practical applications. The paper also proposes mitigation strategies and future directions for specific domains, along with a dynamic auditing system intended to standardize responsible development and integration.
What carries the argument
The taxonomy itself carries the argument. It classifies harms into pre-development, direct output, misuse and malicious application, and downstream application stages, and this classification organizes the risks and underpins the proposed mitigation and auditing steps.
Load-bearing premise
The listed categories comprehensively capture all relevant LLM harms without leaving out major types of damage or negative effects.
What would settle it
Discovery of a clear example of LLM-related harm that fits none of the categories pre-development, direct output, misuse and malicious application, or downstream application would show the taxonomy is incomplete.
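The falsification test above can be pictured as a coverage check. The sketch below is illustrative only and is not taken from the paper: it assumes each observed harm has been hand-labelled with zero or more of the four categories the abstract names, and the `Category` enum, the `uncovered` helper, and the sample incidents are all hypothetical.

```python
from enum import Enum


class Category(Enum):
    """The four categories named in the abstract (a fifth is claimed but not listed)."""
    PRE_DEVELOPMENT = "pre-development"
    DIRECT_OUTPUT = "direct output"
    MISUSE = "misuse and malicious application"
    DOWNSTREAM = "downstream application"


def uncovered(labelled_harms):
    """Return the harms whose label set is empty, i.e. that fit no category."""
    return [harm for harm, labels in labelled_harms.items() if not labels]


# Hypothetical hand-labelled incidents, for illustration only.
incidents = {
    "biased medical report": {Category.DIRECT_OUTPUT},
    "phishing campaign written with an LLM": {Category.MISUSE},
    "harm with no matching stage": set(),
}

counterexamples = uncovered(incidents)
print(counterexamples)  # prints ['harm with no matching stage']
```

Under this reading, any non-empty result would count as evidence that the taxonomy is incomplete, provided the labelling itself is uncontested.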
Original abstract
This study addresses categories of harm surrounding Large Language Models (LLMs) in the field of artificial intelligence. It addresses five categories of harms addressed before, during, and after development of AI applications: pre-development, direct output, Misuse and Malicious Application, and downstream application. By underscoring the need to define risks of the current landscape to ensure accountability, transparency and navigating bias when adapting LLMs for practical applications. It proposes mitigation strategies and future directions for specific domains and a dynamic auditing system guiding responsible development and integration of LLMs in a standardized proposal.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a taxonomy of LLM harms organized into five categories spanning pre-development, direct output, misuse and malicious application, and downstream application. It highlights the need to define these risks to promote accountability, transparency, and bias mitigation in LLM applications, and puts forward mitigation strategies for specific domains along with a dynamic auditing system to support responsible development and integration.
Significance. If the taxonomy is shown to be comprehensive and the auditing proposal is operationalized with clear criteria, the work could provide a useful organizing framework for discussions on LLM governance in the Computers and Society community. As a purely conceptual piece without empirical validation, incident analysis, or comparison to prior taxonomies, its contribution remains primarily discursive rather than prescriptive.
major comments (3)
- Abstract: The manuscript asserts five categories of harms but explicitly enumerates only four (pre-development, direct output, Misuse and Malicious Application, and downstream application). This internal inconsistency must be resolved by identifying the fifth category and explaining its distinct scope.
- Abstract and main text: No section or subsection describes the method used to construct the taxonomy (e.g., systematic literature mapping, incident database review, or expert elicitation). Because the central claim—that a standardized dynamic auditing system can guide responsible LLM integration—rests on the taxonomy comprehensively partitioning harms without major overlaps or omissions, the absence of a construction rationale is load-bearing.
- Abstract: The proposal for mitigation strategies and future directions is stated at a high level without concrete illustrations, domain-specific examples, or discussion of how the dynamic auditing system would handle boundary cases such as training-data extraction or emergent multi-turn behaviors.
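The partition requirement in the second comment, and the boundary cases in the third, can be made concrete with a small check. The following is a hedged sketch, not the authors' auditing system: the `partition_report` function and the label assignments are assumptions made for illustration, including the choice to tag training-data extraction with two categories and to leave emergent multi-turn behavior unlabelled.

```python
def partition_report(assignments, categories):
    """Split harms into omissions (no category) and overlaps (more than one)."""
    omissions, overlaps = [], []
    for harm, labels in assignments.items():
        # Ignore labels that are not part of the taxonomy under test.
        known = set(labels) & set(categories)
        if len(known) == 0:
            omissions.append(harm)
        elif len(known) > 1:
            overlaps.append(harm)
    return {"omissions": sorted(omissions), "overlaps": sorted(overlaps)}


# The four categories the abstract enumerates.
CATEGORIES = [
    "pre-development",
    "direct output",
    "misuse and malicious application",
    "downstream application",
]

# Hypothetical assignments; the labels are illustrative guesses.
assignments = {
    "training-data extraction": ["pre-development", "misuse and malicious application"],
    "emergent multi-turn behavior": [],
    "toxic completion": ["direct output"],
}

report = partition_report(assignments, CATEGORIES)
print(report)
```

A taxonomy that truly partitions harms would yield empty lists for both keys on any agreed labelling; a documented construction method would say how such labellings are produced.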
minor comments (3)
- Abstract: The sentence 'It addresses five categories of harms addressed before, during, and after development' contains redundant wording that should be revised for clarity.
- Abstract: Category names show inconsistent capitalization ('Misuse and Malicious Application' versus the others); standardize formatting throughout.
- Abstract: The manuscript would benefit from explicit references to existing AI ethics taxonomies (e.g., those from NIST, EU AI Act documentation, or prior LLM-specific surveys) to situate the proposed framework.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We address each major comment point by point below, indicating where revisions will be made to strengthen the manuscript.
Point-by-point responses
- Referee: Abstract: The manuscript asserts five categories of harms but explicitly enumerates only four (pre-development, direct output, Misuse and Malicious Application, and downstream application). This internal inconsistency must be resolved by identifying the fifth category and explaining its distinct scope.
  Authors: We thank the referee for identifying this inconsistency. The abstract references harms 'before, during, and after development,' which corresponds to five categories: pre-development, during development, direct output, Misuse and Malicious Application, and downstream application. The during-development category (encompassing issues in training, fine-tuning, and data curation) was omitted from the enumerated list. We will revise the abstract to explicitly list and briefly scope all five categories.
  Revision: yes
- Referee: Abstract and main text: No section or subsection describes the method used to construct the taxonomy (e.g., systematic literature mapping, incident database review, or expert elicitation). Because the central claim, that a standardized dynamic auditing system can guide responsible LLM integration, rests on the taxonomy comprehensively partitioning harms without major overlaps or omissions, the absence of a construction rationale is load-bearing.
  Authors: We acknowledge that the manuscript lacks an explicit description of the taxonomy construction process. The taxonomy was developed via synthesis of prior AI ethics literature, reported incidents, and governance discussions. We will add a dedicated subsection (likely in the introduction) outlining this rationale and the criteria used to ensure the five categories partition harms comprehensively with minimal overlaps.
  Revision: yes
- Referee: Abstract: The proposal for mitigation strategies and future directions is stated at a high level without concrete illustrations, domain-specific examples, or discussion of how the dynamic auditing system would handle boundary cases such as training-data extraction or emergent multi-turn behaviors.
  Authors: We agree the mitigation strategies and dynamic auditing proposal are high-level. In revision we will add concrete domain-specific examples (e.g., healthcare and education) and a discussion of boundary cases, including how the auditing system would address training-data extraction and emergent multi-turn behaviors, to increase operational clarity.
  Revision: yes
Circularity Check
No circularity: taxonomy asserted from domain knowledge without self-referential reduction.
Full rationale
The paper presents a taxonomy of LLM harms organized around development stages (pre-development, direct output, misuse/malicious application, downstream application) and proposes mitigation strategies plus a dynamic auditing system. No equations, fitted parameters, or derivations appear in the provided abstract or description. Claims rest on general domain knowledge of AI application lifecycles rather than any self-definition, self-citation chain, or renaming of prior results. The central proposal does not reduce to its inputs by construction; the categories are stated as a partitioning without a quoted mechanism that would make completeness tautological. This is a standard non-circular discussion paper.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLMs can produce harms at distinct stages of development and application that can be usefully grouped into categories for mitigation.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear (relation between the paper passage and the cited Recognition theorem)
  Paper passage: This study addresses five categories of harms... pre-development, direct output, Misuse and Malicious Application, and downstream application... dynamic auditing system
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear (relation between the paper passage and the cited Recognition theorem)
  Paper passage: Section IV presents the taxonomy... 4.1 Pre-Deployment Harms... 4.2 Direct Output Harms...
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- From Notepad AI to Social Media: How Can Text Style Transformation Mitigate Social Harm?
  A framework transforms aggressive social media text into neutral styles while preserving semantics, measured by a new Emotion Drift Index to reduce online harm.