arxiv: 2601.06116 · v5 · pith:UVTI4ZJ4new · submitted 2026-01-03 · 💻 cs.AI · cs.CL · cs.CY

The Homogenization Problem in LLMs: Towards Meaningful Diversity in AI Safety

Pith reviewed 2026-05-21 16:58 UTC · model grok-4.3

classification 💻 cs.AI · cs.CL · cs.CY
keywords homogenization · AI safety · large language models · diversity · bias amplification · normativity · xeno-reproduction · gender bias

The pith

Large language models homogenize outputs by reproducing and amplifying training biases, so AI safety must center on preserving meaningful diversity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that generative AI models reproduce human biases from training data and amplify them through mechanisms like mode collapse, leading to homogenization that harms minoritized groups while limiting everyone by narrowing the range of outputs. If correct, this would make addressing the loss of diversity a core AI safety priority rather than a secondary issue. The authors introduce a framework that lets stakeholders encode their own contexts and value systems to characterize homogenization in LLMs. They demonstrate the approach through an experiment detecting gender bias in Claude 3.5 Haiku on open-ended story prompts. Drawing on normativity to define the problem and xeno-reproduction as tasks that promote diversity, the work seeks to open collaborative research on advancing variety in AI.

Core claim

Generative AI models reproduce the human biases in their training data and further amplify them through mechanisms such as mode collapse. The loss of diversity produces homogenization, which not only harms the minoritized but impoverishes everyone. We argue homogenization should be a central concern in AI safety. To meaningfully characterize homogenization in Large Language Models (LLMs), we introduce a framework that allows stakeholders to encode their context and value system. We illustrate our approach with an experiment that surfaces gender bias in an LLM (Claude 3.5 Haiku) on an open-ended story prompt. Building from queer theory, we formalize homogenization in terms of normativity. Our

What carries the argument

The stakeholder-encoding framework for characterizing homogenization, which lets users define normativity from their values and uses xeno-reproduction tasks to promote diversity.

Load-bearing premise

That concepts of normativity and xeno-reproduction drawn from queer and feminist theory provide a rigorous and actionable formalization for measuring and mitigating homogenization in current LLMs.

What would settle it

If the stakeholder framework applied to Claude 3.5 Haiku and similar models shows no better detection or mitigation of homogenization than standard bias tests across repeated story-generation experiments, that would indicate the approach does not meaningfully characterize the problem.

Figures

Figures reproduced from arXiv: 2601.06116 by Ian Rios-Sialer.

Figure 1
Figure 1: Illustration of how system cores and orientations evolve through trajectories. In the example above, our system has sub-community representation structures. We can calculate each compliance by asking a judge LLM [144] to rate from [0,1] based on whether the community subgroup is explicitly represented in the string of text. Though initially ambiguous, the phrasing of the prompt may invite stereotyping, bia…
Original abstract

Generative AI models reproduce the human biases in their training data and further amplify them through mechanisms such as mode collapse. The loss of diversity produces homogenization, which not only harms the minoritized but impoverishes everyone. We argue homogenization should be a central concern in AI safety. To meaningfully characterize homogenization in Large Language Models (LLMs), we introduce a framework that allows stakeholders to encode their context and value system. We illustrate our approach with an experiment that surfaces gender bias in an LLM (Claude 3.5 Haiku) on an open-ended story prompt. Building from queer theory, we formalize homogenization in terms of normativity. Borrowing language from feminist theory, we introduce the concept of xeno-reproduction as a class of tasks for mitigating homogenization by promoting diversity. Our work opens a collaborative line of research that seeks to understand and advance diversity in AI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper argues that homogenization in LLMs—arising from reproduction and amplification of biases in training data via mechanisms such as mode collapse—should be treated as a central AI safety concern. It introduces a conceptual framework allowing stakeholders to encode their context and value systems in order to characterize homogenization, drawing on queer theory to formalize it in terms of normativity and on feminist theory to define xeno-reproduction as a class of mitigation tasks. The approach is illustrated by an informal experiment that surfaces gender bias in Claude 3.5 Haiku on an open-ended story prompt.

Significance. If the framework can be equipped with reproducible operational mappings from stakeholder values to concrete LLM metrics, the work would usefully expand AI safety discourse to treat loss of diversity as a first-class issue and could support more context-sensitive evaluation of model outputs. The manuscript's explicit call for collaborative, interdisciplinary research on this topic is a constructive contribution.

major comments (2)
  1. [Framework introduction] Framework section: the claim that the framework 'allows stakeholders to encode their context and value system' to meaningfully characterize homogenization rests on formalizing the target via normativity and xeno-reproduction, yet no explicit mapping is supplied from these concepts to computable quantities such as output diversity statistics, token-distribution entropy, or prompt-response pair metrics. This absence is load-bearing for the central claim of actionable characterization.
  2. [Illustrative experiment] Illustrative experiment: the gender-bias demonstration on Claude 3.5 Haiku is described only as an illustration and supplies neither quantitative metrics, error analysis, baseline comparisons, nor a worked example of how a stakeholder would encode a specific value system to produce the reported output. This leaves the reproducibility and generality of the encoding procedure untested.
minor comments (2)
  1. [Abstract] The abstract would benefit from a short clause noting that the current contribution is conceptual and that operationalization and empirical validation remain future work.
  2. [References] Adding precise citations to the specific queer-theory and feminist-theory sources invoked would clarify the provenance of the borrowed terminology.
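
Major comment 1 names output diversity statistics and token-distribution entropy as examples of the computable quantities the framework is not yet mapped to. The sketch below shows what two such quantities might look like on a small batch of model outputs. It is not taken from the paper: the function names `token_entropy` and `distinct_n`, the whitespace tokenization, and the toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def token_entropy(texts):
    """Shannon entropy (in bits) of the pooled whitespace-token distribution."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def distinct_n(texts, n=2):
    """Share of n-grams, pooled over all texts, that are unique."""
    ngrams = []
    for t in texts:
        toks = t.lower().split()
        ngrams.extend(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


# Toy outputs (hypothetical, not from the paper's experiment).
homogeneous = ["she was a nurse", "she was a nurse", "she was a nurse"]
varied = ["she was a nurse", "he built a boat", "they wrote new songs"]

for name, batch in [("homogeneous", homogeneous), ("varied", varied)]:
    print(name, token_entropy(batch), distinct_n(batch, 2))
```

Lower values on both measures indicate more repetitive output. Neither measure knows anything about stakeholder values, which is the gap the referee points to: a surface statistic of this kind would still need to be tied to a specific encoding of normativity.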

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments, which help clarify the positioning of our conceptual contribution. We agree that the manuscript would benefit from greater explicitness regarding the illustrative nature of the experiment and potential pathways to operationalization. We respond to each major comment below and indicate planned revisions.

Point-by-point responses
  1. Referee: Framework section: the claim that the framework 'allows stakeholders to encode their context and value system' to meaningfully characterize homogenization rests on formalizing the target via normativity and xeno-reproduction, yet no explicit mapping is supplied from these concepts to computable quantities such as output diversity statistics, token-distribution entropy, or prompt-response pair metrics. This absence is load-bearing for the central claim of actionable characterization.

    Authors: We appreciate this observation. The framework is intentionally conceptual at this stage, using normativity (from queer theory) to define homogenization as the enforcement of dominant norms and xeno-reproduction (from feminist theory) to outline mitigation tasks that promote divergence from those norms. Stakeholders are invited to encode values by interpreting these concepts relative to their own contexts, rather than through a fixed computational procedure supplied in the paper. We acknowledge that the absence of explicit mappings to metrics such as entropy or diversity statistics limits immediate actionability. We will revise the framework section to include a brief discussion of example operationalizations (e.g., relating normativity to reduced output variance on value-laden prompts) while preserving the interdisciplinary, non-prescriptive character of the work. revision: yes

  2. Referee: Illustrative experiment: the gender-bias demonstration on Claude 3.5 Haiku is described only as an illustration and supplies neither quantitative metrics, error analysis, baseline comparisons, nor a worked example of how a stakeholder would encode a specific value system to produce the reported output. This leaves the reproducibility and generality of the encoding procedure untested.

    Authors: The demonstration is presented as an informal illustration to show how the framework can surface homogenization in practice, not as a rigorous empirical evaluation. We agree that it lacks quantitative metrics, error analysis, baselines, and a detailed encoding walkthrough, which restricts claims about reproducibility. We will revise the relevant section to state more explicitly that the example is illustrative only, to discuss its limitations, and to sketch one hypothetical worked example of value encoding (e.g., a stakeholder prioritizing gender diversity specifying prompt constraints). A full reproducible protocol would require additional empirical work beyond the current scope. revision: partial

Circularity Check

1 step flagged

Homogenization characterization defined via stakeholder value encoding that the framework itself elicits

specific steps
  1. self-definitional [Abstract]
    "To meaningfully characterize homogenization in Large Language Models (LLMs), we introduce a framework that allows stakeholders to encode their context and value system. ... Building from queer theory, we formalize homogenization in terms of normativity."

    The framework is defined as the mechanism for encoding stakeholder value systems to characterize homogenization, yet homogenization itself is formalized in terms of normativity that depends on those same value systems. This makes the target quantity (homogenization) a function of the inputs the framework elicits, reducing the characterization to a definitional loop by construction.

full rationale

The paper's central move introduces a framework whose purpose is to let stakeholders encode context and value systems in order to characterize homogenization. This creates a self-definitional structure: what counts as homogenization (via normativity) is determined by the same value-system inputs the framework is designed to solicit. No equations or fitted parameters are present, but the load-bearing claim reduces to this definitional loop rather than an independent mapping to LLM observables. The gender-bias illustration remains an informal demonstration and does not break the loop. No self-citations or imported uniqueness theorems appear in the provided text.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on untested assumptions imported from social theory and on the premise that stakeholder-encoded values can be operationalized without introducing new circularity or selection effects.

axioms (2)
  • domain assumption: Loss of diversity through homogenization harms the minoritized and impoverishes everyone
    Stated directly in the abstract as justification for elevating homogenization to a central AI safety concern.
  • ad hoc to paper: Queer theory supplies a useful formalization of homogenization in terms of normativity
    The abstract explicitly borrows this framing to characterize the problem.
invented entities (1)
  • xeno-reproduction (no independent evidence)
    purpose: A class of tasks for mitigating homogenization by promoting diversity
    New term introduced by borrowing language from feminist theory; no independent empirical handle provided in the abstract.

pith-pipeline@v0.9.0 · 5676 in / 1348 out tokens · 76314 ms · 2026-05-21T16:58:39.639666+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
