
arxiv: 2505.19353 · v1 · submitted 2025-05-25 · 💻 cs.AI · cs.CL · cs.CY · cs.SE

Architectures of Error: A Philosophical Inquiry into AI and Human Code Generation

Pith reviewed 2026-05-19 12:58 UTC · model grok-4.3

classification 💻 cs.AI · cs.CL · cs.CY · cs.SE
keywords AI code generation · error architectures · human-AI collaboration · epistemic distinction · semantic coherence · software security · generative models · philosophy of AI

The pith

Distinct architectures of error distinguish human and AI code generation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to establish a clear separation between the error profiles of human programmers and those of generative AI systems used for coding. It does so by framing these as different architectures rooted in cognitive versus stochastic processes. A sympathetic reader would care because this separation could reshape how collaboration between humans and AI is managed in software projects. The analysis explores how these differences interact and change as technology advances, offering tools for both philosophical reflection and practical engineering decisions.

Core claim

By examining the common vulnerability to error in code generation, distinct architectures of error can be articulated that reveal fundamentally different causal origins between human-cognitive and artificial-stochastic processes. Grounded in relevant philosophical approaches, this distinction raises important questions about semantic coherence, security robustness, epistemic limits, and control mechanisms within human-AI collaborative software development, with levels of abstraction providing insight into their interactions and potential evolution.

What carries the argument

Architectures of Error, a framework for contrasting the sources and structures of mistakes in code from human versus artificial origins.

If this is right

  • If the distinction holds, collaborative development must address unique challenges to semantic coherence in combined outputs.
  • Security measures should account for the stochastic nature of AI errors separately from human ones.
  • Understanding epistemic limits helps in setting appropriate reliance on AI-generated code.
  • Control mechanisms can be refined to manage the specific error types from each contributor.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This could lead to specialized debugging tools that treat AI-generated code differently from human-written code.
  • The framework might extend to other areas of human-AI interaction beyond coding, such as content creation.
  • Advancements in AI could alter these error architectures, requiring ongoing reassessment of the distinction.

Load-bearing premise

The origins of errors in human code generation and AI code generation are fundamentally different in nature and can be separated using philosophical analysis.

What would settle it

An empirical study that finds no meaningful difference in the causal origins of errors between human-written and AI-generated code segments would falsify the central distinction.

original abstract

With the rise of generative AI (GenAI), Large Language Models are increasingly employed for code generation, becoming active co-authors alongside human programmers. Focusing specifically on this application domain, this paper articulates distinct "Architectures of Error" to ground an epistemic distinction between human and machine code generation. Examined through their shared vulnerability to error, this distinction reveals fundamentally different causal origins: human-cognitive versus artificial-stochastic. To develop this framework and substantiate the distinction, the analysis draws critically upon Dennett's mechanistic functionalism and Rescher's methodological pragmatism. I argue that a systematic differentiation of these error profiles raises critical philosophical questions concerning semantic coherence, security robustness, epistemic limits, and control mechanisms in human-AI collaborative software development. The paper also utilizes Floridi's levels of abstraction to provide a nuanced understanding of how these error dimensions interact and may evolve with technological advancements. This analysis aims to offer philosophers a structured framework for understanding GenAI's unique epistemological challenges, shaped by these architectural foundations, while also providing software engineers a basis for more critically informed engagement.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper articulates a framework of 'Architectures of Error' to establish an epistemic distinction between human-cognitive and artificial-stochastic causal origins of errors in generative AI code generation. Drawing on Dennett's mechanistic functionalism and Rescher's methodological pragmatism, with Floridi's levels of abstraction for interaction analysis, it claims this differentiation raises questions about semantic coherence, security robustness, epistemic limits, and control in human-AI collaborative software development.

Significance. If the distinction is shown to follow rigorously from the cited sources rather than being presupposed, the framework could supply philosophers with a structured lens for GenAI epistemological issues and give software engineers a basis for more critical engagement with error profiles in code co-generation. The bridging intent between philosophy and engineering practice is a positive feature.

major comments (2)
  1. [Framework development (drawing on Dennett and Rescher)] The central claim that the distinction between human-cognitive and artificial-stochastic error origins is substantiated by Dennett's mechanistic functionalism and Rescher's methodological pragmatism lacks an explicit derivation. Functionalism permits equivalent mechanistic descriptions of biological and artificial processes, yet the manuscript does not demonstrate why AI code-generation errors must be treated as irreducibly stochastic while human ones remain cognitive.
  2. [Application of Floridi's levels of abstraction] Floridi's levels of abstraction are invoked to provide a nuanced understanding of error dimension interactions, but the text does not show how this resolves the grounding gap for the fundamental causal-origin distinction, leaving the framework dependent on an interpretive premise rather than a derived result.
minor comments (2)
  1. [Introduction and abstract] The term 'Architectures of Error' is introduced as a novel framing but would benefit from an early operational definition or schematic to clarify its relation to the cited philosophers before implications are drawn.
  2. [Philosophical analysis sections] References to specific passages in Dennett, Rescher, and Floridi could be expanded with direct quotations or page citations to strengthen the interpretive links.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The comments identify areas where the philosophical grounding of the 'Architectures of Error' framework can be made more explicit, and we will undertake revisions to address these points directly.

point-by-point responses
  1. Referee: [Framework development (drawing on Dennett and Rescher)] The central claim that the distinction between human-cognitive and artificial-stochastic error origins is substantiated by Dennett's mechanistic functionalism and Rescher's methodological pragmatism lacks an explicit derivation. Functionalism permits equivalent mechanistic descriptions of biological and artificial processes, yet the manuscript does not demonstrate why AI code-generation errors must be treated as irreducibly stochastic while human ones remain cognitive.

    Authors: We accept that the manuscript would benefit from a more explicit derivation. Although the text draws on Dennett to note that functionalism permits multiple realizations without equating intentional human cognition to stochastic token prediction, and on Rescher to contrast pragmatic error resolution in human problem-solving with probabilistic sampling, the step-by-step linkage to the irreducibly stochastic character of AI errors is not laid out with sufficient clarity. In the revised version we will add a dedicated subsection that derives the distinction by showing how Dennett's intentional stance applies to human cognitive errors but cannot be sustained for LLM outputs, while Rescher's methodological pragmatism highlights the non-cognitive, distribution-based nature of artificial errors even under mechanistic description. revision: yes

  2. Referee: [Application of Floridi's levels of abstraction] Floridi's levels of abstraction are invoked to provide a nuanced understanding of error dimension interactions, but the text does not show how this resolves the grounding gap for the fundamental causal-origin distinction, leaving the framework dependent on an interpretive premise rather than a derived result.

    Authors: We agree that the manuscript invokes Floridi's levels primarily to analyze interactions among error dimensions without sufficiently demonstrating how the levels themselves help derive the causal-origin distinction. In the revision we will expand the relevant discussion to show explicitly that adopting the computational level of abstraction isolates the stochastic sampling mechanism in generative models, while the intentional level preserves the cognitive character of human error, thereby using the levels to ground rather than merely illustrate the distinction. revision: yes
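
The "stochastic sampling mechanism" that the rebuttal places at the computational level can be made concrete with a toy sketch. The Python below is an illustration only, not the paper's method and not any particular model's decoder: the three-token vocabulary, the logit values, and the function names `softmax`, `sample_token`, and `generate` are all hypothetical. It assumes temperature-scaled softmax sampling, a common decoding scheme, and contrasts it with greedy selection (temperature 0).

```python
import math
import random

# Hypothetical toy setup: candidate tokens for a loop bound and made-up scores.
VOCAB = ["<", "<=", "=="]
LOGITS = [2.0, 1.5, 0.2]


def softmax(logits, temperature):
    """Convert raw scores into a probability distribution at a given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Return a token index: greedy argmax at temperature 0, otherwise a random draw."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


def generate(n, temperature, seed):
    """Draw n independent tokens from the toy distribution with a seeded generator."""
    rng = random.Random(seed)
    return [VOCAB[sample_token(LOGITS, temperature, rng)] for _ in range(n)]


if __name__ == "__main__":
    print("greedy :", generate(10, 0, seed=1))
    print("sampled:", generate(10, 1.0, seed=1))
```

Under these assumptions, greedy decoding always emits the top-scoring operator, while sampling at temperature 1.0 will sometimes emit `<=` or `==` in its place. That distribution-driven variation is the kind of behavior the paper labels artificial-stochastic. Fixing the seed makes a run repeatable without making the underlying process any less distribution-based.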

Circularity Check

0 steps flagged

No significant circularity; framework grounded in external philosophical sources

full rationale

The paper articulates 'Architectures of Error' by drawing upon Dennett's mechanistic functionalism and Rescher's methodological pragmatism to substantiate the claimed epistemic distinction between human-cognitive and artificial-stochastic error origins, supplemented by Floridi's levels of abstraction. This is an interpretive application of independent external citations rather than any self-referential definition, fitted input renamed as prediction, or self-citation load-bearing step. No equations, parameters, or internal reductions appear in the abstract or described derivation chain; the central claims remain open to external philosophical benchmarks and do not collapse to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 1 invented entity

The central claim rests on interpretive applications of three philosophers' ideas and the introduction of a new conceptual lens without new data, formal proofs, or empirical validation.

axioms (3)
  • domain assumption: Dennett's mechanistic functionalism can be used to model human cognitive error in code generation
    Invoked to ground the human-cognitive side of the error distinction.
  • domain assumption: Rescher's methodological pragmatism applies to evaluating AI-generated code errors
    Used to ground the artificial-stochastic side of the error distinction.
  • domain assumption: Floridi's levels of abstraction provide a way to analyze interactions between error dimensions and their evolution
    Cited for nuanced understanding of how error profiles interact with technological advancements.
invented entities (1)
  • Architectures of Error (no independent evidence)
    purpose: To ground an epistemic distinction between human and machine code generation
    New conceptual framework introduced to organize the analysis of error profiles.

pith-pipeline@v0.9.0 · 5719 in / 1563 out tokens · 137863 ms · 2026-05-19T12:58:36.979342+00:00 · methodology


Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

  1. [1]

    Abbassi, A.A., Silva, L.D., Nikanjam, A., Khomh, F. (2025). Unveiling inefficiencies in LLM-generated code: Toward a comprehensive taxonomy. Retrieved from https://arxiv.org/abs/2503.06327

  2. [2]

    Baquero, C. (2025). The last solo programmers. https://cacm.acm.org/blogcacm/the-last-solo-programmers/. ([Accessed 26-04-2025])

  3. [3]

    Barassi, V. (2024, Nov 25). Toward a Theory of AI Errors: Making Sense of Hallucinations, Catastrophic Failures, and the Fallacy of Generative AI. Harvard Data Science Review (Special Issue 5), https://doi.org/10.1162/99608f92.ad8ebbd4 (https://hdsr.mitpress.mit.edu/pub/1yo82mqa)

  4. [4]

    Beschastnikh, I., Wang, P., Brun, Y., Ernst, M.D. (2016, March). Debugging distributed systems: Challenges and options for validation and debugging. Queue, 14(2), 91–110, https://doi.org/10.1145/2927299.2940294

  5. [5]

    Bianchini, F. (2025, Apr 07). Generative artificial intelligence: A concept in progress. Philosophy & Technology, 38(2), 46, https://doi.org/10.1007/s13347-025-00875-8

  6. [6]

    Blum, C., & Raidl, G.R. (2018). Hybrid metaheuristics: Powerful tools for optimization (1st ed.). Springer Publishing Company, Incorporated

  7. [7]

    Caporuscio, C. (2021, Sep 25). Introspection and belief: Failures of introspective belief formation. Review of Philosophy and Psychology, https://doi.org/10.1007/s13164-021-00585-y

  8. [8]

    Dennett, D. (1971). Intentional systems. Journal of Philosophy, 68(February), 87–106, https://doi.org/10.2307/2025382

  9. [9]

    Dennett, D. (2017). From bacteria to Bach and back: The evolution of minds

  10. [10]

    Floridi, L. (2008, Sep 01). The method of levels of abstraction. Minds and Machines, 18(3), 303-329, https://doi.org/10.1007/s11023-008-9113-7

  11. [11]

    Floridi, L. (2019, Mar 01). What the near future of artificial intelligence could be. Philosophy & Technology, 32(1), 1-15, https://doi.org/10.1007/s13347-019-00345-y

  12. [12]

    Hosseini, P., Castro, I., Ghinassi, I., Purver, M. (2024). Efficient solutions for an intriguing failure of llms: Long context window does not mean llms can analyze long sequences flawlessly. Retrieved from https://arxiv.org/abs/2408.01866

  13. [13]

    Huang, D., Xie, X., Zhang, J., Chen, J., Bu, Q., Cui, H. (2024). Bias testing and mitigation in llm-based code generation. Retrieved from https://arxiv.org/abs/2309.14345

  14. [14]

    Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., . . . Liu, T. (2025, January). A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems, 43(2), 1–55, https://doi.org/10.1145/3703155

  15. [15]

    Huynh, N., & Lin, B. (2025). Large language models for code generation: A comprehensive survey of challenges, techniques, evaluation, and applications. Retrieved from https://arxiv.org/abs/2503.01245

  16. [16]

    Kabir, S., Udo-Imeh, D.N., Kou, B., Zhang, T. (2024). Is Stack Overflow obsolete? An empirical study of the characteristics of ChatGPT answers to Stack Overflow questions. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. New York, NY, USA: Association for Computing Machinery. Retrieved from https://doi.org/10.1145/3613904.3642596

  17. [17]

    Kuo, C.-H., & Prat, C.S. (2024, Mar 05). Computer programmers show distinct, expertise-dependent brain responses to violations in form and meaning when reading code. Scientific Reports, 14(1), 5404, https://doi.org/10.1038/s41598-024-56090-6

  18. [18]

    McKendrick, J. (2025). Will AI replace software engineers? It depends on who you ask. ZDNET. https://www.zdnet.com/article/will-ai-replace-software-engineers-it-depends-on-who-you-ask/. ([Accessed 16-05-2025])

  19. [19]

    Google DeepMind (2025). AlphaEvolve: A coding agent for scientific and algorithmic discovery

  20. [20]

    Ouyang, S., Zhang, J.M., Harman, M., Wang, M. (2025, January). An empirical study of the non-determinism of ChatGPT in code generation. ACM Trans. Softw. Eng. Methodol., 34(2), https://doi.org/10.1145/3697010

  21. [21]

    Pierce, B.C. (2002). Types and programming languages (1st ed.). The MIT Press

  22. [22]

    Rescher, N. (1992). Rationality: A philosophical inquiry into the nature and the rationale of reason. The Clarendon Library of Logic and Philosophy. Philosophy and Rhetoric, 25(1), 82–84

  23. [23]

    Rescher, N. (2003). Epistemology: An introduction to the theory of knowledge. State University of New York Press

  24. [24]

    Rescher, N. (2017). Value reasoning: On the pragmatic rationality of evaluation. Springer International Publishing

  25. [25]

    Robeyns, M., Szummer, M., Aitchison, L. (2025). A self-improving coding agent. Retrieved from https://arxiv.org/abs/2504.15228

  26. [26]

    Simon, J. (2015). Distributed epistemic responsibility in a hyperconnected era. In L. Floridi (Ed.), The onlife manifesto: Being human in a hyperconnected era (pp. 145–159). Cham: Springer International Publishing. Retrieved from https://doi.org/10.1007/978-3-319-04093-6_17

  27. [27]

    Thompson, K. (1984, August). Reflections on trusting trust. Commun. ACM, 27(8), 761–763, https://doi.org/10.1145/358198.358210

  28. [28]

    Ulfsnes, R., Moe, N.B., Stray, V., Skarpen, M. (2024). Transforming software development with generative AI: Empirical insights on collaboration and workflow. In A. Nguyen-Duc, P. Abrahamsson, & F. Khomh (Eds.), Generative AI for effective software development (pp. 219–234). Cham: Springer Nature Switzerland. Retrieved from https://doi.org/10.1007/978...

  29. [29]

    Zednik, C. (2021, Jun 01). Solving the black box problem: A normative framework for explainable artificial intelligence. Philosophy & Technology, 34(2), 265-288, https://doi.org/10.1007/s13347-019-00382-7