arxiv: 2603.22106 · v4 · submitted 2026-03-23 · 💻 cs.SE


From Technical Debt to Cognitive and Intent Debt: Rethinking Software Health in the Age of AI

Margaret-Anne Storey


Pith reviewed 2026-05-15 00:32 UTC · model grok-4.3

classification 💻 cs.SE

keywords software health · technical debt · cognitive debt · intent debt · generative AI · software engineering · AI-assisted development · software maintenance


The pith

Generative AI accelerates coding but risks accumulating cognitive debt in teams and intent debt in missing rationale.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that AI-generated code outpaces human understanding, creating two new debt types alongside traditional technical debt: cognitive debt as the loss of shared mental models across a team, and intent debt as the absence of explicit goals and constraints. It proposes a Triple Debt Model to diagnose and manage how these three debts interact and affect overall software health. Practitioners need this shift in focus because unaddressed cognitive and intent debts could make AI-assisted systems increasingly unsafe or expensive to evolve. The model outlines ways to identify each debt type and suggests mitigation steps while raising open questions for the field.

Core claim

The paper proposes a Triple Debt Model in which software health depends on three interacting forms of debt: technical debt residing in the code, cognitive debt as the erosion of shared understanding and mental models within the team, and intent debt as the lack of externalized rationale, goals, and constraints that humans and AI agents require to change the system safely.

What carries the argument

The Triple Debt Model, which treats technical debt in code, cognitive debt in people, and intent debt in externalized knowledge as distinct yet interacting properties whose balance determines software health.

If this is right

AI code generation increases the relative importance of cognitive and intent debt over pure technical debt.
Cognitive debt can be diagnosed through assessments of team mental models and shared understanding of the system.
Intent debt can be mitigated by externalizing goals, constraints, and rationale alongside generated code.
The model reframes practitioner priorities toward managing all three debt types rather than code quality alone.
Open points of debate remain around practical methods for measuring and reducing cognitive and intent debt.
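One way to picture "externalizing goals, constraints, and rationale alongside generated code" is a small record attached to each change, plus a check for changes that lack one. The following is a minimal sketch under stated assumptions: the `IntentRecord` type, its field names, and the `chg-N` identifiers are hypothetical illustrations, not a format the paper defines.

```python
from dataclasses import dataclass, field


@dataclass
class IntentRecord:
    """Hypothetical record of why a change exists (not a format from the paper)."""
    change_id: str
    goals: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    rationale: str = ""

    def missing_fields(self) -> list[str]:
        """Return the names of intent fields that were left empty."""
        missing = []
        if not self.goals:
            missing.append("goals")
        if not self.constraints:
            missing.append("constraints")
        if not self.rationale.strip():
            missing.append("rationale")
        return missing


def changes_lacking_intent(change_ids, records):
    """Map each change with absent or incomplete intent to its missing fields."""
    by_id = {r.change_id: r for r in records}
    all_fields = ["goals", "constraints", "rationale"]
    gaps = {}
    for cid in change_ids:
        record = by_id.get(cid)
        missing = all_fields if record is None else record.missing_fields()
        if missing:
            gaps[cid] = list(missing)
    return gaps


# Illustrative data only: three changes, one fully documented.
records = [
    IntentRecord("chg-1", ["cache user lookups"], ["no stale reads over 5 s"], "DB load was too high"),
    IntentRecord("chg-2", ["retry failed uploads"]),
]
gaps = changes_lacking_intent(["chg-1", "chg-2", "chg-3"], records)
print(gaps)
```

In this sketch a change with no record at all is treated the same as one whose fields are all empty. Whether those two cases should count equally as intent debt is a judgment call the paper leaves open.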

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Teams may need new documentation practices that capture intent at the moment of AI code generation rather than after the fact.
Quantitative metrics for cognitive debt could be developed by tracking how quickly new team members can safely modify the codebase.
The model suggests software tools should prioritize capturing and querying intent alongside code changes in AI workflows.
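The onboarding-speed idea above could be prototyped roughly as follows. This is a sketch under loud assumptions: "safely modify" is approximated here as "merged a change that was not later reverted", and the function names and sample data are hypothetical. Neither the paper nor this page defines such a metric.

```python
from datetime import date
from statistics import median


def days_to_first_safe_change(joined, changes):
    """Days from joining until the first merged change that was not reverted.

    `changes` is an iterable of (merge_date, was_reverted) pairs. Returns None
    if the person has no such change yet.
    """
    safe_dates = [d for d, reverted in changes if not reverted and d >= joined]
    if not safe_dates:
        return None
    return (min(safe_dates) - joined).days


def median_onboarding_days(team):
    """Median of the per-person metric, ignoring people with no safe change."""
    values = [
        v for v in (days_to_first_safe_change(j, c) for j, c in team.values())
        if v is not None
    ]
    return median(values) if values else None


# Illustrative data only: name -> (join date, [(merge date, was_reverted), ...])
team = {
    "ana": (date(2026, 1, 5), [(date(2026, 1, 12), True), (date(2026, 1, 20), False)]),
    "ben": (date(2026, 2, 2), [(date(2026, 2, 9), False)]),
    "cy": (date(2026, 3, 2), []),
}
onboarding = median_onboarding_days(team)
print(onboarding)
```

A rising median over successive cohorts would be one possible signal worth investigating, though it conflates cognitive debt with hiring mix, task difficulty, and review policy.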

Load-bearing premise

Cognitive debt and intent debt exist as distinct, diagnosable properties whose interactions with technical debt meaningfully determine software health.

What would settle it

Longitudinal studies of AI-assisted projects that measure team comprehension, rationale availability, and maintenance costs and find no added risks when cognitive and intent debts are ignored.

Original abstract

Generative AI is accelerating software development, but may quietly shift where the most significant risks lie. As AI generates code faster than teams can understand it, two underappreciated forms of debt accumulate: cognitive debt, the erosion of shared understanding across a team, and intent debt, the absence of externalized rationale that developers and AI agents need to work safely with code. This article proposes a Triple Debt Model for reasoning about software health, built around three interacting debt types: technical debt in code, cognitive debt in people, and intent debt in externalized knowledge. Cognitive debt is a team-level, project-level property reflecting the erosion of shared understanding across a software system over time, leading to increasingly inadequate shared mental models for reasoning about and safely changing the system. Intent debt refers to the absence or erosion of explicit rationale, goals, and constraints that guide how humans and agents evolve the system. We discuss how generative AI changes the relative importance of these debt types, how each can be diagnosed and mitigated, and surface points of debate for practitioners.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper proposes a Triple Debt Model with new cognitive and intent debt categories to address AI-era software risks, but it stays at high-level definitions without evidence or metrics.


The main takeaway is that this is a conceptual framing paper. It adds cognitive debt (erosion of shared team understanding) and intent debt (missing externalized rationale for code) to technical debt, claiming generative AI shifts their relative importance for overall software health. The abstract and discussion lay this out clearly as a way to reason about risks when AI generates code faster than teams can keep up with it mentally or document it properly.

Referee Report

3 major / 1 minor

Summary. The paper proposes a Triple Debt Model for software health in the age of generative AI, consisting of technical debt (in code), cognitive debt (erosion of shared team understanding and mental models), and intent debt (absence of externalized rationale, goals, and constraints). It argues that AI accelerates code production while shifting risks toward the latter two debt types, which interact with technical debt, and discusses diagnosis, mitigation, and practitioner debates.

Significance. If the distinctions and interactions hold, the framework could provide a useful lens for software engineering practitioners to anticipate maintenance risks in AI-augmented development, emphasizing human and knowledge aspects alongside code quality. The conceptual nature offers potential to stimulate discussion but currently lacks grounding that would elevate its impact.

major comments (3)

[Abstract / Triple Debt Model introduction] Abstract and introduction: Cognitive debt is defined only as 'erosion of shared understanding across a team' and intent debt as 'absence or erosion of explicit rationale'; these lack operational definitions, metrics, thresholds, or diagnostic procedures, rendering the claims of distinctness, diagnosability, and meaningful interactions with technical debt untestable as stated.
[Discussion of AI changes to debt types] Section discussing generative AI's impact on relative importance: The argument that AI shifts risks toward cognitive and intent debt is presented qualitatively without reference to specific empirical studies, benchmarks, or data on AI-generated code comprehension or maintenance outcomes, leaving the central shift claim unsupported.
[Diagnosis and mitigation section] The mitigation and diagnosis discussion: No concrete methods, tools, examples, or validation approaches are provided for identifying or reducing cognitive/intent debt, which is load-bearing for the model's claimed practical utility in reasoning about software health.

minor comments (1)

[Abstract] The abstract and full text would benefit from clearer separation between definitional statements and any implied testable predictions to avoid circularity in the model presentation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the scope of our conceptual framework. The manuscript is a position paper proposing the Triple Debt Model rather than an empirical study; we will revise to strengthen definitional clarity, add practitioner heuristics, and explicitly frame the AI-shift argument as a hypothesis grounded in logical implications and existing SE literature, while noting the need for future validation.


Referee: [Abstract / Triple Debt Model introduction] Abstract and introduction: Cognitive debt is defined only as 'erosion of shared understanding across a team' and intent debt as 'absence or erosion of explicit rationale'; these lack operational definitions, metrics, thresholds, or diagnostic procedures, rendering the claims of distinctness, diagnosability, and meaningful interactions with technical debt untestable as stated.

Authors: We agree the current definitions are intentionally high-level to introduce the framework. In revision we will expand the model section with preliminary indicators (e.g., for cognitive debt: divergence in team mental models measured via code walkthroughs or architecture reviews; for intent debt: missing decision logs or constraint specifications), reference related SE concepts such as knowledge vaporization and rationale capture, and add an explicit statement that full operationalization, metrics, and thresholds remain open research questions. This preserves the conceptual contribution while making interactions more amenable to future testing.

Revision: yes
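The "divergence in team mental models" indicator mentioned in this response could be approximated in many ways. A minimal sketch, assuming each team member is asked during a walkthrough to list the component dependencies they believe exist, and that divergence is scored as the mean pairwise Jaccard distance between those lists. Both the elicitation format and the choice of Jaccard distance are assumptions made here for illustration, not something the authors propose.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_divergence(models):
    """Average Jaccard distance over all pairs of team members' models."""
    pairs = list(combinations(models.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Illustrative data only: each person's believed (caller, callee) dependencies.
models = {
    "ana": {("api", "auth"), ("api", "db"), ("worker", "db")},
    "ben": {("api", "auth"), ("api", "db")},
    "cy": {("api", "auth"), ("worker", "queue")},
}
divergence = mean_pairwise_divergence(models)
print(round(divergence, 3))
```

A score of 0.0 means everyone listed the same dependencies and 1.0 means no two people share any. The score says nothing about whose model matches the actual system, which would need a separate comparison against the code.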
Referee: [Discussion of AI changes to debt types] Section discussing generative AI's impact on relative importance: The argument that AI shifts risks toward cognitive and intent debt is presented qualitatively without reference to specific empirical studies, benchmarks, or data on AI-generated code comprehension or maintenance outcomes, leaving the central shift claim unsupported.

Authors: The shift claim follows from the established capability of current generative models to produce code faster than teams can review and internalize it, combined with documented challenges in code comprehension and maintenance. We will revise to cite relevant studies on AI code review effectiveness and maintainability (e.g., recent work on LLM-generated code smells and developer comprehension experiments) and to label the argument explicitly as a hypothesis requiring empirical confirmation. Because the paper is a rethinking framework rather than a data-driven analysis, we do not believe new primary data is required at this stage, but the added references and caveats will address the concern.

Revision: partial
Referee: [Diagnosis and mitigation section] The mitigation and diagnosis discussion: No concrete methods, tools, examples, or validation approaches are provided for identifying or reducing cognitive/intent debt, which is load-bearing for the model's claimed practical utility in reasoning about software health.

Authors: We acknowledge this gap. The revised manuscript will expand the diagnosis and mitigation section with concrete, practitioner-oriented examples: for cognitive debt, structured architecture reviews and shared mental-model elicitation sessions; for intent debt, mandatory decision-record templates and AI-assisted rationale extraction prompts. We will also outline lightweight validation approaches such as before/after team surveys on system understanding. These additions will be framed as initial heuristics rather than validated instruments, consistent with the conceptual nature of the work.

Revision: yes
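The before/after team survey mentioned in this response could be tallied along the following lines. This is a sketch under stated assumptions: each respondent gives a single self-rated understanding score (for example on a 1 to 5 scale) in each round, only people who answered both rounds are compared, and the function names are hypothetical. The authors do not specify a survey instrument or an analysis.

```python
from statistics import mean


def paired_change(before, after):
    """Per-respondent change (after - before) for people who answered both rounds."""
    shared = sorted(before.keys() & after.keys())
    return {name: after[name] - before[name] for name in shared}


def summarize(before, after):
    """Summarize paired changes; returns None when no one answered both rounds."""
    deltas = paired_change(before, after)
    if not deltas:
        return None
    values = list(deltas.values())
    return {
        "respondents": len(values),
        "mean_change": mean(values),
        "improved": sum(1 for v in values if v > 0),
    }


# Illustrative data only: self-rated understanding before and after an intervention.
before = {"ana": 2, "ben": 3, "cy": 4}
after = {"ana": 4, "ben": 3, "dee": 5}
summary = summarize(before, after)
print(summary)
```

Restricting the comparison to paired respondents avoids mistaking team turnover for a change in understanding. With samples this small the summary is descriptive only and supports no statistical claim.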

Circularity Check

0 steps flagged

No circularity: Triple Debt Model is a conceptual proposal with independent framing


The paper proposes the Triple Debt Model as a new framework for software health by introducing cognitive debt (erosion of shared team understanding) and intent debt (absence of externalized rationale) alongside technical debt. These are presented as definitional categories to reason about AI-induced risks, with separate discussion of diagnosis, mitigation, and relative importance shifts. No equations, fitted parameters, self-citations, or derivations are present that reduce any claim to its own inputs by construction. The model stands as an organizing lens rather than a derived result, making the analysis self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The central claim rests on the assumption that software health can be usefully decomposed into these three interacting debt types without prior empirical support or formal definitions.

axioms (1)

domain assumption: Software health is determined by interacting technical, cognitive, and intent debts
Invoked as the basis for the Triple Debt Model without justification or evidence

invented entities (2)

cognitive debt (no independent evidence)
purpose: To represent erosion of shared team understanding
Newly postulated concept introduced to explain AI-related risks

intent debt (no independent evidence)
purpose: To represent absence of externalized rationale and goals
Newly postulated concept introduced to explain AI-related risks

pith-pipeline@v0.9.0 · 5475 in / 1254 out tokens · 52905 ms · 2026-05-15T00:32:43.743332+00:00 · methodology


Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

When AI Models Become Dependencies: Studying the Evolution of Pre-Trained Model Reuse in Downstream Software Systems
cs.SE 2026-04 unverdicted novelty 7.0
Pre-trained models are added late in projects, accumulate rather than get replaced, and change three times less often than libraries, with distinct documentation driven by capability needs and testing uncertainty.

Toward a Risk Assessment Framework for Institutional DeFi: A Nine-Dimension Approach
cs.DC 2026-05 unverdicted novelty 6.0
A nine-dimension risk framework for institutional DeFi adds three new dimensions to prior taxonomies and shows that five of twelve 2024-2026 incidents, including the two most systemic, require at least one of the new ...

Decision-Oriented Programming with Aporia
cs.HC 2026-04 conditional novelty 6.0
Aporia makes design decisions explicit and interactive in AI-assisted programming, leading to higher engagement and 5x fewer mental model disagreements with code in a 14-person user study compared to a baseline agent.

More Is Different: Toward a Theory of Emergence in AI-Native Software Ecosystems
cs.SE 2026-04 unverdicted novelty 5.0
AI-native software ecosystems exhibit emergent behaviors best explained by complex adaptive systems theory, requiring new ecosystem-level monitoring and seven testable propositions that may extend or replace Lehman's laws.
