Rethinking Software Engineering for Agentic AI Systems
Pith reviewed 2026-05-10 15:55 UTC · model grok-4.3
The pith
Abundant AI-generated code is shifting software engineering from manual authorship to orchestration, verification, and human-AI collaboration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper's central claim is that LLMs and agentic AI systems are turning code from a scarce, carefully crafted artifact into an abundant and increasingly disposable commodity. Consequently, software engineering must reorganize around three core competencies: effective orchestration of multi-agent systems, rigorous verification of AI-generated outputs, and structured human-AI collaboration. The authors propose a conceptual framework that details the required transformations in curricula, development tooling, lifecycle processes, and governance models, and argue that engineers' roles are elevated rather than diminished, shifting toward system-level design, semantic validation, and, in
What carries the argument
The shift of code from scarce artifact to abundant disposable commodity, which necessitates reorganization around orchestration of multi-agent systems, verification of AI outputs, and human-AI collaboration.
If this is right
- Curricula will need to emphasize skills in agent orchestration and output verification over traditional coding proficiency.
- Development tools must incorporate support for prompt traceability, multi-agent workflow management, and automated verification pipelines.
- Lifecycle processes will shift toward verification-first approaches with explicit checkpoints for AI-generated components.
- Governance models will require new structures for accountable oversight and responsibility assignment in AI-augmented teams.
- Professional practice will elevate engineers to roles focused on semantic validation and system-level design decisions.
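The tooling and lifecycle implications above can be pictured as a verification gate that rejects AI-generated changes lacking a recorded prompt or failing any check. This is a minimal sketch under stated assumptions, not something the paper specifies: `GeneratedChange`, `verification_gate`, the field names, and the two toy checks are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class GeneratedChange:
    """Hypothetical record tying an AI-generated change to its origin."""
    change_id: str
    agent: str                 # which agent produced the change
    prompt: Optional[str]      # prompt text kept for traceability
    diff: str
    check_results: List[bool] = field(default_factory=list)


def verification_gate(change: GeneratedChange,
                      checks: List[Callable[[str], bool]]) -> bool:
    """Accept a change only if it is traceable, at least one check is
    configured, and every configured check passes."""
    if not change.prompt:
        # No prompt on record: the change cannot be traced, so reject it.
        return False
    change.check_results = [check(change.diff) for check in checks]
    # An empty check list is treated as "unverified", not as a pass.
    return bool(checks) and all(change.check_results)


# Two toy checks standing in for real test, lint, or security tooling.
def has_no_todo(diff: str) -> bool:
    return "TODO" not in diff


def is_non_empty(diff: str) -> bool:
    return bool(diff.strip())
```

Treating an empty check list as a failure is a design choice in this sketch: it keeps "nothing was verified" from being confused with "everything passed".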
Where Pith is reading between the lines
- This reorganization could create measurable new performance indicators, such as the ratio of AI-generated code that is discarded versus retained after human review.
- The same abundance logic might apply to adjacent creative domains like UI design or technical documentation, suggesting parallel shifts in those fields.
- Long-term workforce studies could track whether demand for traditional programmers declines or merely redirects toward verification and orchestration specialists.
- A testable extension would be to monitor whether verification bottlenecks actually limit productivity gains from AI code generation in real projects.
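The discarded-versus-retained indicator mentioned in the first bullet could be computed roughly as follows. This is an illustrative sketch only: `ReviewOutcome`, its fields, and the choice to count lines (rather than files, functions, or commits) are assumptions, not definitions from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ReviewOutcome:
    """Hypothetical per-change record from human review."""
    lines_generated: int   # AI-generated lines submitted for review
    lines_retained: int    # lines still present after review


def discard_share(outcomes: Iterable[ReviewOutcome]) -> float:
    """Fraction of AI-generated lines discarded during human review."""
    records = list(outcomes)  # materialize so generators can be passed in
    for record in records:
        if not 0 <= record.lines_retained <= record.lines_generated:
            raise ValueError("retained lines must lie in [0, generated]")
    generated = sum(r.lines_generated for r in records)
    if generated == 0:
        raise ValueError("no generated lines to evaluate")
    retained = sum(r.lines_retained for r in records)
    return (generated - retained) / generated


if __name__ == "__main__":
    sample = [ReviewOutcome(100, 60), ReviewOutcome(50, 30)]
    print(f"discard share: {discard_share(sample):.2f}")
```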
Load-bearing premise
The assumption that LLM-driven code generation will make code abundant and disposable enough to force a fundamental reorganization of the entire software engineering discipline, rather than incremental additions to existing practices.
What would settle it
Empirical data from large-scale software repositories showing that the proportion of manually authored, long-maintained code remains dominant over time despite widespread LLM adoption, with no measurable increase in code regeneration or disposal rates.
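A repository study of this kind might reduce to a check like the one below. It is a sketch under strong assumptions: it presumes each surviving line has already been attributed to manual or AI authorship (the hard part, which is not addressed here), and the function names and the 0.5 dominance threshold are hypothetical.

```python
from typing import Dict, Tuple


def manual_share_by_year(
    line_counts: Dict[int, Tuple[int, int]]
) -> Dict[int, float]:
    """Map year -> fraction of surviving lines that were manually authored.

    line_counts maps year -> (manual_lines, ai_lines).
    """
    shares: Dict[int, float] = {}
    for year in sorted(line_counts):
        manual, ai = line_counts[year]
        total = manual + ai
        if total == 0:
            raise ValueError(f"no lines recorded for {year}")
        shares[year] = manual / total
    return shares


def manual_stays_dominant(shares: Dict[int, float],
                          threshold: float = 0.5) -> bool:
    """True if the manual share exceeds the threshold in every year."""
    return all(share > threshold for share in shares.values())
```

If `manual_stays_dominant` held across many large repositories despite widespread LLM adoption, that would count against the premise; a sustained decline in the yearly shares would count in its favor.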
Original abstract
The rapid proliferation of large language models (LLMs) and agentic AI systems has created an unprecedented abundance of automatically generated code, challenging the traditional software engineering paradigm centered on manual authorship. This paper examines whether the discipline should be reoriented around orchestration, verification, and human-AI collaboration, and what implications this shift holds for education, tools, processes, and professional practice. Drawing on a structured synthesis of relevant literature and emerging industry perspectives, we analyze four key dimensions: the evolving role of the engineer in agentic workflows, verification as a critical quality bottleneck, observed impacts on productivity and maintainability, and broader implications for the discipline. Our analysis indicates that code is transitioning from a scarce, carefully crafted artifact to an abundant and increasingly disposable commodity. As a result, software engineering must reorganize around three core competencies: effective orchestration of multi-agent systems, rigorous verification of AI-generated outputs, and structured human-AI collaboration. We propose a conceptual framework outlining the transformations required across curricula, development tooling, lifecycle processes, and governance models. Rather than diminishing the role of engineers, this shift elevates their responsibilities toward system-level design, semantic validation, and accountable oversight. The paper concludes by highlighting key research challenges, including verification-first lifecycles, prompt traceability, and the long-term evolution of the engineering workforce.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper argues that the proliferation of LLMs and agentic AI systems is shifting code from a scarce, manually crafted artifact to an abundant, disposable commodity. Drawing on a structured synthesis of literature and industry perspectives, it analyzes four dimensions—the evolving role of the engineer, verification as a quality bottleneck, impacts on productivity and maintainability, and broader disciplinary implications—and concludes that software engineering must reorganize around orchestration of multi-agent systems, rigorous verification of AI outputs, and structured human-AI collaboration. A conceptual framework is proposed for transformations in curricula, tooling, lifecycle processes, and governance, while elevating engineers to system-level design and oversight roles.
Significance. If the premise of a fundamental transition holds, the work could meaningfully guide adaptation of the software engineering discipline to AI-generated code abundance by framing new priorities and research challenges such as verification-first lifecycles and prompt traceability. The structured synthesis of literature and emerging perspectives is a clear strength, providing a timely foundation for discussion even as a position paper.
major comments (2)
- [Abstract] The central claim that code is 'transitioning from a scarce, carefully crafted artifact to an abundant and increasingly disposable commodity' is load-bearing for the reorganization argument, yet the abstract supplies no specific quantitative metrics, longitudinal data, or cited case studies from the literature synthesis to demonstrate that this shift is fundamental rather than incremental.
- [Analysis of observed impacts on productivity and maintainability] The discussion of productivity and maintainability effects underpins the assertion that existing practices are insufficient, but without reported metrics, specific examples of reduced maintenance burden, or evidence that current verification methods cannot scale, the call for reorganization around three new core competencies remains normative rather than empirically anchored.
minor comments (2)
- [Abstract] The four key dimensions are enumerated in the abstract but their mapping to subsequent sections is not made explicit, which would improve traceability of the argument.
- Terms such as 'agentic workflows' and 'prompt traceability' appear without early definitions, potentially reducing accessibility for readers outside the immediate subfield.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which underscores the importance of grounding our position paper's claims in specific evidence from the literature. We have revised the manuscript to incorporate additional quantitative references, metrics, and examples drawn from the cited studies, thereby strengthening the empirical anchoring of the central arguments while preserving the conceptual and forward-looking character of the work.
Point-by-point responses
- Referee: [Abstract] The central claim that code is 'transitioning from a scarce, carefully crafted artifact to an abundant and increasingly disposable commodity' is load-bearing for the reorganization argument, yet the abstract supplies no specific quantitative metrics, longitudinal data, or cited case studies from the literature synthesis to demonstrate that this shift is fundamental rather than incremental.
  Authors: We agree that the abstract would benefit from more concrete support for this central claim. In the revised version, we have added brief references to key findings from the literature, such as quantitative data on the exponential growth in AI-generated code contributions (citing specific studies on repository analyses) and industry reports on code generation volumes. We have also included a pointer to the full synthesis in the body of the paper. This makes the abstract more informative without altering its length significantly.
  Revision: yes
- Referee: [Analysis of observed impacts on productivity and maintainability] The discussion of productivity and maintainability effects underpins the assertion that existing practices are insufficient, but without reported metrics, specific examples of reduced maintenance burden, or evidence that current verification methods cannot scale, the call for reorganization around three new core competencies remains normative rather than empirically anchored.
  Authors: This is a fair assessment. The original section synthesized qualitative and quantitative insights from prior work but did not always highlight specific metrics explicitly. We have revised it to incorporate concrete examples, including reported productivity improvements (e.g., from controlled studies showing time savings) and maintainability issues (such as higher defect rates in unverified AI code). For verification scalability, we cite evidence from research on the challenges of testing LLM outputs at scale. While the proposal for reorganization has a normative component inherent to position papers, these additions provide a more empirically grounded foundation for the argument.
  Revision: partial
Circularity Check
No circularity: interpretive position paper with no derivations or fitted predictions
This is a position paper synthesizing literature on LLMs and agentic AI impacts. It contains no equations, mathematical derivations, fitted parameters, predictions, or self-referential reductions. The central claim that code is becoming an abundant commodity leading to reorganization follows interpretively from stated premises and external references, without reducing to inputs by construction. No self-citation load-bearing steps, uniqueness theorems, or ansatzes are present. The analysis is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: The rapid proliferation of large language models and agentic AI systems has created an unprecedented abundance of automatically generated code that challenges the traditional software engineering paradigm centered on manual authorship.
Forward citations
Cited by 1 Pith paper
- Agentic Agile-V: From Vibe Coding to Verified Engineering in Software and Hardware Development
  Agentic Agile-V uses Agile-V as backbone and a Specify-Constrain-Orchestrate-Prove-Evolve-Verify loop to convert AI agent conversations into traceable engineering artifacts with acceptance evidence.