Rethinking Software Engineering for Agentic AI Systems

Mamdouh Alenezi

arxiv: 2604.10599 · v1 · submitted 2026-04-12 · 💻 cs.SE

Pith reviewed 2026-05-10 15:55 UTC · model grok-4.3

classification 💻 cs.SE

keywords agentic AI · software engineering · LLM code generation · AI verification · multi-agent systems · human-AI collaboration · code disposability · software lifecycle

The pith

Abundant AI-generated code is shifting software engineering from manual authorship to orchestration, verification, and human-AI collaboration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that large language models are turning code into an abundant and disposable commodity rather than a scarce, hand-crafted product. If this holds, the discipline must reorganize its core practices around managing multi-agent AI systems, rigorously checking AI outputs, and structuring effective human oversight of those systems. This change would affect how engineers are educated, what tools they rely on, how projects are run, and what skills define professional competence. A reader would care because it directly addresses whether traditional coding work will shrink or transform into higher-level system design and accountability roles.

Core claim

The paper's central claim is that code is transitioning from a scarce, carefully crafted artifact to an abundant and increasingly disposable commodity as a result of LLMs and agentic AI systems. Consequently, software engineering must reorganize around three core competencies: effective orchestration of multi-agent systems, rigorous verification of AI-generated outputs, and structured human-AI collaboration. The author proposes a conceptual framework that details required transformations in curricula, development tooling, lifecycle processes, and governance models, while arguing that engineers' roles are elevated rather than diminished toward system-level design, semantic validation, and, in

What carries the argument

The shift of code from scarce artifact to abundant disposable commodity, which necessitates reorganization around orchestration of multi-agent systems, verification of AI outputs, and human-AI collaboration.

If this is right

Curricula will need to emphasize skills in agent orchestration and output verification over traditional coding proficiency.
Development tools must incorporate support for prompt traceability, multi-agent workflow management, and automated verification pipelines.
Lifecycle processes will shift toward verification-first approaches with explicit checkpoints for AI-generated components.
Governance models will require new structures for accountable oversight and responsibility assignment in AI-augmented teams.
Professional practice will elevate engineers to roles focused on semantic validation and system-level design decisions.
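
As a rough illustration of the tooling and lifecycle items above, a verification-first checkpoint with prompt traceability might look like the following Python. This is a minimal sketch under assumptions made here, not something the paper provides: the `GeneratedArtifact` record, the `verification_gate` function, and both checks are hypothetical names, and a real pipeline would plug in test suites, static analysis, and human sign-off instead of these toy checks.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class GeneratedArtifact:
    """AI-generated code plus the prompt and agent that produced it (hypothetical schema)."""
    prompt: str
    code: str
    agent: str

    @property
    def prompt_id(self) -> str:
        # Short, stable hash so a reviewer can trace the code back to its prompt.
        return hashlib.sha256(self.prompt.encode("utf-8")).hexdigest()[:12]


# A check receives the artifact and returns True when the artifact passes.
Check = Callable[[GeneratedArtifact], bool]


def compiles(artifact: GeneratedArtifact) -> bool:
    """Cheapest possible check: the code must at least parse as Python."""
    try:
        compile(artifact.code, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


def avoids_eval(artifact: GeneratedArtifact) -> bool:
    """Toy policy check (an assumption for illustration): reject calls to eval()."""
    return "eval(" not in artifact.code


def verification_gate(artifact: GeneratedArtifact, checks: List[Check]) -> Dict[str, object]:
    """Run every check and return an audit record tied to the originating prompt."""
    failed = [check.__name__ for check in checks if not check(artifact)]
    return {
        "prompt_id": artifact.prompt_id,
        "agent": artifact.agent,
        "accepted": not failed,
        "failed_checks": failed,
    }


if __name__ == "__main__":
    artifact = GeneratedArtifact(
        prompt="Write add(a, b) that returns the sum.",
        code="def add(a, b):\n    return a + b\n",
        agent="coder-1",
    )
    print(verification_gate(artifact, [compiles, avoids_eval]))
```

The gate returns a record rather than a bare boolean so that each accept or reject decision stays linked to the prompt and agent that produced the code, which is the kind of accountability the governance item points at.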

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This reorganization could create measurable new performance indicators, such as the ratio of AI-generated code that is discarded versus retained after human review.
The same abundance logic might apply to adjacent creative domains like UI design or technical documentation, suggesting parallel shifts in those fields.
Long-term workforce studies could track whether demand for traditional programmers declines or merely redirects toward verification and orchestration specialists.
A testable extension would be to monitor whether verification bottlenecks actually limit productivity gains from AI code generation in real projects.
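
The discard-versus-retain indicator mentioned in the first item could be computed along these lines. This is a minimal Python sketch under assumptions made here, not a metric the paper defines: `ReviewedChange` and `discard_ratio` are hypothetical names, and counting by lines is only one possible unit of measurement.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ReviewedChange:
    """One AI-generated change after human review (hypothetical record)."""
    lines_generated: int  # lines the agent produced
    lines_retained: int   # lines still present after review

    def __post_init__(self) -> None:
        if not 0 <= self.lines_retained <= self.lines_generated:
            raise ValueError("lines_retained must be between 0 and lines_generated")


def discard_ratio(changes: Iterable[ReviewedChange]) -> float:
    """Fraction of AI-generated lines that reviewers threw away."""
    changes = list(changes)  # allow generators, since we iterate twice
    generated = sum(c.lines_generated for c in changes)
    retained = sum(c.lines_retained for c in changes)
    if generated == 0:
        return 0.0  # nothing generated, so nothing discarded
    return (generated - retained) / generated


if __name__ == "__main__":
    reviewed = [ReviewedChange(100, 60), ReviewedChange(50, 30)]
    print(f"discard ratio: {discard_ratio(reviewed):.2f}")
```

Aggregating line counts before dividing weights large changes more heavily than small ones; averaging per-change ratios would be an equally defensible choice and would answer a slightly different question.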

Load-bearing premise

The assumption that LLM-driven code generation will make manually written code scarce and disposable enough to force a fundamental reorganization of the entire software engineering discipline rather than incremental additions to existing practices.

What would settle it

Empirical data from large-scale software repositories showing that the proportion of manually authored, long-maintained code remains dominant over time despite widespread LLM adoption, with no measurable increase in code regeneration or disposal rates.

Figures

Figures reproduced from arXiv: 2604.10599 by Mamdouh Alenezi.

**Figure 1.** The Inversion of the Engineering Value. The most fundamental shift in engineering practice is from writing code to precisely specifying what should be built and why.

Original abstract

The rapid proliferation of large language models (LLMs) and agentic AI systems has created an unprecedented abundance of automatically generated code, challenging the traditional software engineering paradigm centered on manual authorship. This paper examines whether the discipline should be reoriented around orchestration, verification, and human-AI collaboration, and what implications this shift holds for education, tools, processes, and professional practice. Drawing on a structured synthesis of relevant literature and emerging industry perspectives, we analyze four key dimensions: the evolving role of the engineer in agentic workflows, verification as a critical quality bottleneck, observed impacts on productivity and maintainability, and broader implications for the discipline. Our analysis indicates that code is transitioning from a scarce, carefully crafted artifact to an abundant and increasingly disposable commodity. As a result, software engineering must reorganize around three core competencies: effective orchestration of multi-agent systems, rigorous verification of AI-generated outputs, and structured human-AI collaboration. We propose a conceptual framework outlining the transformations required across curricula, development tooling, lifecycle processes, and governance models. Rather than diminishing the role of engineers, this shift elevates their responsibilities toward system-level design, semantic validation, and accountable oversight. The paper concludes by highlighting key research challenges, including verification-first lifecycles, prompt traceability, and the long-term evolution of the engineering workforce.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

This is a position paper that organizes existing ideas on LLMs in software engineering into three competencies but offers no new data or derivations to show the shift is fundamental rather than incremental.

The paper's main point is that LLMs and agentic systems are flooding the field with generated code, turning it into an abundant commodity that forces software engineering to reorganize around orchestration of multi-agent systems, verification of outputs, and human-AI collaboration. It frames this as elevating engineers to system-level oversight rather than replacing them, with implications for curricula, tools, processes, and governance. The abstract and synthesis lay out four dimensions of change and end with open research challenges like prompt traceability and verification-first lifecycles.

That structure is the clearest part of the work. It pulls together recent literature on AI-assisted development and gives a readable outline of what might need to adjust in education and practice. The claim that engineers' roles get upgraded to semantic validation and accountable oversight follows logically from the premises without overpromising technical novelty.

The soft spot is the evidence base for the scale of the transition. The central assertion that manually authored code will become scarce and disposable rests on general references to productivity impacts and industry perspectives, but the paper supplies no specific metrics, longitudinal data, or case studies showing reduced maintenance loads or why current verification methods cannot extend to handle the volume. Without those anchors, the reorganization reads as a normative recommendation rather than a conclusion forced by demonstrated facts. The stress-test concern about the premise being untested holds up on the text.

This kind of paper suits readers who track how AI might reshape SE training and roles over the next decade. It is not a source for new experiments or formal results, but it could help frame discussions. I would send it to peer review so referees can test the framework against additional literature and suggest where concrete examples or data would strengthen the argument.

Referee Report

2 major / 2 minor

Summary. The paper argues that the proliferation of LLMs and agentic AI systems is shifting code from a scarce, manually crafted artifact to an abundant, disposable commodity. Drawing on a structured synthesis of literature and industry perspectives, it analyzes four dimensions—the evolving role of the engineer, verification as a quality bottleneck, impacts on productivity and maintainability, and broader disciplinary implications—and concludes that software engineering must reorganize around orchestration of multi-agent systems, rigorous verification of AI outputs, and structured human-AI collaboration. A conceptual framework is proposed for transformations in curricula, tooling, lifecycle processes, and governance, while elevating engineers to system-level design and oversight roles.

Significance. If the premise of a fundamental transition holds, the work could meaningfully guide adaptation of the software engineering discipline to AI-generated code abundance by framing new priorities and research challenges such as verification-first lifecycles and prompt traceability. The structured synthesis of literature and emerging perspectives is a clear strength, providing a timely foundation for discussion even as a position paper.

major comments (2)

[Abstract] The central claim that code is 'transitioning from a scarce, carefully crafted artifact to an abundant and increasingly disposable commodity' is load-bearing for the reorganization argument, yet the abstract supplies no specific quantitative metrics, longitudinal data, or cited case studies from the literature synthesis to demonstrate that this shift is fundamental rather than incremental.
[Analysis of observed impacts on productivity and maintainability] The discussion of productivity and maintainability effects underpins the assertion that existing practices are insufficient, but without reported metrics, specific examples of reduced maintenance burden, or evidence that current verification methods cannot scale, the call for reorganization around three new core competencies remains normative rather than empirically anchored.

minor comments (2)

[Abstract] The four key dimensions are enumerated in the abstract but their mapping to subsequent sections is not made explicit, which would improve traceability of the argument.
Terms such as 'agentic workflows' and 'prompt traceability' appear without early definitions, potentially reducing accessibility for readers outside the immediate subfield.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which underscores the importance of grounding our position paper's claims in specific evidence from the literature. We have revised the manuscript to incorporate additional quantitative references, metrics, and examples drawn from the cited studies, thereby strengthening the empirical anchoring of the central arguments while preserving the conceptual and forward-looking character of the work.

Referee: [Abstract] The central claim that code is 'transitioning from a scarce, carefully crafted artifact to an abundant and increasingly disposable commodity' is load-bearing for the reorganization argument, yet the abstract supplies no specific quantitative metrics, longitudinal data, or cited case studies from the literature synthesis to demonstrate that this shift is fundamental rather than incremental.

Authors: We agree that the abstract would benefit from more concrete support for this central claim. In the revised version, we have added brief references to key findings from the literature, such as quantitative data on the exponential growth in AI-generated code contributions (citing specific studies on repository analyses) and industry reports on code generation volumes. We have also included a pointer to the full synthesis in the body of the paper. This makes the abstract more informative without altering its length significantly.
revision: yes

Referee: [Analysis of observed impacts on productivity and maintainability] The discussion of productivity and maintainability effects underpins the assertion that existing practices are insufficient, but without reported metrics, specific examples of reduced maintenance burden, or evidence that current verification methods cannot scale, the call for reorganization around three new core competencies remains normative rather than empirically anchored.

Authors: This is a fair assessment. The original section synthesized qualitative and quantitative insights from prior work but did not always highlight specific metrics explicitly. We have revised it to incorporate concrete examples, including reported productivity improvements (e.g., from controlled studies showing time savings) and maintainability issues (such as higher defect rates in unverified AI code). For verification scalability, we cite evidence from research on the challenges of testing LLM outputs at scale. While the proposal for reorganization has a normative component inherent to position papers, these additions provide a more empirically grounded foundation for the argument.
revision: partial

Circularity Check

0 steps flagged

No circularity: interpretive position paper with no derivations or fitted predictions

This is a position paper synthesizing literature on LLMs and agentic AI impacts. It contains no equations, mathematical derivations, fitted parameters, predictions, or self-referential reductions. The central claim that code is becoming an abundant commodity leading to reorganization follows interpretively from stated premises and external references, without reducing to inputs by construction. No self-citation load-bearing steps, uniqueness theorems, or ansatzes are present. The analysis is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that LLM-driven code generation is rapidly making manual authorship obsolete, without new empirical support or independent evidence supplied in the abstract.

axioms (1)

Domain assumption: The rapid proliferation of large language models and agentic AI systems has created an unprecedented abundance of automatically generated code that challenges the traditional software engineering paradigm centered on manual authorship.
This premise is stated in the opening sentence of the abstract and underpins the entire analysis and proposed reorganization.

pith-pipeline@v0.9.0 · 5527 in / 1414 out tokens · 52251 ms · 2026-05-10T15:55:13.048896+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Agentic Agile-V: From Vibe Coding to Verified Engineering in Software and Hardware Development
cs.SE 2026-05 unverdicted novelty 4.0

Agentic Agile-V uses Agile-V as backbone and a Specify-Constrain-Orchestrate-Prove-Evolve-Verify loop to convert AI agent conversations into traceable engineering artifacts with acceptance evidence.

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages · cited by 1 Pith paper

[1] C. Sadowski and T. Zimmermann, eds., Rethinking Productivity in Software Engineering. Apress, Springer Nature, 2019.
[2] F. Gul and M. I. Khan, “Investigating the influence of continuous integration on software quality and developer productivity,” Spectrum of Engineering Sciences, vol. 3, no. 5, pp. 283–295, 2025.
[3] Z. Liang, L. Hu, Q. Fan, B. Yuan, Q. Zhang, D. Zou, and H. Jin, “Dronetest-copilot: AGI-powered automated detection and repair of flaws in drone autotest suites,” IEEE Transactions on Network Science and Engineering, 2026.
[4] A. Fan, B. Gokkaya, et al., “Large language models for software engineering: Survey and open problems,” in Proceedings of the IEEE/ACM International Conference on Software Engineering: Future of Software Engineering (ICSE-FoSE), pp. 31–53, 2023.
[5] C. Zhong, “Large language models in software engineering: Automation, collaboration, and challenges,” Advances in Engineering Technology Research, 2025.
[6] A. Roychoudhury, “Agentic AI for software: Thoughts from the software engineering community,” 2025.
[7] J. Wang, M. Liu, and K. Zhao, “Agentic workflows in software engineering: Survey of prompt, fine-tuning, and multi-agent paradigms,” ACM Computing Surveys, 2024.
[8] G. Dolcetti and E. Iotti, “A dual perspective review on large language models and code verification,” Frontiers of Computer Science, 2025.
[9] A. Mohan, “LLM-driven verification assistance: Bridging code, coverage and collaboration,” International Journal of Science and Research Archive, 2025.
[10] W. Li, T. Brown, and E. Davis, “A dual perspective review on large language models and code verification,” Journal of Systems and Software, 2024.
[11] X. Chen, Y. Zhang, and H. Li, “Comprehensive evaluation of large language models on software engineering tasks,” in Proceedings of the International Conference on Software Engineering (ICSE), 2023.
[12] R. Gupta, S. Patel, and A. Kumar, “Global expert survey on AI-augmented software development: Productivity, limitations, and role evolution,” IEEE Transactions on Software Engineering, 2024.
[13] J. D. Weisz, S. Kumar, et al., “Examining the use and impact of an AI code assistant on developer productivity and experience in the enterprise,” in CHI Extended Abstracts, 2024.
[14] M. Borg, D. Hewett, et al., “Echoes of AI: Investigating the downstream effects of AI assistants on software maintainability,” 2025.
[15] M. Degerli, “Revisiting software engineering education in the era of large language models: A curriculum adaptation and academic integrity framework,” 2026.
[16] M. Alenezi and M. Akour, “AI-driven innovations in software engineering: A review of current practices and future directions,” Applied Sciences, vol. 15, no. 3, p. 1344, 2025.
[17] A. Rossi, F. Chen, and J. Müller, “MOSAICO: Management, orchestration and supervision of AI-agent communities,” in Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), 2024.
[18] D. Smith, M. Garcia, and T. Nguyen, “HAI-Eval: Measuring human-AI synergy in collaborative coding,” in Proceedings of the ACM SIGSOFT FSE, 2024.
[19] R. Thompson, C. Lee, and N. Ali, “Quality assurance of LLM-generated code: Addressing non-functional quality characteristics,” IEEE Software, 2024.
[20] I. Petrov, J. Sanchez, and Y. Wu, “Vulnerability detection: From formal verification to LLMs and hybrid approaches,” Computers & Security, 2024.
[21] B. Carter, P. Okoro, and K. Yamamoto, “Rethinking autonomy: Preventing failures in AI-driven software engineering,” in Proceedings of the IEEE/ACM International Conference on Automated Software Engineering (ASE), 2024.
[22] H. Nielsen, L. Rodriguez, and M. Tanaka, “Coding with AI: From industrial practices to future education,” Journal of Computing Sciences in Colleges, 2024.
[23] K. Anderson, M. Fischer, and S. O’Connor, “Lost in code generation: Reimagining the role of software models in AI-driven SE,” in Proceedings of the ACM/IEEE International Conference on Model Driven Engineering Languages & Systems (MODELS), 2024.
[24] P. Dubois, L. Schmidt, and R. Patel, “From code writers to code curators: A CEFR-inspired framework,” IEEE Transactions on Education, 2024.
[25] T. Evans, Q. Zhao, and J. Bennett, “Generative AI and empirical software engineering: A paradigm shift,” Empirical Software Engineering, 2024.
[26] M. Russinovich and S. Hanselman, “Redefining the software engineering profession for AI,” Communications of the ACM, vol. 69, no. 4, pp. 41–44, 2026.
[27] L. Banh, F. Holldack, and G. Strobel, “Copiloting the future: How generative AI transforms software engineering,” Information and Software Technology, vol. 183, p. 107751, 2025.
[28] E. D. L. Cruz, H. Le, K. Meduri, G. S. Nadella, and H. Gonaygunta, “Redefining the programmer: Human–AI collaboration, LLMs, and security in modern software engineering,” Computers, Materials & Continua, vol. 85, no. 2, pp. 3569–3582, 2025.
[29] Y. Ren, Y. Liu, T. Ji, and X. Xu, “AI agents and agentic AI—navigating a plethora of concepts for future manufacturing,” Journal of Manufacturing Systems, 2025.
[30] V. Casola, A. Ferrara, and S. Marchesin, “A dual perspective review on large language models and code verification,” Frontiers in Computer Science, vol. 7, p. 1655469, 2025.
[31] J. Heidrich, A. Pretschner, and L. Luthmann, “Application of AI to formal methods—an analysis of current trends,” Empirical Software Engineering, vol. 30, p. 10729, 2025.
[32] M. Fakih et al., “Security degradation in iterative AI code generation: A systematic analysis of the paradox,” in Proc. IEEE Int. Symp. Technology and Society (ISTAS), 2025.
