
arxiv: 2605.23135 · v1 · pith:N4PT7R5Xnew · submitted 2026-05-22 · 💻 cs.SE

The Impact of AI Coding Assistants on Software Engineering: A Longitudinal Study

Pith reviewed 2026-05-25 04:07 UTC · model grok-4.3

classification 💻 cs.SE
keywords AI coding assistants · software engineering · developer experience · productivity · longitudinal study · supervisory engineering work · productivity-experience paradox

The pith

Longitudinal surveys reveal AI coding assistants shift engineers toward supervisory verification tasks while eroding aspects of developer experience.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tracks professional software engineers through questionnaires six months apart and finds consistent reports of reduced time on tasks like writing code. It identifies a shift from creation-focused work to verification and proposes supervisory engineering work as the new category of directing, evaluating, and correcting AI outputs. Productivity perceptions remain largely positive and stable, yet the matched group shows a near-doubling in reports of worsened developer experience, especially declining flow state and rising cognitive load. The findings indicate that AI assistants alter both the structure of software engineering work and how engineers experience performing it.

Core claim

Through a longitudinal mixed-methods investigation with 158 participants at the first time point and a matched cohort of 95 across both, the study documents that 82 percent of participants report spending less time writing code, along with a broader reallocation away from creation toward verification activities. It introduces supervisory engineering work as the emerging category encompassing direction, evaluation, and correction of AI-generated output. Productivity perceptions hold steady with 84 percent reporting improvement at both points, but among matched participants the share reporting worsened developer experience in at least one dimension rises from 14 percent to 27 percent, with flow state and cognitive load eroding while feedback loops improved.

What carries the argument

the matched longitudinal cohort of 95 participants combined with the newly proposed category of supervisory engineering work

If this is right

  • Engineers spend measurably less time on code creation and more on verification of AI output.
  • A distinct category of supervisory engineering work emerges that requires new skills in evaluating and correcting AI suggestions.
  • Perceived productivity gains remain stable even as reports of diminished flow and higher cognitive load increase.
  • Feedback loops improve while other dimensions of developer experience decline.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Training programs may need to emphasize verification and oversight skills rather than pure coding fluency.
  • Team processes could be redesigned to protect flow state during AI-assisted work sessions.
  • Longer-term studies might track whether the observed experience erosion stabilizes or leads to higher turnover.

Load-bearing premise

Self-reported changes in time allocation, productivity, and developer experience accurately reflect the effects of AI assistants without meaningful distortion from participant self-selection or attrition.

What would settle it

Objective measurement of task durations or cognitive load in a controlled setting that shows no reduction in coding time or no erosion of flow state would contradict the self-reported patterns.

Figures

Figures reproduced from arXiv: 2605.23135 by Annie Vella, Kelly Blincoe.

Figure 1: AI coding assistant adoption rates at Q1 and Q2. Top 10 tools by maximum usage at either time point shown.
Figure 2: Distribution of initial impressions of AI coding assistants at Q1.
Figure 3: Distribution of anticipated disappointment if AI coding assistants were removed at Q2.
Figure 4: Primary concern distribution at Q1 and Q2. “Other” excluded from analysis.
Figure 5: Task focus shifts across time points. Excerpt from the accompanying text: "… at Q2), with 82% of participants reporting less time by Q2 and only 2% reporting more. Refactoring code and testing also showed means below neutral, with many reporting spending less time on these activities. In contrast, designing and debugging were near neutral at both time points. Reviewing code was the only task with means above the neutral midpoint at both time po…"
Figure 6: Perceived developer experience impact across time points.
Figure 7: Developer experience perception transitions between Q1 and Q2.
Figure 8: Perceived productivity impact across time points.
Figure 9: Productivity perception transitions between Q1 and Q2.
Original abstract

AI coding assistants have become prolific in recent years. Through a longitudinal mixed-methods investigation, we examined how professional software engineers perceive the effects of AI coding assistants in regard to task focus, developer experience, and productivity. Two questionnaires were administered six months apart, yielding 158 eligible participants at the first time point, 101 at the second, and a matched longitudinal cohort of 95. Participants reported spending less time on most development tasks, with 82% reporting less on writing code. We find a broader shift in focus from creation to verification activities. We propose a new category of work we term supervisory engineering work, encompassing the direction, evaluation, and correction of AI output. We also identified a productivity-experience paradox: productivity perceptions held stable, with 84% reporting improvement at both time points, yet among matched participants, the proportion reporting worsened developer experience in at least one dimension nearly doubled from 14% to 27%, with flow state and cognitive load eroding while feedback loops improved. These findings suggest that AI coding assistants are impacting both the nature of software engineering work and how engineers experience it.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper reports results from a longitudinal mixed-methods survey of professional software engineers using AI coding assistants. Two questionnaires administered six months apart yielded 158 participants at T1 and 101 at T2, with a matched cohort of 95. Key claims include reduced time on coding tasks (82% report less time writing code), a shift toward verification activities, introduction of 'supervisory engineering work' as a new category, stable productivity perceptions (84% reporting improvement at both time points), and a 'productivity-experience paradox' in which the proportion of matched participants reporting worsened developer experience in at least one dimension rose from 14% to 27%, with flow state declining and cognitive load rising while feedback loops improved.

Significance. If the longitudinal within-subject changes are robust, the work provides rare empirical evidence on how AI assistants alter both task allocation and subjective developer experience over time. The mixed-methods design and matched cohort allow direct observation of change rather than cross-sectional snapshots, which is a strength for claims about evolving impacts. The proposed 'supervisory engineering work' category could usefully frame future research on verification and oversight activities.

major comments (2)
  1. [Abstract / Results (matched cohort)] Abstract and Results section on the matched cohort: the central productivity-experience paradox rests on the within-subject increase from 14% to 27% reporting worsened developer experience. With 63 dropouts between the initial 158 and the matched 95, the manuscript does not report baseline (T1) comparisons of DX or productivity perceptions between retained and attrited participants, nor any inverse-probability weighting or sensitivity analysis. This omission leaves differential attrition as a plausible alternative explanation for the observed shift.
  2. [Methods] Methods section: the abstract and results report percentages and cohort sizes but provide no details on questionnaire item wording, validation, statistical tests used for the 14%-to-27% change, or controls for AI usage intensity, experience level, or other covariates. Without these, it is not possible to assess whether the reported DX erosion is robust or sensitive to measurement choices.
minor comments (2)
  1. [Abstract / Discussion] The definition and operationalization of 'supervisory engineering work' is introduced in the abstract but would benefit from an explicit coding scheme or example items in the methods or results to allow replication.
  2. [Results] Table or figure presenting the matched-cohort DX changes should include exact item wording, response scales, and confidence intervals or p-values for the reported proportions.
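
The confidence intervals requested in minor comment 2 could be approximated as in the sketch below, which computes Wilson score intervals for the two matched-cohort proportions. The counts 13 and 26 are assumptions: they are the integers out of 95 that round to the reported 14% and 27%, and the page does not state the actual counts. The function name `wilson_interval` is illustrative only.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# ASSUMPTION: counts inferred from the rounded percentages, not taken from the paper.
MATCHED_N = 95
ASSUMED_COUNTS = {"Q1": 13, "Q2": 26}

for label, count in ASSUMED_COUNTS.items():
    low, high = wilson_interval(count, MATCHED_N)
    print(f"{label}: {count / MATCHED_N:.1%} (approx. 95% CI {low:.1%} to {high:.1%})")
```

These intervals describe each time point separately. Because the same 95 people answered both questionnaires, overlap between the two intervals says nothing conclusive about the within-subject change, which calls for a paired test.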

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their careful review and for identifying key areas where additional transparency is needed regarding the longitudinal cohort and methodological details. We address each major comment below.

Point-by-point responses
  1. Referee: [Abstract / Results (matched cohort)] Abstract and Results section on the matched cohort: the central productivity-experience paradox rests on the within-subject increase from 14% to 27% reporting worsened developer experience. With 63 dropouts between the initial 158 and the matched 95, the manuscript does not report baseline (T1) comparisons of DX or productivity perceptions between retained and attrited participants, nor any inverse-probability weighting or sensitivity analysis. This omission leaves differential attrition as a plausible alternative explanation for the observed shift.

    Authors: We acknowledge that differential attrition is a plausible alternative explanation and that the manuscript does not include baseline comparisons or sensitivity analyses. The survey was administered anonymously to protect participant privacy and encourage candid responses on sensitive topics such as developer experience; consequently, no identifying information exists that would permit direct comparison of T1 DX or productivity scores between the 95 matched participants and the 63 who did not complete T2. We will revise the manuscript to state this limitation explicitly, to present the matched-cohort results with appropriate caveats, and to note that inverse-probability weighting was not feasible given the data collected. No new empirical analysis can be added. revision: partial

  2. Referee: [Methods] Methods section: the abstract and results report percentages and cohort sizes but provide no details on questionnaire item wording, validation, statistical tests used for the 14%-to-27% change, or controls for AI usage intensity, experience level, or other covariates. Without these, it is not possible to assess whether the reported DX erosion is robust or sensitive to measurement choices.

    Authors: We agree that the Methods section requires expansion for reproducibility. In the revised manuscript we will add: (1) verbatim wording of the developer-experience and productivity items, (2) information on any pilot testing or validation steps performed, (3) the exact statistical procedure used to evaluate the paired change from 14% to 27% (McNemar’s test for paired proportions), and (4) exploratory analyses that control for self-reported AI usage intensity and years of professional experience. The primary analysis remained descriptive to preserve power in the modest matched sample; these additions will clarify the robustness of the findings. revision: yes
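
The rebuttal names McNemar's test for the paired 14%-to-27% change but the page gives no discordant-pair counts. The sketch below shows how an exact two-sided McNemar p-value is computed from such counts. The values 18 and 5 are hypothetical, chosen only so that their difference (13) matches a move from an assumed 13 to an assumed 26 of 95 participants; they are not results from the paper, and `mcnemar_exact` is an illustrative name.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: pairs that moved from "not worsened" at T1 to "worsened" at T2
    c: pairs that moved from "worsened" at T1 to "not worsened" at T2
    Under the null hypothesis each discordant pair is equally likely to go
    in either direction, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of change
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# HYPOTHETICAL discordant counts for illustration only (not from the paper).
hypothetical_b, hypothetical_c = 18, 5
p_value = mcnemar_exact(hypothetical_b, hypothetical_c)
print(f"exact McNemar p-value for b={hypothetical_b}, c={hypothetical_c}: {p_value:.4f}")
```

Only the discordant pairs enter the test, so the same net change of 13 participants can yield very different p-values depending on how many participants moved in the opposite direction. This is why reporting the full transition table matters.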

standing simulated objections not resolved
  • Baseline (T1) comparisons between retained and attrited participants cannot be performed because the survey was anonymous and no identifying or linking data were collected.
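
When retained and attrited participants cannot be compared, one assumption-free fallback is to report worst-case bounds: treat every non-retained participant first as "not worsened" and then as "worsened". The sketch below does this for the T2 share, assuming 26 of the 95 matched participants reported worsened developer experience (a count inferred from the rounded 27%, not stated on the page). This is an illustration of the idea, not an analysis the authors performed.

```python
from typing import Tuple


def worst_case_bounds(observed_yes: int, observed_n: int, missing_n: int) -> Tuple[float, float]:
    """Bounds on a proportion in the full sample when missing_n outcomes are unknown."""
    total = observed_n + missing_n
    lower = observed_yes / total                # all missing cases answer "no"
    upper = (observed_yes + missing_n) / total  # all missing cases answer "yes"
    return lower, upper


# ASSUMPTION: 26 of 95 corresponds to the reported 27%; 63 = 158 - 95 not in the matched cohort.
ASSUMED_WORSENED_T2 = 26
MATCHED_N = 95
NOT_MATCHED_N = 158 - MATCHED_N

low, high = worst_case_bounds(ASSUMED_WORSENED_T2, MATCHED_N, NOT_MATCHED_N)
print(f"T2 worsened-DX share among all 158 T1 participants lies between {low:.1%} and {high:.1%}")
```

The width of the resulting range shows how much the matched-cohort figure depends on who stayed in the study.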

Circularity Check

0 steps flagged

No circularity: empirical survey with direct self-reports and no derivations

full rationale

This is a longitudinal mixed-methods survey study relying on participant questionnaires at two time points. Claims about task time shifts, supervisory engineering work, and the productivity-experience paradox are presented as direct summaries of self-reported data from the matched cohort of 95. No equations, first-principles derivations, fitted parameters, or mathematical predictions appear in the abstract or described structure. The productivity-experience paradox is an observed pattern in the responses, not a constructed equivalence. Attrition and self-selection are methodological limitations but do not constitute circularity in any derivation chain. The paper is self-contained against external benchmarks as an empirical report.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claims rest on survey methodology assumptions and introduce one new conceptual entity without external validation.

axioms (1)
  • domain assumption: Participant self-reports on time allocation, productivity, and experience dimensions are reliable indicators of actual changes induced by AI tools.
    The study depends entirely on questionnaire responses without objective performance metrics or observational data.
invented entities (1)
  • supervisory engineering work (no independent evidence)
    purpose: Categorize the activities of directing, evaluating, and correcting AI-generated code.
    Proposed as a new category based on observed shifts in task focus.


