pith. machine review for the scientific record.

arXiv: 2603.28592 · v2 · submitted 2026-03-30 · 💻 cs.SE

Debt Behind the AI Boom: A Large-Scale Empirical Study of AI-Generated Code in the Wild

Pith reviewed 2026-05-14 21:24 UTC · model grok-4.3

classification 💻 cs.SE
keywords AI-generated code · technical debt · code quality issues · empirical study · GitHub repositories · static analysis · code smells · software maintenance

The pith

AI-generated code introduces persistent technical debt, with 22.7% of issues still surviving in the latest repository versions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tracks hundreds of thousands of AI-written commits across thousands of GitHub repositories to see what happens to the problems they introduce after integration. It finds that code smells dominate the issues introduced, that more than 15 percent of AI commits add at least one problem, and that nearly a quarter of those problems remain in the code at the latest version. A sympathetic reader would care because this suggests AI assistants are adding lasting maintenance burdens rather than temporary problems that developers clean up right away.

Core claim

By constructing a dataset of 302.6k verified AI-authored commits from 6,299 repositories and applying static analysis before and after each commit, the study attributes 484,366 distinct issues to the AI changes. Code smells make up 89.3 percent of these, over 15 percent of commits from each assistant introduce issues, and 22.7 percent of the issues survive to the current repository state.

What carries the argument

Lifecycle tracking of AI-introduced issues via pre- and post-commit static analysis on a large verified commit dataset.
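
As an illustration of the pre- and post-commit comparison, the sketch below attributes to a commit every finding that appears in the post-commit analysis but not in the pre-commit one. The `Issue` fields, the fingerprint without line numbers, the `introduced_issues` helper, and the rule names are assumptions made for this example, not the paper's implementation.

```python
from collections import Counter
from typing import NamedTuple


class Issue(NamedTuple):
    """One static-analysis finding, keyed without a line number (hypothetical schema)."""
    path: str
    rule: str
    snippet: str  # normalized source text the finding points at


def introduced_issues(before: list[Issue], after: list[Issue]) -> list[Issue]:
    """Return findings present after the commit but not before it.

    Multiset subtraction is used so that a second copy of an
    already-present finding still counts as newly introduced.
    """
    delta = Counter(after) - Counter(before)
    return sorted(delta.elements())


# Hypothetical analyzer output for the parent revision and the commit itself.
before = [Issue("app.py", "unused-import", "import os")]
after = [
    Issue("app.py", "unused-import", "import os"),
    Issue("app.py", "subprocess-shell-true", "run(cmd, shell=True)"),
]
new = introduced_issues(before, after)
```

Keying on file, rule, and snippet rather than line number keeps a finding from looking new when unrelated edits merely shift it. This is one possible design choice, and it still cannot separate AI edits from human edits inside the same diff.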

Load-bearing premise

The commits can be accurately verified as AI-authored and static analysis tools correctly attribute detected issues to the AI changes rather than to concurrent human edits or false positives.
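
One verification signal named in the simulated rebuttal further down this page is the co-author tag. A minimal sketch of that signal, assuming commits carry Git `Co-authored-by:` trailers: the marker names, the example addresses, and the `ai_coauthors` helper are hypothetical, and a trailer match alone does not show that the diff is free of human edits.

```python
import re

# Hypothetical marker substrings; a real study would need a vetted list per tool.
ASSISTANT_MARKERS = ("copilot", "claude")

# Matches one trailer line such as "Co-authored-by: Name <email>".
TRAILER_RE = re.compile(
    r"^co-authored-by:[ \t]*(?P<name>[^<\n]+?)[ \t]*(?:<[^>\n]*>)?[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def ai_coauthors(message: str, markers: tuple[str, ...] = ASSISTANT_MARKERS) -> list[str]:
    """Return co-author names in a commit message that contain an assistant marker."""
    names = [m.group("name") for m in TRAILER_RE.finditer(message)]
    return [n for n in names if any(k in n.lower() for k in markers)]


# Hypothetical commit message with one assistant trailer and one human trailer.
message = (
    "Fix parsing bug\n\n"
    "Co-authored-by: Copilot <copilot@example.com>\n"
    "Co-authored-by: Jane Doe <jane@example.com>\n"
)
matches = ai_coauthors(message)
```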

What would settle it

A manual audit of a random sample of the surviving issues to confirm they originated from the AI commit and were not introduced or removed by later human edits.

Figures

Figures reproduced from arXiv: 2603.28592 by David Lo, Ivana Clairine Irsan, Junkai Chen, Ratnadira Widyasari, Yanjie Zhao, Yue Liu.

Figure 1: Contributor statistics for the Anthropic …
Figure 2: Command injection risk introduced by GitHub Copilot
Figure 3: Undefined variables introduced by GitHub Copilot in …
Figure 4: Overview of our approach. … and full commit message. This step allows us to identify AI-authored commits that are not directly visible during repository discovery (e.g., commits on non-default branches, commits outside the observation window). Filtering. To focus on established open-source projects, we filter out repositories that do not meet our study criteria. We keep only repositories with at least 100 Gi…
Figure 5: Example of a recorded issue detected by ESLint in a …
Figure 6: Overview of our dataset: (a) growth of AI-authored …
Figure 8: Net impact of AI coding assistants: issues introduced
Figure 9: Cumulative growth of AI-introduced issues over time, …
Figure 10: A TypeScript lint issue introduced by a Claude …

Original abstract

AI coding assistants are now widely used in software development. Software developers increasingly integrate AI-generated code into their codebases to improve productivity. Prior studies have shown that AI-generated code may contain code quality issues under controlled settings. However, we still know little about the real-world impact of AI-generated code on software quality and maintenance after it is introduced into production repositories. In other words, it remains unclear whether such issues are quickly fixed or persist and accumulate over time as technical debt. In this paper, we conduct a large-scale empirical study on the technical debt introduced by AI coding assistants in the wild. To achieve that, we built a dataset of 302.6k verified AI-authored commits from 6,299 GitHub repositories, covering five widely used AI coding assistants. For each commit, we run static analysis before and after the change to precisely attribute which code smells, correctness issues, and security issues the AI introduced. We then track each introduced issue from the introducing commit to the latest repository revision to study its lifecycle. Our results show that we identified 484,366 distinct issues, and that code smells are by far the most common type, accounting for 89.3% of all issues. We also find that more than 15% of commits from every AI coding assistant introduce at least one issue, although the rates vary across tools. More importantly, 22.7% of tracked AI-introduced issues still survive at the latest version of the repository. These findings show that AI-generated code can introduce long-term maintenance costs into real software projects and highlight the need for stronger quality assurance in AI-assisted development.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript constructs a dataset of 302.6k verified AI-authored commits from 6,299 GitHub repositories spanning five AI coding assistants. It applies static analysis before and after each commit to attribute 484,366 issues (89.3% code smells) to the AI changes, then tracks issue survival to the latest revision, reporting that 22.7% of AI-introduced issues persist and that more than 15% of commits from each assistant introduce at least one issue. The central claim is that AI-generated code introduces measurable long-term technical debt in real projects.

Significance. If the attribution methodology holds, the work supplies a large-scale, real-world measurement of AI code quality impact that extends beyond lab studies. The scale of the commit corpus and the before-after tracking design are clear strengths that enable concrete quantification of issue persistence. This could usefully inform tool evaluation and developer guidelines once the attribution precision is documented.

major comments (1)
  1. [Methods] Methods section on dataset construction and static analysis: the claim that before-and-after analysis 'precisely attribute[s]' issues to AI changes rests on two unvalidated assumptions—(1) that every tracked commit contains only AI-generated code with no concurrent human edits in the same diff, and (2) that the static analyzer reports only issues whose root cause lies in the AI diff rather than pre-existing code, false positives, or cross-file interactions. No accuracy metrics, manual validation sample, or inter-rater checks are reported for the 302.6k commits; without these, the 22.7% survival rate cannot be interpreted as a reliable measure of long-term maintenance cost.
minor comments (2)
  1. [Abstract] Abstract: the five AI coding assistants are not named; listing them would improve immediate readability.
  2. [Results] Results: the survival-rate figure (22.7%) is presented without confidence intervals or sensitivity analysis to the static-analysis configuration; adding these would strengthen the presentation even if the core method is later validated.
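
To make the confidence-interval request concrete, a Wilson score interval for a proportion could be computed as sketched below. The counts in the usage lines are hypothetical (this page does not state the number of tracked issues), and `wilson_interval` is a name chosen for this example.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Hypothetical sample: 227 surviving issues out of 1,000 tracked ones.
low, high = wilson_interval(227, 1000)
```

Such an interval reflects sampling error only. With hundreds of thousands of issues it would be very narrow, so the attribution concerns raised in the major comment would likely dominate the real uncertainty.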

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the thoughtful and detailed comments, which help clarify the strengths and limitations of our large-scale empirical approach. We address the methodological concerns below and outline targeted revisions to improve transparency and interpretability of our results.

Point-by-point responses
  1. Referee: [Methods] Methods section on dataset construction and static analysis: the claim that before-and-after analysis 'precisely attribute[s]' issues to AI changes rests on two unvalidated assumptions—(1) that every tracked commit contains only AI-generated code with no concurrent human edits in the same diff, and (2) that the static analyzer reports only issues whose root cause lies in the AI diff rather than pre-existing code, false positives, or cross-file interactions. No accuracy metrics, manual validation sample, or inter-rater checks are reported for the 302.6k commits; without these, the 22.7% survival rate cannot be interpreted as a reliable measure of long-term maintenance cost.

    Authors: We agree that the attribution relies on assumptions that benefit from explicit validation and discussion. For (1), Section 3.1 details our verification of AI-authored commits via commit-message heuristics, author metadata, and repository signals (e.g., Copilot co-author tags); we selected only commits where these indicators align to reduce mixed-edit noise. We acknowledge residual risk of minor human edits and will add a manual validation subsection reporting results from a random sample of 200 commits (two authors independently labeled each for AI-only vs. mixed content, with inter-rater agreement). For (2), the before-and-after design isolates net changes introduced by the diff, but we recognize that static analyzers (SonarQube in our case) can flag pre-existing or interaction-induced issues. We will expand the limitations section to quantify this risk, report the sample-based precision of our AI-commit filter, and discuss how these factors affect interpretation of the 22.7% persistence figure. These additions will allow readers to assess the reliability of our long-term debt measurements without overstating precision.

    revision: partial
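
The inter-rater agreement promised for the 200-commit sample could be summarized with Cohen's kappa. The rebuttal does not name a statistic, so the choice of kappa, the `cohens_kappa` helper, and the ten example labels below are assumptions for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Share of items on which both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for ten commits: "ai" = AI-only, "mixed" = AI plus human edits.
rater_a = ["ai"] * 6 + ["mixed"] * 4
rater_b = ["ai"] * 5 + ["mixed"] * 5
kappa = cohens_kappa(rater_a, rater_b)
```

In this made-up sample the raters disagree on one commit out of ten, so observed agreement is 0.9 against 0.5 expected by chance.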

Circularity Check

0 steps flagged

No circularity: pure empirical measurement with no derivations or self-referential predictions

full rationale

This is a data-driven empirical study that collects 302.6k AI-authored commits, runs static analysis before/after each commit to count introduced issues, and tracks survival to the latest revision. No equations, fitted parameters, predictions, or ansatzes are present. The 22.7% survival rate and 484,366 issue counts are direct tallies from the observed data rather than quantities derived from or equivalent to the input collection process by construction. No self-citation load-bearing steps or uniqueness theorems are invoked. The study is self-contained against external benchmarks (GitHub commits and static-analysis tools) and receives the default non-circular finding for measurement papers.
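
The direct-tally character of the survival figure can be sketched as follows: count how many introduced issues are still reported at the latest revision, per category. The fingerprint strings, the `survival_by_category` helper, and the example data are hypothetical; the category names come from the abstract.

```python
from collections import defaultdict


def survival_by_category(introduced: dict[str, str], at_head: set[str]) -> dict[str, float]:
    """Share of introduced issues still present at the latest revision, per category.

    `introduced` maps an issue fingerprint to its category; `at_head` holds the
    fingerprints reported when the latest revision is analyzed.
    """
    totals: dict[str, int] = defaultdict(int)
    alive: dict[str, int] = defaultdict(int)
    for fingerprint, category in introduced.items():
        totals[category] += 1
        if fingerprint in at_head:
            alive[category] += 1
    return {category: alive[category] / totals[category] for category in totals}


# Hypothetical data: four introduced issues, two of which are still reported at head.
introduced = {
    "f1": "code smell",
    "f2": "code smell",
    "f3": "security",
    "f4": "correctness",
}
at_head = {"f1", "f3", "unrelated"}
rates = survival_by_category(introduced, at_head)
```

Because the rate is a plain ratio of observed counts, its reliability depends entirely on how well the fingerprints follow an issue across later edits.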

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on accurate attribution of issues to AI commits via static analysis and on the representativeness of the collected GitHub dataset.

axioms (1)
  • domain assumption: Static analysis tools accurately detect and attribute code smells, correctness issues, and security issues introduced specifically by AI changes.
    The study runs static analysis before and after each commit to identify AI-introduced problems.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems

    cs.SE 2026-04 unverdicted novelty 5.0

    Claude Code centers on a model-tool while-loop surrounded by permission systems, context compaction, extensibility hooks, subagent delegation, and session storage; the same design questions yield different answers in ...

  2. To Copilot and Beyond: 22 AI Systems Developers Want Built

    cs.SE 2026-04 unverdicted novelty 5.0

    Survey of 860 developers reveals 22 desired AI systems for non-coding tasks with explicit constraints on authority, provenance, and quality signals, framed as bounded delegation where AI handles assembly work but not ...
