arxiv: 2603.28592 · v2 · submitted 2026-03-30 · 💻 cs.SE

Debt Behind the AI Boom: A Large-Scale Empirical Study of AI-Generated Code in the Wild

Yue Liu , Ratnadira Widyasari , Yanjie Zhao , Ivana Clairine Irsan , Junkai Chen , David Lo

Pith reviewed 2026-05-14 21:24 UTC · model grok-4.3

classification 💻 cs.SE

keywords: AI-generated code, technical debt, code quality issues, empirical study, GitHub repositories, static analysis, code smells, software maintenance

The pith

AI-generated code introduces persistent technical debt, with 22.7% of issues still surviving in the latest repository versions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tracks hundreds of thousands of AI-written commits across thousands of GitHub repositories to see what happens to the problems they introduce after integration. It finds that code smells dominate the issues introduced, that more than 15 percent of AI commits add at least one problem, and that nearly a quarter of those problems remain in the code at the latest version. A sympathetic reader would care because this suggests AI assistants are adding lasting maintenance burdens rather than temporary problems that developers clean up right away.

Core claim

By constructing a dataset of 302.6k verified AI-authored commits from 6,299 repositories and applying static analysis before and after each commit, the study attributes 484,366 distinct issues to the AI changes. Code smells make up 89.3 percent of these, over 15 percent of commits from each assistant introduce issues, and 22.7 percent of the issues survive to the current repository state.
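The reported percentages can be turned into approximate absolute counts. The sketch below is only back-of-the-envelope arithmetic on the figures quoted above. It assumes that every one of the 484,366 attributed issues was also tracked for survival; the abstract speaks of "tracked" issues, so the true denominator may be smaller and the last number should be read as an upper bound.

```python
# Figures quoted from the paper's abstract.
TOTAL_ISSUES = 484_366     # distinct issues attributed to AI commits
CODE_SMELL_SHARE = 0.893   # fraction of issues that are code smells
SURVIVAL_RATE = 0.227      # fraction of tracked issues still present

# Approximate split between code smells and everything else.
code_smells = round(TOTAL_ISSUES * CODE_SMELL_SHARE)
other_issues = TOTAL_ISSUES - code_smells

# ASSUMPTION: all attributed issues were tracked, so this is an upper bound.
surviving_upper_bound = round(TOTAL_ISSUES * SURVIVAL_RATE)

print(f"code smells:         {code_smells:,}")
print(f"correctness/security: {other_issues:,}")
print(f"surviving (at most): {surviving_upper_bound:,}")
```

Under that assumption, roughly 432.5k of the issues are code smells, about 52k are correctness or security issues, and on the order of 110k issues remain in the latest revisions.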

What carries the argument

Lifecycle tracking of AI-introduced issues via pre- and post-commit static analysis on a large verified commit dataset.
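One way to picture the pre- and post-commit comparison is as a multiset difference over issue fingerprints. The sketch below is illustrative and is not the paper's implementation: the `Issue` fields, the choice to fingerprint on file path, rule ID, and normalized line text (rather than line number), and the example rule IDs are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Hypothetical fingerprint of one static-analysis finding."""
    path: str     # file the finding is reported in
    rule: str     # analyzer rule identifier
    snippet: str  # normalized text of the flagged line (no line number)


def introduced_issues(before: list[Issue], after: list[Issue]) -> list[Issue]:
    """Return findings in the post-commit report that have no counterpart
    in the pre-commit report. Matching is done as a multiset so that two
    identical findings in one file are counted separately."""
    unmatched_before = Counter(before)
    new_findings = []
    for issue in after:
        if unmatched_before[issue] > 0:
            unmatched_before[issue] -= 1  # pre-existing finding, consume match
        else:
            new_findings.append(issue)    # no match left: attributed to the commit
    return new_findings


# Usage with made-up reports for a single commit.
pre = [Issue("app.py", "unused-variable", "tmp = load()")]
post = [
    Issue("app.py", "unused-variable", "tmp = load()"),
    Issue("app.py", "shell-true", "subprocess.run(cmd, shell=True)"),
]
print(introduced_issues(pre, post))
```

Fingerprinting without line numbers keeps a finding from looking new merely because the commit shifted it down the file. It also shows where attribution can go wrong: an edit that reformats a flagged line changes the fingerprint and would be counted as a fresh issue.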

Load-bearing premise

The commits can be accurately verified as AI-authored and static analysis tools correctly attribute detected issues to the AI changes rather than to concurrent human edits or false positives.

What would settle it

A manual audit of a random sample of the surviving issues to confirm they originated from the AI commit and were not introduced or removed by later human edits.
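Such an audit could start from a reproducible random draw of surviving issues, followed by a tally of how many the reviewers confirm. The sketch below assumes issues are identified by integer IDs and that each sampled issue receives a single boolean verdict; both are simplifications for illustration, and the function names are hypothetical.

```python
import random


def draw_audit_sample(issue_ids, sample_size: int, seed: int = 0) -> list:
    """Draw a reproducible simple random sample (without replacement)
    of issue IDs for manual review."""
    population = sorted(issue_ids)  # fixed order so the seed fully determines the draw
    if sample_size > len(population):
        raise ValueError("sample_size exceeds the number of issues")
    return random.Random(seed).sample(population, sample_size)


def confirmed_fraction(verdicts: dict) -> float:
    """Fraction of audited issues that reviewers confirmed as originating
    in the AI commit. `verdicts` maps issue ID -> True/False."""
    if not verdicts:
        raise ValueError("no verdicts recorded")
    return sum(verdicts.values()) / len(verdicts)


# Usage with placeholder IDs and made-up verdicts.
sample = draw_audit_sample(range(10_000), sample_size=5, seed=42)
verdicts = {issue_id: True for issue_id in sample}
verdicts[sample[0]] = False  # pretend one issue turned out to be human-introduced
print(sample, confirmed_fraction(verdicts))
```

Fixing the seed and sorting the population lets a second team redraw exactly the same sample and check the verdicts independently.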

Figures

Figures reproduced from arXiv: 2603.28592 by David Lo, Ivana Clairine Irsan, Junkai Chen, Ratnadira Widyasari, Yanjie Zhao, Yue Liu.

**Figure 2.** Command injection risk introduced by GitHub Copilot

**Figure 3.** Undefined variables introduced by GitHub Copilot in …

**Figure 4.** Overview of our approach.

**Figure 5.** Example of a recorded issue detected by ESLint in a …

**Figure 6.** Overview of our dataset: (a) growth of AI-authored …

**Figure 8.** Net impact of AI coding assistants: issues introduced …

**Figure 9.** Cumulative growth of AI-introduced issues over time, …

**Figure 10.** A TypeScript lint issue introduced by a Claude …

Abstract

AI coding assistants are now widely used in software development. Software developers increasingly integrate AI-generated code into their codebases to improve productivity. Prior studies have shown that AI-generated code may contain code quality issues under controlled settings. However, we still know little about the real-world impact of AI-generated code on software quality and maintenance after it is introduced into production repositories. In other words, it remains unclear whether such issues are quickly fixed or persist and accumulate over time as technical debt. In this paper, we conduct a large-scale empirical study on the technical debt introduced by AI coding assistants in the wild. To achieve that, we built a dataset of 302.6k verified AI-authored commits from 6,299 GitHub repositories, covering five widely used AI coding assistants. For each commit, we run static analysis before and after the change to precisely attribute which code smells, correctness issues, and security issues the AI introduced. We then track each introduced issue from the introducing commit to the latest repository revision to study its lifecycle. Our results show that we identified 484,366 distinct issues, and that code smells are by far the most common type, accounting for 89.3% of all issues. We also find that more than 15% of commits from every AI coding assistant introduce at least one issue, although the rates vary across tools. More importantly, 22.7% of tracked AI-introduced issues still survive at the latest version of the repository. These findings show that AI-generated code can introduce long-term maintenance costs into real software projects and highlight the need for stronger quality assurance in AI-assisted development.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

22.7% of AI-introduced issues persist in real repos, but the attribution step lacks reported validation.

The main result is that 22.7% of the issues linked to AI commits are still present at the latest repository version. Code smells account for 89% of the 484k issues found, and every tool adds at least one issue in more than 15% of its commits. This comes from 302.6k verified AI commits across 6,299 GitHub repositories and five coding assistants.

The paper tracks each issue forward from the introducing commit using static analysis on before-and-after states. That longitudinal view in production code is the clearest step beyond earlier controlled studies. The scale of the dataset and the decision to follow issue survival rather than just count introductions are the parts that hold up. The work gives a usable picture of how often AI changes add maintenance load that does not disappear quickly.

The soft spot is the attribution itself. The claim that static analysis precisely ties issues to the AI change rests on clean diffs and accurate tool output. The abstract gives no numbers on how they confirmed commits contain only AI edits, how they handled concurrent human changes, or how they checked for false positives and pre-existing issues. If those checks are limited, the survival rate could be inflated.

This is aimed at researchers studying AI in software engineering and at teams that maintain large codebases. A reader who wants large-scale evidence on technical debt from AI code will get value from the tracking method and the cross-tool comparison. The paper shows straightforward empirical thinking on a practical question. It deserves a serious referee so the methods on commit verification and issue attribution can be examined in detail.

Referee Report

1 major / 2 minor

Summary. The manuscript constructs a dataset of 302.6k verified AI-authored commits from 6,299 GitHub repositories spanning five AI coding assistants. It applies static analysis before and after each commit to attribute 484,366 issues (89.3% code smells) to the AI changes, then tracks issue survival to the latest revision, reporting that 22.7% of AI-introduced issues persist and that more than 15% of commits from each assistant introduce at least one issue. The central claim is that AI-generated code introduces measurable long-term technical debt in real projects.

Significance. If the attribution methodology holds, the work supplies a large-scale, real-world measurement of AI code quality impact that extends beyond lab studies. The scale of the commit corpus and the before-after tracking design are clear strengths that enable concrete quantification of issue persistence. This could usefully inform tool evaluation and developer guidelines once the attribution precision is documented.

major comments (1)

[Methods] Methods section on dataset construction and static analysis: the claim that before-and-after analysis 'precisely attribute[s]' issues to AI changes rests on two unvalidated assumptions—(1) that every tracked commit contains only AI-generated code with no concurrent human edits in the same diff, and (2) that the static analyzer reports only issues whose root cause lies in the AI diff rather than pre-existing code, false positives, or cross-file interactions. No accuracy metrics, manual validation sample, or inter-rater checks are reported for the 302.6k commits; without these, the 22.7% survival rate cannot be interpreted as a reliable measure of long-term maintenance cost.

minor comments (2)

[Abstract] Abstract: the five AI coding assistants are not named; listing them would improve immediate readability.
[Results] Results: the survival-rate figure (22.7%) is presented without confidence intervals or sensitivity analysis to the static-analysis configuration; adding these would strengthen the presentation even if the core method is later validated.
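The confidence interval requested in the second minor comment could be computed with a Wilson score interval for a binomial proportion. The sketch below is one standard option, not something the paper reports, and the counts in the usage example are hypothetical. It also treats issues as independent, which is optimistic: issues from the same repository are likely correlated, so a repository-level bootstrap would probably give a wider and more honest interval.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.
    z = 1.96 corresponds to a two-sided 95% interval."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Usage with hypothetical counts: 227 surviving issues out of 1,000 tracked.
low, high = wilson_interval(227, 1000)
print(f"survival rate 22.7%, 95% CI [{low:.3f}, {high:.3f}]")
```

With hundreds of thousands of tracked issues the naive interval becomes very narrow, which is why the clustering caveat and the sensitivity to analyzer configuration matter more than sampling error here.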

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the thoughtful and detailed comments, which help clarify the strengths and limitations of our large-scale empirical approach. We address the methodological concerns below and outline targeted revisions to improve transparency and interpretability of our results.

Referee: [Methods] Methods section on dataset construction and static analysis: the claim that before-and-after analysis 'precisely attribute[s]' issues to AI changes rests on two unvalidated assumptions—(1) that every tracked commit contains only AI-generated code with no concurrent human edits in the same diff, and (2) that the static analyzer reports only issues whose root cause lies in the AI diff rather than pre-existing code, false positives, or cross-file interactions. No accuracy metrics, manual validation sample, or inter-rater checks are reported for the 302.6k commits; without these, the 22.7% survival rate cannot be interpreted as a reliable measure of long-term maintenance cost.

Authors: We agree that the attribution relies on assumptions that benefit from explicit validation and discussion. For (1), Section 3.1 details our verification of AI-authored commits via commit-message heuristics, author metadata, and repository signals (e.g., Copilot co-author tags); we selected only commits where these indicators align to reduce mixed-edit noise. We acknowledge residual risk of minor human edits and will add a manual validation subsection reporting results from a random sample of 200 commits (two authors independently labeled each for AI-only vs. mixed content, with inter-rater agreement). For (2), the before-and-after design isolates net changes introduced by the diff, but we recognize that static analyzers such as ESLint, Pylint, and Semgrep can flag pre-existing or interaction-induced issues. We will expand the limitations section to quantify this risk, report the sample-based precision of our AI-commit filter, and discuss how these factors affect interpretation of the 22.7% persistence figure. These additions will allow readers to assess the reliability of our long-term debt measurements without overstating precision.

revision: partial
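The inter-rater agreement mentioned in the rebuttal is commonly summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below shows that calculation for two raters; the label names (`"ai"`, `"mixed"`) and the toy data are assumptions for illustration, and the rebuttal does not say which agreement statistic would be used.

```python
from collections import Counter


def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used one identical label for everything
    return (observed - expected) / (1 - expected)


# Usage with toy labels for ten commits (hypothetical data).
rater_1 = ["ai"] * 8 + ["mixed"] * 2
rater_2 = ["ai"] * 7 + ["mixed"] * 3
print(round(cohens_kappa(rater_1, rater_2), 3))
```

Because AI-only commits are expected to dominate the sample, raw percent agreement would look high even for careless raters; kappa makes that chance component explicit.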

Circularity Check

0 steps flagged

No circularity: pure empirical measurement with no derivations or self-referential predictions

This is a data-driven empirical study that collects 302.6k AI-authored commits, runs static analysis before/after each commit to count introduced issues, and tracks survival to the latest revision. No equations, fitted parameters, predictions, or ansatzes are present. The 22.7% survival rate and 484,366 issue counts are direct tallies from the observed data rather than quantities derived from or equivalent to the input collection process by construction. No self-citation load-bearing steps or uniqueness theorems are invoked. The study is self-contained against external benchmarks (GitHub commits and static-analysis tools) and receives the default non-circular finding for measurement papers.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on accurate attribution of issues to AI commits via static analysis and on the representativeness of the collected GitHub dataset.

axioms (1)

Domain assumption: Static analysis tools accurately detect and attribute code smells, correctness issues, and security issues introduced specifically by AI changes.
The study runs static analysis before and after each commit to identify AI-introduced problems.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems
cs.SE 2026-04 unverdicted novelty 5.0

Claude Code centers on a model-tool while-loop surrounded by permission systems, context compaction, extensibility hooks, subagent delegation, and session storage; the same design questions yield different answers in ...

To Copilot and Beyond: 22 AI Systems Developers Want Built
cs.SE 2026-04 unverdicted novelty 5.0

Survey of 860 developers reveals 22 desired AI systems for non-coding tasks with explicit constraints on authority, provenance, and quality signals, framed as bounded delegation where AI handles assembly work but not ...
