PatchTrack: A Comprehensive Analysis of ChatGPT's Influence on Pull Request Outcomes
Pith reviewed 2026-05-22 16:08 UTC · model grok-4.3
The pith
Developers rarely adopt ChatGPT-generated code fully in pull requests, instead using it as a starting point that shapes adaptation and review discussions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The study of 338 pull requests with self-admitted ChatGPT usage, covering 645 AI-generated snippets and 3,486 developer patches, finds a median integration rate of 25 percent. Qualitative examination of 89 cases with integrated patches identifies recurring patterns of structural integration, selective extraction, and iterative refinement. Developers treat AI output as a starting point rather than a final implementation. Even without direct adoption, ChatGPT affects workflows through conceptual guidance, documentation, and debugging. Integration decisions depend on contextual fit, integration effort, maintainer trust, and established review norms rather than serving as direct measures of code correctness.
What carries the argument
PatchTrack, an automated classifier that determines whether AI-generated patches were applied, partially reused, or not integrated into pull requests.
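To make the three outcome categories and the median figure concrete, the sketch below labels a snippet by how many of its lines reappear in the developer patches and then takes the median of per-PR rates. It is a minimal illustration and not PatchTrack's actual algorithm: the line-overlap heuristic, the 0.9 threshold, the function names, and the definition of integration rate as the share of a PR's snippets that are applied or partially reused are all assumptions made here.

```python
from statistics import median

# Hypothetical label names; the three categories mirror those named in the text.
APPLIED = "applied"
PARTIAL = "partially_reused"
NOT_INTEGRATED = "not_integrated"


def classify_snippet(snippet_lines, patch_lines, full_threshold=0.9):
    """Label one AI-generated snippet by the share of its non-blank lines
    that also appear among the lines of the developer patches.

    The overlap heuristic and the threshold are assumptions for illustration.
    """
    wanted = {line.strip() for line in snippet_lines if line.strip()}
    if not wanted:
        return NOT_INTEGRATED
    added = {line.strip() for line in patch_lines if line.strip()}
    overlap = len(wanted & added) / len(wanted)
    if overlap >= full_threshold:
        return APPLIED
    if overlap > 0:
        return PARTIAL
    return NOT_INTEGRATED


def integration_rate(labels):
    """Share of a pull request's snippets that were applied or partially reused
    (an assumed definition; the paper may define the rate differently)."""
    if not labels:
        return 0.0
    integrated = sum(1 for label in labels if label != NOT_INTEGRATED)
    return integrated / len(labels)


def median_integration_rate(labels_per_pr):
    """Median of the per-PR integration rates."""
    return median(integration_rate(labels) for labels in labels_per_pr)
```

Taking the median over pull requests, rather than pooling all snippets, keeps a few snippet-heavy PRs from dominating the summary figure.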
If this is right
- Full adoption of ChatGPT-generated code is uncommon in pull request workflows.
- Developers typically treat AI output as a starting point rather than a final implementation.
- ChatGPT influences workflows through conceptual guidance, documentation, and debugging strategies even when code is not directly adopted.
- Integration decisions reflect contextual fit, integration effort, maintainer trust, and established pull request review norms.
Where Pith is reading between the lines
- AI coding tools could be designed to better support partial reuse and adaptation of suggestions rather than aiming for complete replacements.
- Similar analyses of other large language models might show whether the observed integration patterns hold beyond ChatGPT.
- The work implies that AI assistance may gradually alter established norms in code review and collaboration.
Load-bearing premise
The dataset of 338 pull requests containing self-admitted ChatGPT usage accurately represents typical AI-assisted development without significant selection or reporting bias.
What would settle it
Measuring integration rates in a larger set of pull requests from projects known to use AI tools but identified without requiring self-admission and finding substantially different rates would challenge the representativeness of the observed patterns.
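One way such a comparison could be run is a permutation test on the difference in median integration rates between the self-admitted sample and an independently identified sample. The sketch below is an assumption about how the check might be set up, not something the paper reports; the function name, parameters, and choice of test are hypothetical.

```python
import random
from statistics import median


def median_diff_permutation_test(sample_a, sample_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in medians.

    sample_a and sample_b are lists of per-PR integration rates.
    Returns the observed absolute difference and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(median(sample_a) - median(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign PRs to the two groups at random and recompute the gap.
        rng.shuffle(pooled)
        diff = abs(median(pooled[:n_a]) - median(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

A small p-value would indicate that the two samples differ in their typical integration rate by more than random group assignment would explain, which is the kind of evidence that would challenge representativeness.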
Original abstract
The rapid adoption of large language models (LLMs) like ChatGPT has introduced new dynamics in software development, particularly within pull request workflows. While prior research has examined the quality of AI-generated code, less is known about how developers evaluate, adapt, and integrate these suggestions in real-world collaboration. We analyze 338 pull requests from 255 GitHub repositories containing self-admitted ChatGPT usage, comprising 645 AI-generated snippets and 3,486 developer-authored patches. To support this analysis at scale, we use PatchTrack, an automated classifier that identifies whether AI-generated patches were applied, partially reused, or not integrated. Our findings reveal that full adoption of ChatGPT-generated code is uncommon: the median integration rate is 25%. Qualitative analysis of 89 pull requests with integrated patches reveals recurring patterns of structural integration, selective extraction, and iterative refinement, indicating that developers typically treat AI output as a starting point rather than a final implementation. Even when code is not directly adopted, ChatGPT influences workflows through conceptual guidance, documentation, and debugging strategies. Integration decisions reflect contextual fit, integration effort, maintainer trust, and established pull request review norms rather than serving as direct indicators of code correctness. Overall, this study provides empirical insight into AI-mediated decision-making in collaborative software development, showing that the influence of generative AI extends beyond patch generation to how developers reason about, adapt, and negotiate code during review within pull request workflows. These findings inform the design of AI-assisted tools and support more transparent and effective use of LLMs in practice.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript analyzes 338 pull requests from 255 GitHub repositories containing self-admitted ChatGPT usage, comprising 645 AI-generated snippets and 3,486 developer-authored patches. It introduces PatchTrack, an automated classifier to categorize whether AI patches were fully applied, partially reused, or not integrated. Findings show a median integration rate of 25%, with qualitative review of 89 integrated cases revealing patterns of structural integration, selective extraction, and iterative refinement. The study concludes that developers treat AI output as a starting point, that ChatGPT influences workflows via conceptual guidance even without direct adoption, and that integration decisions depend on contextual fit, effort, trust, and review norms rather than code correctness alone. Overall, the paper claims that generative AI shapes not only patch generation but also reasoning, adaptation, and negotiation during PR review.
Significance. If the core empirical patterns hold after addressing methodological gaps, this study offers meaningful insight into LLM use in real collaborative software development. The scale of the self-admitted dataset and the mixed quantitative-qualitative approach provide concrete observations on integration rates and adaptation behaviors that go beyond synthetic benchmarks. Strengths include the focus on actual PR workflows and the identification of recurring developer strategies; these can usefully inform tool design and guidelines for transparent LLM adoption. The work is a solid empirical contribution to the growing literature on AI-assisted development.
major comments (2)
- [Abstract and data collection] Abstract and data collection description: The central claim that the influence of generative AI extends to how developers reason about, adapt, and negotiate code in PR workflows rests on the 338 self-admitted PRs (and the 89 qualitatively reviewed) being representative of typical AI-assisted development. Self-admission selects for developers willing to disclose usage, which may correlate with higher transparency, different trust levels, or project norms that favor integration; this selection effect is not mitigated or quantified in the described collection approach and directly affects the generalizability of the 25% median adoption figure and the qualitative themes.
- [PatchTrack classifier and qualitative analysis] PatchTrack classifier and qualitative analysis sections: The manuscript provides insufficient detail on validation of the automated classifier (e.g., precision, recall, or agreement with manual labels), inter-rater reliability for the coding of the 89 cases, and any controls for confounding variables such as PR size, complexity, or repository-specific review norms. These elements are load-bearing for the reliability of the reported integration patterns and the distinction between full, partial, and non-integration.
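The validation measures named in the second major comment can be sketched as follows. This is an illustrative computation of per-category precision and recall against manual labels, plus Cohen's kappa for two coders; the function names and label strings are assumptions, and the manuscript's own validation procedure is not described in this review.

```python
from collections import Counter


def precision_recall(gold, predicted, label):
    """Precision and recall of the classifier for one category,
    treating manual labels (gold) as ground truth."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two coders corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

Reporting precision and recall per category matters here because a classifier can score well overall while confusing partial reuse with non-integration, which is exactly the distinction the findings rest on.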
minor comments (2)
- [Abstract] The abstract would benefit from an explicit sentence on the limitations of relying on self-admitted usage to help readers calibrate expectations about generalizability.
- [Results] Notation for the three integration categories (full, partial, none) should be defined consistently in the text and any tables or figures that report the 25% median rate.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which identifies key areas for improving methodological transparency and acknowledging limitations. We have revised the manuscript to address both major comments by expanding the limitations discussion and adding validation details.
Point-by-point responses
- Referee: [Abstract and data collection] Abstract and data collection description: The central claim that the influence of generative AI extends to how developers reason about, adapt, and negotiate code in PR workflows rests on the 338 self-admitted PRs (and the 89 qualitatively reviewed) being representative of typical AI-assisted development. Self-admission selects for developers willing to disclose usage, which may correlate with higher transparency, different trust levels, or project norms that favor integration; this selection effect is not mitigated or quantified in the described collection approach and directly affects the generalizability of the 25% median adoption figure and the qualitative themes.
  Authors: We acknowledge that reliance on self-admitted ChatGPT usage introduces a selection bias, as developers who publicly disclose AI assistance may differ systematically in transparency, trust levels, or project norms from those who do not. This is an inherent challenge when studying emerging practices without platform-level logging of AI tool use. In the revised manuscript, we have added an expanded Limitations section that explicitly discusses this selection effect, its potential influence on the observed 25% median integration rate and qualitative themes, and the resulting bounds on generalizability. We have clarified that findings are presented as observations from disclosed cases rather than claims of representativeness across all AI-assisted development, and we suggest directions for future work using complementary identification methods.
  Revision: yes
- Referee: [PatchTrack classifier and qualitative analysis] PatchTrack classifier and qualitative analysis sections: The manuscript provides insufficient detail on validation of the automated classifier (e.g., precision, recall, or agreement with manual labels), inter-rater reliability for the coding of the 89 cases, and any controls for confounding variables such as PR size, complexity, or repository-specific review norms. These elements are load-bearing for the reliability of the reported integration patterns and the distinction between full, partial, and non-integration.
  Authors: We agree that additional methodological detail is required to support the reliability of PatchTrack and the qualitative findings. In the revised manuscript, we have inserted a dedicated validation subsection for the PatchTrack classifier that reports agreement metrics with manual labels on a held-out set. For the qualitative coding of the 89 integrated cases, we now include inter-rater reliability statistics. We have also added explicit discussion of how we considered potential confounders such as PR size, complexity, and repository norms, including stratification where data permitted and sensitivity checks in the thematic analysis. These revisions directly address the load-bearing elements raised.
  Revision: yes
- Fully quantifying the magnitude of selection bias from self-admission would require a separate comparative study of undisclosed AI usage, which exceeds the scope of this observational analysis.
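The stratification by PR size mentioned in the authors' second response could look like the sketch below, which groups pull requests into size buckets and reports the median integration rate within each. The bucket boundaries, the use of changed lines as the size measure, and the function names are assumptions chosen for illustration; the rebuttal does not specify them.

```python
from collections import defaultdict
from statistics import median


def size_stratum(changed_lines, small_max=50, medium_max=500):
    """Assign a PR to a size bucket by its number of changed lines.
    The cut-offs are hypothetical."""
    if changed_lines <= small_max:
        return "small"
    if changed_lines <= medium_max:
        return "medium"
    return "large"


def stratified_medians(prs):
    """Median integration rate per size bucket.

    prs is an iterable of (changed_lines, integration_rate) pairs.
    """
    buckets = defaultdict(list)
    for changed_lines, rate in prs:
        buckets[size_stratum(changed_lines)].append(rate)
    return {name: median(rates) for name, rates in buckets.items()}
```

If the per-bucket medians diverge sharply from the overall figure, PR size is a plausible confounder and the pooled 25% median should be read with that in mind.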
Circularity Check
No circularity: purely observational empirical study with no derivations or self-referential reductions
Full rationale
This paper conducts an empirical analysis of 338 GitHub pull requests containing self-admitted ChatGPT usage, using data collection, an automated classifier (PatchTrack) for integration patterns, and qualitative coding on a subset of 89 PRs. All claims rest on observed frequencies, median integration rates, and recurring patterns identified in the collected data rather than any mathematical derivation, fitted-parameter prediction, or self-citation chain that reduces the central findings to the inputs by construction. The study is self-contained against external benchmarks of GitHub data and qualitative methods, with no load-bearing steps that equate outputs to inputs via definition or prior author work.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Self-admitted ChatGPT usage in pull request descriptions serves as a reliable indicator of actual AI assistance without substantial false positives or under-reporting bias.
invented entities (1)
- PatchTrack automated classifier: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (unclear)
  Unclear relation between the paper passage and the cited Recognition theorem.
  "We analyze 338 pull requests ... use PatchTrack, an automated classifier that identifies whether AI-generated patches were applied, partially reused, or not integrated. ... median integration rate is 25%."
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Unclear relation between the paper passage and the cited Recognition theorem.
  "Qualitative analysis ... recurring patterns of structural integration, selective extraction, and iterative refinement"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- AgenticFlict: A Large-Scale Dataset of Merge Conflicts in AI Coding Agent Pull Requests on GitHub
  AgenticFlict is a public dataset of 29K+ textual merge conflicts from AI agent PRs, collected via merge simulation on 107K processed PRs and showing a 27.67% conflict rate with variation across agents.
- How AI Coding Agents Modify Code: A Large-Scale Study of GitHub Pull Requests
  AI coding agents produce pull requests with substantially more commits and slightly higher description-to-diff similarity than human developers, based on analysis of 29,095 merged PRs.