pith. machine review for the scientific record.

arxiv: 2604.16404 · v1 · submitted 2026-03-31 · 💻 cs.SE

Recognition: no theorem link

On the Use of Commit Messages for Corrective Software Maintenance: A Systematic Mapping Study


Pith reviewed 2026-05-13 23:31 UTC · model grok-4.3

classification 💻 cs.SE
keywords commit messages · corrective maintenance · systematic mapping study · bug analysis · repository mining · natural language processing · software evolution · bug fix identification

The pith

Commit messages aid bug analysis and fixes but frequently omit details needed to understand code changes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This mapping study reviews 97 papers published from 2004 to May 2025 on the role of commit messages in corrective software maintenance. It finds rising use of these messages, paired with code diffs, for identifying bugs and documenting fixes through repository mining and NLP or machine learning techniques. Developers are the primary stakeholders considered, and messages are shown to carry useful information for tracking software evolution. However, the studies consistently note that messages often lack sufficient detail to convey the intent behind changes. Other maintenance goals such as automated repair or security receive far less attention in the literature.

Core claim

A systematic mapping of 97 primary sources establishes that commit messages support corrective maintenance tasks, especially bug analysis and bug fix identification: they carry crucial information that helps stakeholders understand and improve the code base across the software evolution process. Yet they often lack important information and are not enough to convey the intent of code changes to future readers.

What carries the argument

Systematic mapping study of 97 primary sources that catalogs goals, combined artifacts such as commit diffs, methodologies including repository mining and NLP/AI, and stakeholder roles in corrective maintenance.
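
A minimal sketch of the methodological core the mapping identifies, repository mining plus NLP over commit messages: the keyword heuristic below is an assumption for illustration, in the spirit of classic bug-fix identification studies, not a classifier taken from the paper or any surveyed source.

```python
import re

# Illustrative keyword heuristic; the keyword list and regex are
# assumptions of this sketch, not any surveyed paper's method.
BUG_FIX_PATTERN = re.compile(
    r"\b(fix(e[sd])?|bug(s|gy)?|defect|fault|patch|error|repair)\b",
    re.IGNORECASE,
)

def is_bug_fix(commit_message: str) -> bool:
    """Flag a commit as likely corrective if its message mentions bug-fix keywords."""
    return bool(BUG_FIX_PATTERN.search(commit_message))

commits = [
    "Fix null pointer dereference in parser",
    "Add dark-mode theme to settings page",
    "patch: guard against buffer overflow in decoder",
    "Refactor logging module",
]
corrective = [m for m in commits if is_bug_fix(m)]  # keeps the 1st and 3rd messages
```

Real studies pair such message-level signals with the commit diff precisely because, as the mapping notes, messages alone often under-describe the change.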

If this is right

  • Commit messages will remain central to bug-related maintenance tasks when combined with diffs and automated analysis.
  • Developers should prioritize writing more complete messages to reduce future maintenance effort.
  • Research attention should expand beyond bug analysis to areas such as automated program repair.
  • Repository mining paired with NLP and machine learning will continue as the dominant approach for studying these messages.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Tools that automatically suggest or complete commit messages could directly address the documented gaps in information quality.
  • Standard guidelines for commit message content might improve consistency across open-source and industrial projects.
  • Integrating commit message analysis with other artifacts like test results could strengthen corrective maintenance workflows.
  • Training programs for new developers could emphasize the long-term value of detailed change descriptions.
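
The guidelines extension above already has a concrete seed in the Conventional Commits convention that appears in the paper's reference list. A loose check of a message subject against that shape could look like the sketch below; the regex is a simplification of the v1.0.0 grammar, not the full specification.

```python
import re

# Loose approximation of the Conventional Commits v1.0.0 subject shape,
# "type(scope)!: description"; the type list is the convention's
# suggested set and the regex is an assumption of this sketch.
CC_SUBJECT = re.compile(
    r"^(feat|fix|build|chore|ci|docs|style|refactor|perf|test)"
    r"(\([\w\-]+\))?!?: \S.+"
)

def follows_convention(message: str) -> bool:
    """Check only the first line (the subject) of a commit message."""
    subject = message.splitlines()[0] if message else ""
    return CC_SUBJECT.match(subject) is not None

assert follows_convention("fix(parser): handle empty commit diff")
assert follows_convention("feat!: drop support for legacy VCS hooks")
assert not follows_convention("fixed stuff")
```

A check like this enforces only the structure of a message, not the informativeness that the surveyed studies find lacking, which is why the two ideas (format guidelines and message-completion tooling) are complementary.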

Load-bearing premise

The search strategy and inclusion criteria used to identify the 97 primary sources produced a representative and unbiased sample of all relevant literature published up to May 2025.

What would settle it

Discovery of a large body of additional peer-reviewed studies on commit messages in corrective maintenance that show markedly different usage patterns, methodologies, or quality assessments would contradict the mapped landscape.

Figures

Figures reproduced from arXiv: 2604.16404 by Stefano Zacchiroli, Syful Islam.

Figure 1. Protocol used for the selection of primary sources.
Figure 2. Query template applied on the selected digital libraries.
Figure 3. Year-wise publication (rolling 2-year) trend.
Figure 4. Publication venue types.
Figure 5. Publication venues.
Figure 7. Frequency and co-occurrence of software artifacts used to support corrective software maintenance activities.
Figure 8. Methodology types and their usage patterns.
Figure 9. Stakeholder types and their mention patterns.
read the original abstract

Corrective maintenance is crucial to ensure the quality of software, thereby improving reliability and user experience. In a version control system (VCS), developers write commit messages to document their changes and support later maintenance. Still, to this day, no secondary study has mapped the research landscape of how commit messages have been used in corrective software maintenance. We present a systematic mapping study of 97 primary sources published between 2004 and May 2025, where we examine the goals, potential utilization of source code artifacts along with commit messages, methodologies, stakeholders, and the key findings about their influence on corrective maintenance. Our analysis reveals a growing interest in the usage of commit messages to perform corrective maintenance tasks, in particular for bug analysis and bug fix identification goals. Surprisingly few studies address other themes such as automated program repair, security development practices, etc. We find that the software artifacts most used in combination with commit messages are commit "diffs" and that repository mining, together with natural language processing (NLP) and artificial intelligence/machine learning (AI/ML) are the methodological foundations of studies in this field. Among stakeholders considered in previous studies, developers play the most important role in shaping corrective maintenance practices. Key findings in previous studies about commit messages establish their significant role in corrective maintenance, due to the fact that they carry crucial information helpful for stakeholders to understand and improve the code base through the software evolution process. Often, though, commit messages lack important information and are not enough to convey the intent of code changes to future readers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents a systematic mapping study of 97 primary sources (2004–May 2025) examining the use of commit messages in corrective software maintenance. It synthesizes trends in research goals (primarily bug analysis and bug-fix identification), combined artifacts (especially commit diffs), methodologies (repository mining plus NLP/AI/ML), stakeholders (developers dominant), and key findings that commit messages supply crucial information for understanding code evolution yet frequently lack sufficient detail to convey change intent.

Significance. If the 97-source sample proves representative, the mapping supplies a useful baseline for the field by documenting growth in the topic, identifying under-explored areas such as automated program repair and security practices, and highlighting the methodological reliance on repository mining and machine-learning techniques. Such an overview can usefully orient future empirical work on improving commit-message quality.

major comments (2)
  1. [Methodology] Search and selection process (methodology section): the abstract and high-level description provide no explicit search strings, database list, or inclusion/exclusion criteria, leaving the representativeness of the 97-paper sample unverified and directly undermining the synthesized claims about dominant goals, “surprisingly few studies,” and overall trends.
  2. [Results / Key Findings] Key findings synthesis: the assertion that commit messages “often lack important information” is presented as a central result yet lacks any quantitative breakdown (e.g., fraction of the 97 sources supporting the claim or coding scheme used), making it impossible to gauge the strength or consistency of the evidence.
minor comments (2)
  1. [Abstract] Abstract: the cutoff date “May 2025” appears forward-looking; confirm whether this is a typographical error for 2024 or an intended future date.
  2. [Abstract / Results] Terminology: the phrase “source code artifacts along with commit messages” is used repeatedly without an explicit enumeration of the artifact categories beyond “commit diffs”; a table or bullet list would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments on our systematic mapping study. We address each major comment below and outline the revisions we will make to improve the manuscript.

read point-by-point responses
  1. Referee: [Methodology] Search and selection process (methodology section): the abstract and high-level description provide no explicit search strings, database list, or inclusion/exclusion criteria, leaving the representativeness of the 97-paper sample unverified and directly undermining the synthesized claims about dominant goals, “surprisingly few studies,” and overall trends.

    Authors: We agree that the abstract and high-level description do not explicitly list the search strings, databases, and inclusion/exclusion criteria. These are detailed in the Methodology section of the paper. To enhance accessibility and allow immediate verification of the sample's representativeness, we will revise the abstract and introduction to include a brief overview of the search strategy, the databases queried, and the key inclusion/exclusion criteria. This addition will strengthen the transparency without altering the core methodology. revision: yes

  2. Referee: [Results / Key Findings] Key findings synthesis: the assertion that commit messages “often lack important information” is presented as a central result yet lacks any quantitative breakdown (e.g., fraction of the 97 sources supporting the claim or coding scheme used), making it impossible to gauge the strength or consistency of the evidence.

    Authors: The observation that commit messages often lack important information is synthesized from the key findings reported in the primary studies. While the current version presents this qualitatively, we recognize the value of providing quantitative support. In the revised manuscript, we will include a quantitative breakdown in the Results section, specifying the number or fraction of the 97 sources that contribute to this finding, along with a description of the coding scheme employed to classify the key findings. This will allow readers to better assess the robustness of the claim. revision: yes

Circularity Check

0 steps flagged

No circularity: synthesis from external primary sources

full rationale

This systematic mapping study derives all claims by aggregating and classifying findings across 97 externally identified primary sources. No equations, fitted parameters, predictions, or self-referential derivations appear; the reported trends in goals, artifacts, methodologies, and stakeholder roles are direct syntheses from the reviewed literature. The search and inclusion process is a methodological input rather than a derivation that reduces to the paper's own outputs, and no load-bearing self-citations or uniqueness theorems from the authors' prior work are invoked.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the standard assumption that a systematic literature search yields a representative sample; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption The search strategy and inclusion criteria capture a representative sample of relevant literature
    Standard premise in systematic mapping studies to justify generalizability of synthesized findings.

pith-pipeline@v0.9.0 · 5576 in / 1259 out tokens · 51219 ms · 2026-05-13T23:31:21.336924+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

124 extracted references · 124 canonical work pages

  1. [1]

    [n. d.]. Conventional Commits. https://www.conventionalcommits.org/en/v1.0.0/. Last Accessed: December 10, 2025

  2. [2]

    Maria Alaranta and Stefanie Betz. 2012. Knowledge Problems in Corrective Software Maintenance–A Case Study. In 2012 45th Hawaii International Conference on System Sciences. IEEE, 3746–3755

  3. [3]

    Stefano Balla, Thomas Degueule, Romain Robbes, Jean-Rémy Falleri, and Stefano Zacchiroli. 2025. Automatic Classification of Software Repositories: a Systematic Mapping Study. In International Conference on Evaluation and Assessment in Software Engineering (EASE 2025)

  4. [4]

    Shariq Aziz Butt, Acosta-Coll Melisa, and Sanjay Misra. 2022. Software product maintenance: A case study. In International Conference on Computer Information Systems and Industrial Management. Springer, 81–92

  5. [5]

    Gerardo Canfora and Aniello Cimitile. 2001. Software maintenance. In Handbook of Software Engineering and Knowledge Engineering: Volume I: Fundamentals. World Scientific, 91–120

  6. [6]

    Syful Islam and Stefano Zacchiroli. 2026. Reproducibility package for: On the Use of Commit Messages for Corrective Software Maintenance: A Systematic Mapping Study. https://doi.org/10.5281/zenodo.18324248. Last Accessed: March 16, 2026

  7. [7]

    Barbara Kitchenham, O Pearl Brereton, David Budgen, Mark Turner, John Bailey, and Stephen Linkman. 2009. Systematic literature reviews in software engineering–a systematic literature review. Information and Software Technology 51, 1 (2009), 7–15

  8. [8]

    Jiawei Li and Iftekhar Ahmed. 2023. Commit message matters: Investigating impact and evolution of commit message quality. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). IEEE, 806–817

  9. [9]

    James Martin and Carma L McClure. 1983. Software Maintenance: The Problems and Its Solutions. Prentice Hall Professional Technical Reference

  10. [10]

    Mohammad Mehdi Morovati, Amin Nikanjam, Florian Tambon, Foutse Khomh, and Zhen Ming Jiang. 2024. Bug characterization in machine learning-based systems. Empirical Software Engineering 29, 1 (2024), 14

  11. [11]

    Cliodhna O’Connor and Helene Joffe. 2020. Intercoder reliability in qualitative research: Debates and practical guidelines. International Journal of Qualitative Methods 19 (2020), 1609406919899220

  12. [12]

    Kai Petersen, Robert Feldt, Shahid Mujtaba, and Michael Mattsson. 2008. Systematic mapping studies in software engineering. In 12th International Conference on Evaluation and Assessment in Software Engineering (EASE). BCS Learning & Development

  13. [13]

    Kai Petersen, Sairam Vakkalanka, and Ludwik Kuzniarz. 2015. Guidelines for conducting systematic mapping studies in software engineering: An update. Information and Software Technology 64 (2015), 1–18

  14. [14]

    E Burton Swanson. 1976. The dimensions of maintenance. In Proceedings of the 2nd International Conference on Software Engineering. 492–497

  15. [15]

    Yingchen Tian, Yuxia Zhang, Klaas-Jan Stol, Lin Jiang, and Hui Liu. 2022. What makes a good commit message?. In Proceedings of the 44th International Conference on Software Engineering. 2389–2401

  16. [16]

    A Marie Vans, Anneliese von Mayrhauser, and Gabriel Somlo. 1999. Program understanding behavior during corrective maintenance of large-scale software. International Journal of Human-Computer Studies 51, 1 (1999), 31–70

  17. [17]

    Song Wang and Nachiappan Nagappan. 2021. Characterizing and understanding software developer networks in security development. In 2021 IEEE 32nd International Symposium on Software Reliability Engineering (ISSRE). IEEE, 534–545

  18. [18]

    Claes Wohlin, Per Runeson, Martin Höst, Magnus C Ohlsson, Björn Regnell, Anders Wesslén, et al. 2012. Experimentation in Software Engineering. Vol. 236. Springer

  19. [19]

    Pengyu Xue, Linhao Wu, Zhongxing Yu, Zhi Jin, Zhen Yang, Xinyi Li, Zhenyu Yang, and Yue Tan. 2024. Automated commit message generation with large language models: An empirical study and beyond. IEEE Transactions on Software Engineering (2024)

  20. [20]

    Thomas Zimmermann. 2016. Card-sorting: From text to themes. In Perspectives on Data Science for Software Engineering. Elsevier, 137–141

Primary sources

  21. [21]

    Manar Abu Talib, Ali Bou Nassif, Mohammad Azzeh, Yaser Alesh, and Yaman Afadar. 2024. Parameter-efficient fine-tuning of pre-trained code models for just-in-time defect prediction. Neural Computing and Applications 36, 27 (2024), 16911–16940

  22. [22]

    Toufique Ahmed and Premkumar Devanbu. 2023. Better patching using llm prompting, via self-consistency. In 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 1742–1746

  23. [23]

    Bader Alkhazi, Andrew DiStasi, Wajdi Aljedaani, Hussein Alrubaye, Xin Ye, and Mohamed Wiem Mkaouer. 2020. Learning to rank developers for bug report assignment. Applied Soft Computing 95 (2020), 106667

  24. [24]

    Idan Amit and Dror G Feitelson. 2021. Corrective commit probability: a measure of the effort invested in bug fixing. Software Quality Journal 29, 4 (2021), 817–861

  25. [25]

    Gábor Antal, Márton Keleti, and Péter Hegedűs. 2020. Exploring the security awareness of the python and javascript open source communities. In Proceedings of the 17th International Conference on Mining Software Repositories. 16–20

  26. [26]

    Farhan Rahman Arnob, Rubel Hassan Mollik, Pooja Goyal, and Renee Bryce. Bug Triaging Based on Transformer Models Utilizing Commit Messages. In International Conference on Information Technology-New Generations. Springer, 354–366

  28. [28]

    Adrian Bachmann and Abraham Bernstein. 2010. When process data quality affects the number of bugs: Correlations in software engineering datasets. In 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010). IEEE, 62–71

  29. [29]

    Jiaqi Bai, Long Zhou, Ambrosio Blanco, Shujie Liu, Furu Wei, Ming Zhou, and Zhoujun Li. 2021. Jointly Learning to Repair Code and Generate Commit Message. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 9784–9795

  30. [30]

    Lingfeng Bao, Xin Xia, Ahmed E Hassan, and Xiaohu Yang. 2022. V-SZZ: automatic identification of version ranges affected by CVE vulnerabilities. In Proceedings of the 44th International Conference on Software Engineering. 2352–2364

  31. [31]

    Jacob G Barnett, Charles K Gathuru, Luke S Soldano, and Shane McIntosh. 2016. The relationship between commit message detail and defect proneness in java projects on github. In Proceedings of the 13th International Conference on Mining Software Repositories. 496–499

  32. [32]

    Pamela Bhattacharya, Marios Iliofotou, Iulian Neamtiu, and Michalis Faloutsos. 2012. Graph-based analysis and prediction for software evolution. In 2012 34th International Conference on Software Engineering (ICSE). IEEE, 419–429

  34. [34]

    Christian Bird, Adrian Bachmann, Eirik Aune, John Duffy, Abraham Bernstein, Vladimir Filkov, and Premkumar Devanbu. 2009. Fair and balanced? bias in bug-fix datasets. In Proceedings of the 7th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering

  35. [35]

    Peter Bludau and Alexander Pretschner. 2022. PR-SZZ: How pull requests can support the tracing of defects in software repositories. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 1–12

  36. [36]

    Maria Caulo and Giuseppe Scanniello. 2019. On the Use of Commit Messages to Support the Creation of Datasets for Fault Prediction: an Empirical Assessment. In 2019 45th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). IEEE, 193–198

  37. [37]

    Tse-Hsun Chen, Meiyappan Nagappan, Emad Shihab, and Ahmed E Hassan. 2014. An empirical study of dormant bugs. In Proceedings of the 11th Working Conference on Mining Software Repositories. 82–91

  39. [39]

    Xiang Chen, Hongling Xia, Wenlong Pei, Chao Ni, and Ke Liu. 2023. Boosting multi-objective just-in-time software defect prediction by fusing expert metrics and semantic metrics. Journal of Systems and Software 206 (2023), 111853

  40. [40]

    Yang Chen, Andrew E Santosa, Ang Ming Yi, Abhishek Sharma, Asankhaya Sharma, and David Lo. 2020. A machine learning approach for vulnerability curation. In Proceedings of the 17th International Conference on Mining Software Repositories. 32–42

  41. [41]

    Jiwon Choi, Taeyoung Kim, Duksan Ryu, Jongmoon Baik, and Suntae Kim. 2023. Just-in-time defect prediction for self-driving software via a deep learning model. Journal of Web Engineering 22, 2 (2023), 303–326

  43. [43]

    Filipe Roseiro Cogo, Gustavo A Oliva, and Ahmed E Hassan. 2019. An empirical study of dependency downgrades in the npm ecosystem. IEEE Transactions on Software Engineering 47, 11 (2019), 2457–2470

  44. [44]

    Beyza Eken, Rifat Atar, Sahra Sertalp, and Ayşe Tosun. 2019. Predicting defects with latent and semantic features from commit logs in an industrial setting. In 2019 34th IEEE/ACM International Conference on Automated Software Engineering Workshop (ASEW). IEEE, 98–105

  45. [45]

    Jon Eyolfson, Lin Tan, and Patrick Lam. 2011. Do time of day and developer experience affect commit bugginess?. In Proceedings of the 8th Working Conference on Mining Software Repositories. 153–162

  46. [46]

    Nitzan Farhi, Noam Koenigstein, and Yuval Shavitt. 2025. PatchView: Multi-modality detection of security patches. Computers & Security 151 (2025), 104356

  47. [47]

    Kelsey R Fulton, Daniel Votipka, Desiree Abrokwa, Michelle L Mazurek, Michael Hicks, and James Parker. 2022. Understanding the how and the why: Exploring secure development practices through a course competition. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security. 1141–1155

  48. [48]

    Amiao Gao, Zenong Zhang, Simin Wang, Liguo Huang, Shiyi Wei, and Vincent Ng. 2025. Teaching AI the Why and How of Software Vulnerability Fixes. Proceedings of the ACM on Software Engineering 2, FSE (2025), 2006–2029

  49. [49]

    Yuxiang Guo, Xiaopeng Gao, Zhenyu Zhang, Wing Kwong Chan, and Bo Jiang. 2023. A study on the impact of pre-trained model on Just-In-Time defect prediction. In 2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security (QRS). IEEE, 105–116

  51. [51]

    Zhaoqiang Guo, Shiran Liu, Xutong Liu, Wei Lai, Mingliang Ma, Xu Zhang, Chao Ni, Yibiao Yang, Yanhui Li, Lin Chen, et al. 2023. Code-line-level bugginess identification: How far have we come, and how far have we yet to go? ACM Transactions on Software Engineering and Methodology 32, 4 (2023), 1–55

  52. [52]

    Kim Herzig, Sascha Just, Andreas Rau, and Andreas Zeller. 2013. Predicting defects using change genealogies. In 2013 IEEE 24th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 118–127

  53. [53]

    Kim Herzig, Sascha Just, and Andreas Zeller. 2013. It’s not a bug, it’s a feature: how misclassification impacts bug prediction. In 2013 35th International Conference on Software Engineering (ICSE). IEEE, 392–401

  54. [54]

    Kim Herzig, Sascha Just, and Andreas Zeller. 2016. The impact of tangled code changes on defect prediction models. Empirical Software Engineering 21, 2 (2016), 303–336

  55. [55]

    Thong Hoang, Hoa Khanh Dam, Yasutaka Kamei, David Lo, and Naoyasu Ubayashi. 2019. Deepjit: an end-to-end deep learning framework for just-in-time defect prediction. In 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR). IEEE, 34–45

  56. [56]

    Thong Hoang, Hong Jin Kang, David Lo, and Julia Lawall. 2020. Cc2vec: Distributed representations of code changes. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering. 518–529

  57. [57]

    Thong Hoang, Julia Lawall, Yuan Tian, Richard J Oentaryo, and David Lo. 2019. Patchnet: Hierarchical deep learning-based stable patch identification for the linux kernel. IEEE Transactions on Software Engineering 47, 11 (2019), 2471–2486

  58. [58]

    Md Kamal Hossen, Huzefa Kagdi, and Denys Poshyvanyk. 2014. Amalgamating source code authors, maintainers, and change proneness to triage change requests. In Proceedings of the 22nd International Conference on Program Comprehension. 130–141

  59. [59]

    Syed Fatiul Huq, Ali Zafar Sadiq, and Kazi Sakib. 2019. Understanding the effect of developer sentiment on fix-inducing changes: An exploratory study on github pull requests. In 2019 26th Asia-Pacific Software Engineering Conference (APSEC). IEEE, 514–521

  60. [60]

    Syed Fatiul Huq, Ali Zafar Sadiq, and Kazi Sakib. 2020. Is developer sentiment related to software bugs: An exploratory study on github commits. In 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 527–531

  61. [61]

    Tian Jiang, Lin Tan, and Sunghun Kim. 2013. Personalized defect prediction. In 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 279–289

  62. [62]

    Yuze Jiang, Beijun Shen, and Xiaodong Gu. 2025. Just-in-time software defect prediction via bi-modal change representation learning. Journal of Systems and Software 219 (2025), 112253

  63. [63]

    Matthieu Jimenez, Yves Le Traon, and Mike Papadakis. 2018. Enabling the continuous analysis of security vulnerabilities with vuldata7. In 18th IEEE International Working Conference on Source Code Analysis and Manipulation

  64. [64]

    Matthieu Jimenez, Renaud Rwemalika, Mike Papadakis, Federica Sarro, Yves Le Traon, and Mark Harman. 2019. The importance of accounting for real-world labelling when predicting software vulnerabilities. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 695–705

  65. [65]

    Sunghun Kim, E James Whitehead, and Yi Zhang. 2008. Classifying software changes: Clean or buggy? IEEE Transactions on Software Engineering 34, 2 (2008), 181–196

  66. [66]

    Frank Li and Vern Paxson. 2017. A large-scale empirical study of security patches. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2201–2215

  67. [67]

    Jingyue Li and Michael D Ernst. 2012. CBCD: Cloned buggy code detector. In 2012 34th International Conference on Software Engineering (ICSE). IEEE, 310–320

  68. [68]

    Jingyu Liu, Jun Ai, Minyan Lu, Jie Wang, and Haoxiang Shi. 2023. Semantic feature learning for software defect prediction from source code and external knowledge. Journal of Systems and Software 204 (2023), 111753

  69. [69]

    Rongkai Liu, Heyuan Shi, Shuning Liu, Chao Hu, Sisheng Li, Yuheng Shen, Runzhe Wang, Xiaohai Shi, and Yu Jiang. 2025. PatchScope: LLM-Enhanced Fine-Grained Stable Patch Classification for Linux Kernel. Proceedings of the ACM on Software Engineering 2, ISSTA (2025), 1513–1535

  70. [70]

    Rongkai Liu, Heyuan Shi, Yongchao Zhang, Runzhe Wang, Yuheng Shen, Yuao Chen, Jing Luo, Xiaohai Shi, Chao Hu, and Yu Jiang. 2024. PatchBert: Continuous stable patch identification for Linux kernel via pre-trained model fine-tuning. In 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 349–358

  71. [71]

    Thibaud Lutellier, Hung Viet Pham, Lawrence Pang, Yitong Li, Moshi Wei, and Lin Tan. 2020. Coconut: combining context-aware neural translation models using ensemble for program repair. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 101–114

  72. [72]

    Maryam Marzban, Zahra Khoshmanesh, and Ashkan Sami. 2011. Cohesion between size of commit and type of commit. In Computer Science and Convergence: CSA 2011 & WCC 2011 Proceedings. Springer, 231–239

  73. [73]

    Moritz Mock, Thomas Forrer, and Barbara Russo. 2024. Where do developers admit their security-related concerns?. In International Conference on Agile Software Development. Springer, 189–195

  74. [74]

    Patrick Morrison, Tosin Daniel Oyetoyan, and Laurie Williams. 2018. Identifying security issues in software development: are keywords enough?. In Proceedings of the 40th International Conference on Software Engineering: Companion Proceedings. 426–427

  75. [75]

    Alessandro Murgia, Giulio Concas, Michele Marchesi, and Roberto Tonelli. 2010. A machine learning approach for text categorization of fixing-issue commits on CVS. In Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement. 1–10

  76. [76]

    K Muthukumaran, Abhinav Choudhary, and NL Bhanu Murthy. 2015. Mining GitHub for novel change metrics to predict buggy files in software systems. In 2015 International Conference on Computational Intelligence and Networks. IEEE, 15–20

  77. [77]

    Truong Giang Nguyen, Thanh Le-Cong, Hong Jin Kang, Xuan-Bach D Le, and David Lo. 2022. Vulcurator: a vulnerability-fixing commit detector. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 1726–1730

  78. [78]

    Giang Nguyen-Truong, Hong Jin Kang, David Lo, Abhishek Sharma, Andrew E Santosa, Asankhaya Sharma, and Ming Yi Ang. 2022. Hermes: Using commit-issue linking to detect vulnerability-fixing commits. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, 51–62

  79. [79]

    Chao Ni, Wei Wang, Kaiwen Yang, Xin Xia, Kui Liu, and David Lo. 2022. The best of both worlds: integrating semantic features with expert features for defect prediction and localization. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 672–683

  80. [80]

    Henning Perl, Sergej Dechand, Matthew Smith, Daniel Arp, Fabian Yamaguchi, Konrad Rieck, Sascha Fahl, and Yasemin Acar. 2015. Vccfinder: Finding potential vulnerabilities in open-source projects to assist code audits. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. 426–437

EASE 2026, 9–12 June, 2026, Glasgow, Scot...

Showing first 80 references.