Beyond the Tip of the Iceberg: Understanding SATD in Dockerfiles through the Lens of Co-evolution

Biniam Fesseha Demissie; David Lo; Jiakun Liu; Lwin Khin Shar; Mariano Ceccato; Rui'ang Hu; Wei Minn; Yan Naing Tun

arxiv: 2605.21238 · v1 · pith:TKSL6OSQnew · submitted 2026-05-20 · 💻 cs.SE

Wei Minn, Yan Naing Tun, Biniam Fesseha Demissie, Rui'ang Hu, Jiakun Liu, Mariano Ceccato, Lwin Khin Shar, David Lo

Pith reviewed 2026-05-21 03:24 UTC · model grok-4.3

classification 💻 cs.SE

keywords: self-admitted technical debt · SATD · Dockerfiles · co-evolution · infrastructure as code · technical debt repayment · software evolution · IaC

The pith

Dockerfile self-admitted technical debt often couples with changes in other source files.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper studies self-admitted technical debt in Dockerfiles by tracking how these files change alongside the rest of a project's source code instead of examining them alone. It shows that 27 percent of SATD admission events and 40 percent of repayment events link to modifications in non-Dockerfile artifacts, and that the nature of these links depends on the type of debt involved. Coupled SATD tends to be repaid more quickly overall, yet debt about missing functionalities lasts longer when coupled. Qualitative analysis of the events identifies external dependency problems as the leading trigger for new debt admissions and architectural refactoring as the main step before repayment occurs. The findings support treating co-evolution across files as the main way to understand and manage this form of technical debt in infrastructure code.

Core claim

Approximately 27% of admission events and 40% of repayment events are coupled to non-Dockerfile artifacts, with coupling sources varying by subtype. Coupled SATD is repaid significantly faster in general, although coupled SATD about missing functionalities persists longer than isolated cases. External dependency issues, particularly unreleased upstream packages and bug fixes, are the most common admission triggers, while architectural refactoring is the most common prerequisite for repayment. These patterns indicate that co-evolution with source code should become the primary unit of analysis for SATD in Dockerfiles.

What carries the argument

Coupled SATD events identified through commit history that link Dockerfile changes to non-Dockerfile artifacts, together with open and axial coding to classify their causes and prerequisites.
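The coupling rule described here can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: it assumes each commit is already available as a list of changed file paths, and the Dockerfile naming heuristic in `is_dockerfile` is an assumption made for the example. The function names are hypothetical.

```python
from pathlib import PurePosixPath


def is_dockerfile(path: str) -> bool:
    """Heuristic (assumed for this sketch): match common Dockerfile names."""
    name = PurePosixPath(path).name.lower()
    return (
        name == "dockerfile"
        or name.startswith("dockerfile.")
        or name.endswith(".dockerfile")
    )


def is_coupled(changed_paths: list[str]) -> bool:
    """A commit is 'coupled' if it changes a Dockerfile and at least one other file."""
    has_dockerfile = any(is_dockerfile(p) for p in changed_paths)
    has_other = any(not is_dockerfile(p) for p in changed_paths)
    return has_dockerfile and has_other


def coupling_rate(commits: list[list[str]]) -> float:
    """Share of Dockerfile-touching commits that also touch other artifacts."""
    relevant = [c for c in commits if any(is_dockerfile(p) for p in c)]
    if not relevant:
        return 0.0
    return sum(is_coupled(c) for c in relevant) / len(relevant)


if __name__ == "__main__":
    # Hypothetical commits, each given as its list of changed paths.
    history = [
        ["Dockerfile"],
        ["Dockerfile", "src/app.py"],
        ["README.md"],
    ]
    print(coupling_rate(history))
```

Because the rule keys on co-occurrence within a commit, it counts incidental co-changes as coupled too, which is the concern the referee report raises.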

If this is right

Developers and project managers should examine source code changes when addressing SATD in Dockerfiles rather than treating the files in isolation.
SATD researchers should shift from single-file analysis to co-evolution as the main unit of study for infrastructure-as-code artifacts.
External dependency issues, especially unreleased packages and bug fixes, commonly trigger new SATD admissions in Dockerfiles.
Architectural refactoring in the broader codebase frequently precedes repayment of SATD in Dockerfiles.
Repayment speed differs between coupled and isolated SATD and also varies across specific debt subtypes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The co-evolution approach used here could be applied to other infrastructure-as-code files such as Kubernetes manifests or Terraform configurations.
Commit-monitoring tools might automatically surface candidate coupled SATD events for developer attention.
Project teams could use repayment speed differences to prioritize which technical debt items to address first.
The patterns may differ in closed-source projects or in ecosystems that use different container or build technologies.

Load-bearing premise

Commit history and file change records accurately reflect genuine co-evolution relationships, and the qualitative coding of events reliably identifies causes and prerequisites without major researcher bias.

What would settle it

A replication study that manually inspects a fresh sample of commits to verify whether the reported coupling percentages, subtype differences, and repayment time gaps still appear when independent raters classify the same events.

read the original abstract

Dockerfiles enable the creation of portable container-based execution environments for the application code, and have become an important part of the modern software development process. As Dockerfiles are a form of Infrastructure-as-Code (IaC), they can include temporary workarounds and other suboptimal implementations, leading to the accrual of technical debt that affects their reliability, security, and maintainability in the future. Prior work characterized self-admitted technical debt (SATD) in Dockerfile comments and the surrounding file chunks. This single-file view is incomplete since source code evolution involves changes across different types of software artifacts such as production, test, build, and other configuration files. Thus, we address this gap by studying SATD events in Dockerfiles alongside the related source code. We find that approximately 27% of admission events and 40% of repayment events are coupled to non-Dockerfile artifacts, and coupling sources are subtype-specific. We also observe that coupled SATD is, in general, repaid significantly faster (p = 0.0201), while coupled SATD regarding missing functionalities persists longer than its isolated counterparts. Lastly, we conducted open and axial coding of coupled SATD events, and we observe that external dependency issues, particularly unreleased upstream packages and bug fixes, are the most common admission triggers in the source code; we also observe that architectural refactoring is the most common prerequisite for the repayment of SATD in Dockerfiles. These findings indicate that both practitioners (e.g., developers and project managers) and SATD researchers should adopt source code-side co-evolution, rather than the single-file view, as the primary unit of analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript presents an empirical study of self-admitted technical debt (SATD) in Dockerfiles, analyzing its co-evolution with non-Dockerfile artifacts through commit history. It reports that ~27% of SATD admission events and ~40% of repayment events are coupled to changes in other artifacts, with coupling sources varying by SATD subtype. Coupled SATD is repaid significantly faster overall (p=0.0201), although coupled SATD on missing functionalities persists longer than isolated cases. Open and axial coding of coupled events identifies external dependency issues (particularly unreleased upstream packages and bug fixes) as the most common admission triggers and architectural refactoring as the most common repayment prerequisite. The authors conclude that SATD analysis should treat source-code co-evolution as the primary unit rather than the single-file view.

Significance. If the coupling classification and qualitative coding hold, the work is significant for technical-debt research in Infrastructure-as-Code. It supplies concrete coupling percentages, a statistically supported repayment-time difference, and subtype-specific qualitative patterns that together challenge single-artifact SATD studies. The mixed-methods design and falsifiable quantitative claims (percentages and p-value) are strengths that could influence both practitioner guidelines and future multi-artifact empirical work.

major comments (2)

[Section 3] Section 3 (Research Design / Coupling Identification): classifying an SATD event as 'coupled' solely because its commit also touches at least one non-Dockerfile artifact risks conflating incidental co-changes (large refactors, license updates, CI modifications) with substantive co-evolution. This operationalization directly supports the headline 27% / 40% figures and the p=0.0201 repayment-time comparison; without a validation step (e.g., manual inspection of a random sample of coupled commits to confirm dependency or causal linkage), both the subtype-specific source distributions and the faster-repayment result remain vulnerable to commit-granularity artifacts.
[Section 5] Section 5 (Qualitative Analysis): the open and axial coding that concludes 'external dependency issues' are the dominant admission trigger and 'architectural refactoring' the dominant repayment prerequisite lacks reported inter-rater reliability metrics, number of coders, or disagreement-resolution protocol. Because these coded categories are used to interpret the quantitative coupling results, the absence of reliability evidence weakens the causal claims derived from the qualitative step.

minor comments (3)

[Abstract] Abstract and §4: the exact statistical test underlying p=0.0201 (Mann-Whitney, log-rank, etc.) and any multiple-comparison correction should be stated explicitly.
[Section 3] §3: dataset summary statistics (number of projects, total Dockerfiles, total SATD events, commit window) are referenced but not tabulated; a concise table would improve reproducibility.
[Figures] Figure captions and axis labels should explicitly indicate whether 'coupled' refers to any non-Dockerfile change or only to changes in specific artifact types.
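To make the first minor comment concrete, the following sketch shows what one candidate test, the Mann-Whitney U test, computes when comparing repayment times between coupled and isolated SATD. Whether the paper used this test is not stated here; the choice is an assumption for illustration, as are the function name and the durations in the usage example. The p-value uses the normal approximation with a tie correction and no continuity correction, so it is rough for small samples.

```python
import math
from collections import Counter


def mann_whitney_u(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (U statistic for x, two-sided p-value via normal approximation)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")

    # Pool the samples, remembering group membership (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign average ranks to tied values.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j

    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    # Variance of U under the null hypothesis, corrected for ties.
    n = n1 + n2
    tie_term = sum(t**3 - t for t in Counter(v for v, _ in pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference

    z = (u_x - n1 * n2 / 2) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


if __name__ == "__main__":
    # Hypothetical repayment times in days, not taken from the paper.
    coupled_days = [3, 10, 12, 20, 35]
    isolated_days = [15, 40, 60, 90, 120]
    print(mann_whitney_u(coupled_days, isolated_days))
```

If repayment times are censored (debt still open at the end of the observation window), a survival-analysis test such as the log-rank test would be the more natural candidate, which is why the report asks the authors to name the test explicitly.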

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We address each major comment point by point below, providing clarifications on our methodological choices while indicating where we will strengthen the presentation in revision.

Referee: [Section 3] Section 3 (Research Design / Coupling Identification): classifying an SATD event as 'coupled' solely because its commit also touches at least one non-Dockerfile artifact risks conflating incidental co-changes (large refactors, license updates, CI modifications) with substantive co-evolution. This operationalization directly supports the headline 27% / 40% figures and the p=0.0201 repayment-time comparison; without a validation step (e.g., manual inspection of a random sample of coupled commits to confirm dependency or causal linkage), both the subtype-specific source distributions and the faster-repayment result remain vulnerable to commit-granularity artifacts.

Authors: We agree that commit-level co-changes can include incidental modifications and that this is a known limitation of commit-granularity analyses in software evolution research. Our operationalization follows the standard practice of treating the commit as the atomic unit of developer activity, where any non-Dockerfile change in the same commit is considered coupled by definition. The headline percentages and the statistically significant repayment-time difference (p=0.0201) are therefore based on this observable co-occurrence rather than inferred causality. To mitigate concerns about noise, our qualitative coding was performed exclusively on the coupled events and surfaced consistent patterns (external dependencies as triggers, architectural refactoring as repayment prerequisite) that align with the quantitative results. We will add an explicit discussion of commit-granularity limitations and report results from a manual inspection of a random sample of coupled commits in the revised manuscript.

revision: partial

Referee: [Section 5] Section 5 (Qualitative Analysis): the open and axial coding that concludes 'external dependency issues' are the dominant admission trigger and 'architectural refactoring' the dominant repayment prerequisite lacks reported inter-rater reliability metrics, number of coders, or disagreement-resolution protocol. Because these coded categories are used to interpret the quantitative coupling results, the absence of reliability evidence weakens the causal claims derived from the qualitative step.

Authors: The open and axial coding was performed by the first author on the set of coupled SATD events, with iterative discussions among all co-authors to resolve disagreements, refine category definitions, and reach consensus on the final themes. We did not compute formal inter-rater reliability statistics because the process was not designed as independent parallel coding by multiple raters. We will revise Section 5 to explicitly state the number of coders, describe the disagreement-resolution protocol, and acknowledge the absence of IRR metrics as a limitation of the qualitative component.

revision: yes
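For readers unfamiliar with the inter-rater reliability metric at issue in this exchange, a minimal sketch of Cohen's kappa for two raters follows. It illustrates what such a statistic would measure if two coders labelled the same events independently. The function name and the category labels in the usage example are hypothetical and are not drawn from the paper's coding scheme.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected agreement if each rater labelled at random
    # according to their own label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[l] * counts_b[l] for l in labels) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical labels assigned by two coders to the same five events.
    coder_1 = ["dependency", "refactoring", "dependency", "other", "dependency"]
    coder_2 = ["dependency", "refactoring", "other", "other", "dependency"]
    print(round(cohens_kappa(coder_1, coder_2), 3))
```

Kappa only applies when the raters code independently, which is why the consensus-based process the authors describe cannot produce it after the fact.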

Circularity Check

0 steps flagged

Empirical observations from commit data and qualitative coding exhibit no circular derivation

The paper's core results (27% admission coupling, 40% repayment coupling, faster repayment with p=0.0201, and coded causes/prerequisites) are produced by direct processing of repository commit histories, file-change detection, and open/axial coding of events. No equations, fitted parameters, or predictions are defined in terms of themselves; the classification of 'coupled' events is an operational definition applied to observable data rather than a self-referential loop. Prior SATD work is cited only for background and does not supply load-bearing uniqueness theorems or ansatzes that the current claims reduce to. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

This empirical study relies on observational data from software repositories and qualitative analysis rather than introducing new mathematical entities or free parameters.

axioms (1)

domain assumption: SATD events can be reliably identified and classified from comments and commit history across Dockerfiles and related source artifacts.
The study builds on prior definitions of self-admitted technical debt and assumes standard mining techniques from version control systems apply without major errors.

pith-pipeline@v0.9.0 · 5868 in / 1419 out tokens · 42594 ms · 2026-05-21T03:24:05.548017+00:00 · methodology


work page 2014
[56]

DOI 10.1109/ICSME.2014.31

IEEE, Victoria, BC, Canada (2014). DOI 10.1109/ICSME.2014.31. URLhttp: //ieeexplore.ieee.org/document/6976075/

work page doi:10.1109/icsme.2014.31 2014
[57]

URLhttps://semver.org/

Preston-Werner, T.: Semantic Versioning 2.0.0. URLhttps://semver.org/

work page
[58]

In: Proceedings of the 28th International Conference on Program Comprehension, pp

Ren, H., Li, Y., Chen, L.: An Empirical Study on Critical Blocking Bugs. In: Proceedings of the 28th International Conference on Program Comprehension, pp. 72–82. ACM, Seoul Republic of Korea (2020). DOI 10.1145/3387904.3389267. URLhttps://dl.acm .org/doi/10.1145/3387904.3389267

work page doi:10.1145/3387904.3389267 2020
[59]

ACM Trans

Ren, X., Xing, Z., Xia, X., Lo, D., Wang, X., Grundy, J.: Neural Network-based De- tection of Self-Admitted Technical Debt: From Performance to Explainability. ACM Trans. Softw. Eng. Methodol.28(3), 15:1–15:45 (2019). DOI 10.1145/3324916. URL https://dl.acm.org/doi/10.1145/3324916

work page doi:10.1145/3324916 2019
[60]

In: Proceedings of 40 Wei Minn et al

Rong, G., Yu, Y., Liu, S., Tan, X., Zhang, T., Shen, H., Hu, J.: Code Comment Incon- sistency Detection and Rectification Using a Large Language Model. In: Proceedings of 40 Wei Minn et al. the IEEE/ACM 47th International Conference on Software Engineering, ICSE ’25, pp. 1832–1843. IEEE Press, Ottawa, Ontario, Canada (2025). DOI 10.1109/ICSE55347.20 25.00...

work page doi:10.1109/icse55347.20 2025
[61]

In: Proceedings of the 21st International Conference on Mining Software Repositories, pp

Rosa, G., Scalabrino, S., Robles, G., Oliveto, R.: Not all Dockerfile Smells are the Same: An Empirical Evaluation of Hadolint Writing Practices by Experts. In: Proceedings of the 21st International Conference on Mining Software Repositories, pp. 231–241. ACM, Lisbon Portugal (2024). DOI 10.1145/3643991.3644905. URLhttps://dl.acm.org/d oi/10.1145/3643991.3644905

work page doi:10.1145/3643991.3644905 2024
[62]

Empirical Software Engineering29(5), 108 (2024)

Rosa, G., Zappone, F., Scalabrino, S., Oliveto, R.: Fixing Dockerfile smells: an empirical study. Empirical Software Engineering29(5), 108 (2024). DOI 10.1007/s10664-024-1 0471-7. URLhttps://doi.org/10.1007/s10664-024-10471-7

work page doi:10.1007/s10664-024-1 2024
[63]

In: 2025 IEEE/ACM 33rd International Conference on Program Comprehension (ICPC), pp

Russo, B., Melegati, J., Mock, M.: Leveraging multi-task learning to improve the de- tection of SATD and vulnerability. In: 2025 IEEE/ACM 33rd International Conference on Program Comprehension (ICPC), pp. 01–12 (2025). DOI 10.1109/ICPC66645.2025 .00017. URLhttp://arxiv.org/abs/2501.15934. ArXiv:2501.15934 [cs]

work page doi:10.1109/icpc66645.2025 2025
[64]

DOI 10.48550/arXiv.2408.05379

Shabani, T., Nashid, N., Alian, P., Mesbah, A.: Dockerfile Flakiness: Characterization and Repair (2025). DOI 10.48550/arXiv.2408.05379. URLhttp://arxiv.org/abs/24 08.05379. ArXiv:2408.05379 [cs]

work page doi:10.48550/arxiv.2408.05379 2025
[65]

DOI 10 .48550/arXiv.2405.06806

Sheikhaei, M.S., Tian, Y., Wang, S., Xu, B.: An Empirical Study on the Effectiveness of Large Language Models for SATD Identification and Classification (2024). DOI 10 .48550/arXiv.2405.06806. URLhttp://arxiv.org/abs/2405.06806. ArXiv:2405.06806 [cs]

work page arXiv 2024
[66]

In: Proceedings of the 3rd ACM/IEEE International Confer- ence on Automation of Software Test, AST ’22, pp

Shimmi, S., Rahimi, M.: Leveraging code-test co-evolution patterns for automated test case recommendation. In: Proceedings of the 3rd ACM/IEEE International Confer- ence on Automation of Software Test, AST ’22, pp. 65–76. Association for Comput- ing Machinery, New York, NY, USA (2022). DOI 10.1145/3524481.3527222. URL https://dl.acm.org/doi/10.1145/352448...

work page doi:10.1145/3524481.3527222 2022
[67]

ACM Trans

Sun, W., Yan, M., Liu, Z., Xia, X., Lei, Y., Lo, D.: Revisiting the Identification of the Co-evolution of Production and Test Code. ACM Trans. Softw. Eng. Methodol.32(6), 152:1–152:37 (2023). DOI 10.1145/3607183. URLhttps://dl.acm.org/doi/10.1145 /3607183. TLDR: An empirical study investigating the reasons for test code updates occurring after the associa...

work page doi:10.1145/3607183 2023
[68]

In: Proceedings of the 11th Working Conference on Mining Software Repositories, pp

Valdivia Garcia, H., Shihab, E.: Characterizing and predicting blocking bugs in open source projects. In: Proceedings of the 11th Working Conference on Mining Software Repositories, pp. 72–81. ACM, Hyderabad India (2014). DOI 10.1145/2597073.2597099. URLhttps://dl.acm.org/doi/10.1145/2597073.2597099

work page doi:10.1145/2597073.2597099 2014
[69]

In: 2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE), pp

Vidacs, L., Pinzger, M.: Co-evolution analysis of production and test code by learning association rules of changes. In: 2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE), pp. 31–36. IEEE, Campobasso (2018). DOI 10.1109/MALTESQUE.2018.8368456. URLhttps://ieeexplore.ieee.org/docu ment/8368456/

work page doi:10.1109/maltesque.2018.8368456 2018
[70]

In: 2021 IEEE International Con- ference on Software Analysis, Evolution and Reengineering (SANER), pp

Wang, S., Wen, M., Liu, Y., Wang, Y., Wu, R.: Understanding and Facilitating the Co-Evolution of Production and Test Code. In: 2021 IEEE International Con- ference on Software Analysis, Evolution and Reengineering (SANER), pp. 272–283. IEEE, Honolulu, HI, USA (2021). DOI 10.1109/SANER50967.2021.00033. URL https://ieeexplore.ieee.org/document/9425945/

work page doi:10.1109/saner50967.2021.00033 2021
[71]

ACM Trans

Watanabe, M., Li, H., Kashiwa, Y., Reid, B., Iida, H., Hassan, A.E.: On the Use of Agentic Coding: An Empirical Study of Pull Requests on GitHub. ACM Trans. Softw. Eng. Methodol. (2026). DOI 10.1145/3798166. URLhttps://dl.acm.org/doi/10.11 45/3798166. Just Accepted

work page doi:10.1145/3798166 2026
[72]

IEEE Access8, 34127–34139 (2020)

Wu, Y., Zhang, Y., Wang, T., Wang, H.: Characterizing the Occurrence of Dockerfile Smells in Open-Source Software: An Empirical Study. IEEE Access8, 34127–34139 (2020). DOI 10.1109/ACCESS.2020.2973750. URLhttps://ieeexplore.ieee.org/do cument/8998208/

work page doi:10.1109/access.2020.2973750 2020
[73]

In: 2020 27th Asia-Pacific Software Title Suppressed Due to Excessive Length 41 Engineering Conference (APSEC), pp

Wu, Y., Zhang, Y., Wang, T., Wang, H.: Dockerfile Changes in Practice: A Large-Scale Empirical Study of 4,110 Projects on GitHub. In: 2020 27th Asia-Pacific Software Title Suppressed Due to Excessive Length 41 Engineering Conference (APSEC), pp. 247–256 (2020). DOI 10.1109/APSEC51365.2 020.00033. URLhttps://ieeexplore.ieee.org/document/9359307. ISSN: 2640-0715

work page doi:10.1109/apsec51365.2 2020
[74]

IEEE Transactions on Software Engineering48(10), 4214–4228 (2022)

Xiao, T., Wang, D., McIntosh, S., Hata, H., Kula, R.G., Ishio, T., Matsumoto, K.: Characterizing and Mitigating Self-Admitted Technical Debt in Build Systems. IEEE Transactions on Software Engineering48(10), 4214–4228 (2022). DOI 10.1109/TSE. 2021.3115772. URLhttps://ieeexplore.ieee.org/document/9551792/. TLDR: A qualitative analysis of 500 SATD comments ...

work page doi:10.1109/tse 2022
[75]

In: 2008 1st International Conference on Software Testing, Verification, and Validation, pp

Zaidman, A., Van Rompaey, B., Demeyer, S., Van Deursen, A.: Mining Software Re- positories to Study Co-Evolution of Production & Test Code. In: 2008 1st International Conference on Software Testing, Verification, and Validation, pp. 220–229. IEEE, Lille- hammer (2008). DOI 10.1109/ICST.2008.47. URLhttps://ieeexplore.ieee.org/do cument/4539549/

work page doi:10.1109/icst.2008.47 2008
[76]

Empirical Software Engineering16(3), 325–364 (2011)

Zaidman, A., Van Rompaey, B., van Deursen, A., Demeyer, S.: Studying the co-evolution of production and test code in open source and industrial developer test processes through repository mining. Empirical Software Engineering16(3), 325–364 (2011). DOI 10.1007/s10664-010-9143-7. URLhttps://doi.org/10.1007/s10664-010-9143-7. TLDR: This paper proposes three...

work page doi:10.1007/s10664-010-9143-7 2011
[77]

In: Proceedings of the 15th International Conference on Mining Software Repositories, pp

Zampetti, F., Serebrenik, A., Di Penta, M.: Was self-admitted technical debt removal a real removal?: an in-depth perspective. In: Proceedings of the 15th International Conference on Mining Software Repositories, pp. 526–536. ACM, Gothenburg Sweden (2018). DOI 10.1145/3196398.3196423. URLhttps://dl.acm.org/doi/10.1145/31963 98.3196423

work page doi:10.1145/3196398.3196423 2018
[78]

ACM Trans

Zhou, Y., Zhan, W., Li, Z., Han, T., Chen, T., Gall, H.: DRIVE: Dockerfile Rule Mining and Violation Detection. ACM Trans. Softw. Eng. Methodol.33(2), 30:1–30:23 (2023). DOI 10.1145/3617173. URLhttps://dl.acm.org/doi/10.1145/3617173

work page doi:10.1145/3617173 2023

[1] [1]

URLhttps://buildah.io

buildah.io. URLhttps://buildah.io

work page

[2] [2]

URLhttps://docs-internal.github.com/en/code-sec urity/tutorials/secure-your-dependencies/dependabot-quickstart-guide

Dependabot quickstart guide. URLhttps://docs-internal.github.com/en/code-sec urity/tutorials/secure-your-dependencies/dependabot-quickstart-guide

work page

[3] [3]

URLhttps://docs.renovatebot.com/key-c oncepts/dashboard/

Dependency Dashboard - Renovate Docs. URLhttps://docs.renovatebot.com/key-c oncepts/dashboard/

work page

[4] [4]

URLhttps://www.docker.c om/

Docker: Accelerated Container Application Development. URLhttps://www.docker.c om/

work page

[5] [5]

URLhttps://docs.docker.com/referenc e/api/hub/latest/

Docker Hub API reference|Docker Docs. URLhttps://docs.docker.com/referenc e/api/hub/latest/

work page

[6] [6]

URLhttps://podman.io/

Podman. URLhttps://podman.io/

work page

[7] [7]

URLhttps://developer.hashicorp.com/terraform

Terraform|HashiCorp Developer. URLhttps://developer.hashicorp.com/terraform

work page

[8] [8]

URLhttps://www.ibm.com/think/topics/c ontainerization

What Is Containerization?|IBM (2024). URLhttps://www.ibm.com/think/topics/c ontainerization

work page 2024

[9] [9]

Empirical Software Engineering27(2), 49 (2022)

Azuma, H., Matsumoto, S., Kamei, Y., Kusumoto, S.: An empirical study on self- admitted technical debt in Dockerfiles. Empirical Software Engineering27(2), 49 (2022). DOI 10.1007/s10664-021-10081-7. URLhttps://doi.org/10.1007/s10664-021-100 81-7. TLDR: A manual classification for SATDs in Dockerfile was conducted, finding that about 3.0% of the comments i...

work page doi:10.1007/s10664-021-10081-7 2022

[10] [10]

In: Proceedings of the 13th International Conference on Mining Software Repositories, pp

Bavota, G., Russo, B.: A large-scale empirical study on self-admitted technical debt. In: Proceedings of the 13th International Conference on Mining Software Repositories, pp. 315–326. ACM, Austin Texas (2016). DOI 10.1145/2901739.2901742. URLhttps: //dl.acm.org/doi/10.1145/2901739.2901742

work page doi:10.1145/2901739.2901742 2016

[11] [11]

Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing , url =

Benjamini, Y., Hochberg, Y.: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society: Series B (Methodological)57(1), 289–300 (1995). DOI 10.1111/j.2517-6161.1995.tb02031.x. URLhttps://doi.org/10.1111/j.2517-6161.1995.tb02031.x. Remark: Benjamini- Hochberg

work page doi:10.1111/j.2517-6161.1995.tb02031.x 1995

[12] [12]

Empirical Software Engineering28(4), 97 (2023)

Bernardo, J.H., da Costa, D.A., Kulesza, U., Treude, C.: The impact of a continuous integration service on the delivery time of merged pull requests. Empirical Software Engineering28(4), 97 (2023). DOI 10.1007/s10664-023-10327-6. URLhttps://doi. org/10.1007/s10664-023-10327-6

work page doi:10.1007/s10664-023-10327-6 2023

[13] [13]

ACM Transactions on Software Engineering and Methodology0(ja)

Bhatia, A., https://orcid.org/0000-0002-3552-9460, View Profile, Khomh, F., https://orcid.org/0000-0002-5704-4173, View Profile, Adams, B., https://orcid.org/0000-0001-7213-4006, View Profile, Hassan, A.E., https://orcid.org/0000-0001-7749-5513, View Profile: An Empirical Study of Self- Admitted Technical Debt in Machine Learning Software. ACM Transaction...

work page doi:10.1145/3785001

[14] [14]

In: 2023 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp

Bui, Q.C., Lauk¨ otter, M., Scandariato, R.: DockerCleaner: Automatic Repair of Security Smells in Dockerfiles. In: 2023 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 160–170. IEEE, Bogot´ a, Colombia (2023). DOI 10.1109/ ICSME58846.2023.00026. URLhttps://ieeexplore.ieee.org/document/10336292/

work page arXiv 2023

[15] [15]

ACM Trans

Cai, X., Liu, J., Liu, C., Bao, L., Yu, Y., Jiang, L.: Fortifying the Seams Between C/C++ and Rust: Characterizing Bugs in Interop Tools. ACM Trans. Softw. Eng. Methodol. (2026). DOI 10.1145/3795532. URLhttps://dl.acm.org/doi/10.1145/3795532. Just Accepted Title Suppressed Due to Excessive Length 37

work page doi:10.1145/3795532 2026

[16] [16]

Chi, J., Wang, X., Huang, Y., Yu, L., Cui, D., Sun, J., Sun, J.: REACCEPT: Automated Co-evolution of Production and Test Code Based on Dynamic Validation and Large Language Models. Proc. ACM Softw. Eng.2(ISSTA), ISSTA055:1234–ISSTA055:1256 (2025). DOI 10.1145/3728930. URLhttps://dl.acm.org/doi/10.1145/3728930

work page doi:10.1145/3728930 2025

[17] [17]

A coefficient of agreement for nominal scales.Educational and Psychological Measurement, 20(1):37–46, 1960

Cohen, J.: A Coefficient of Agreement for Nominal Scales. Educational and Psycho- logical Measurement20(1), 37–46 (1960). DOI 10.1177/001316446002000104. URL https://doi.org/10.1177/001316446002000104

work page doi:10.1177/001316446002000104 1960

[18] [18]

Nonparametric

Conroy, R.M.: What Hypotheses do “Nonparametric” Two-Group Tests Actually Test? The Stata Journal12(2), 182–190 (2012). DOI 10.1177/1536867X1201200202. URL https://doi.org/10.1177/1536867X1201200202. Remark: Mann-Whitney

work page doi:10.1177/1536867x1201200202 2012

[19] [19]

Princeton University Press (1946)

Cram´ er, H.: Mathematical Methods of Statistics. Princeton University Press (1946). Remark: Phi Coefficient Google-Books-ID: db1jwEACAAJ

work page 1946

[20] [20]

DOI 10.48550/arXiv.2302.01707

Durieux, T.: Parfum: Detection and Automatic Repair of Dockerfile Smells (2023). DOI 10.48550/arXiv.2302.01707. URLhttp://arxiv.org/abs/2302.01707. ArXiv:2302.01707 [cs]

work page doi:10.48550/arxiv.2302.01707 2023

[21] [21]

In: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, ICSE ’24, pp

Durieux, T.: Empirical Study of the Docker Smells Impact on the Image Size. In: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, ICSE ’24, pp. 1–12. Association for Computing Machinery, New York, NY, USA (2024). DOI 10.1145/3597503.3639143. URLhttps://dl.acm.org/doi/10.1145/3597503.363 9143

work page doi:10.1145/3597503.3639143 2024

[22] [22]

Willsch, D

Fluri, B., W¨ ursch, M., Giger, E., Gall, H.C.: Analyzing the co-evolution of comments and source code. Software Quality Journal17(4), 367–394 (2009). DOI 10.1007/s1 1219-009-9075-x. URLhttp://link.springer.com/10.1007/s11219-009-9075-x. TLDR: An approach to associate comments with source code entities to track their co-evolution over multiple versions is...

work page doi:10.1007/s1 2009

[23] [23]

DOI 10.1145/3652152

Gao, Z., Su, Y., Hu, X., Xia, X.: Automating TODO-missed Methods Detection and Patching (2024). DOI 10.1145/3652152. URLhttp://arxiv.org/abs/2405.06225. ArXiv:2405.06225 [cs]

work page doi:10.1145/3652152 2024

[24] [24]

In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2021, pp

Gao, Z., Xia, X., Lo, D., Grundy, J., Zimmermann, T.: Automating the removal of obsolete TODO comments. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2021, pp. 218–229. Association for Computing Machinery, New York, NY, USA (2021). DOI 10.1145/34...

work page doi:10.1145/3468264.3468553 2021

[25] [25]

Gu, H., Zhang, S., Huang, Q., Liao, Z., Liu, J., Lo, D.: Self-Admitted Technical Debts Identification: How Far Are We? In: 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 804–815. IEEE, Rovaniemi, Finland (2024). DOI 10.1109/SANER60148.2024.00087. URLhttps: //ieeexplore.ieee.org/document/10589830/

work page doi:10.1109/saner60148.2024.00087 2024

[26] [26]

ACM Trans

Guo, Z., Liu, S., Liu, J., Li, Y., Chen, L., Lu, H., Zhou, Y.: How Far Have We Progressed in Identifying Self-admitted Technical Debts? A Comprehensive Empirical Study. ACM Trans. Softw. Eng. Methodol.30(4), 45:1–45:56 (2021). DOI 10.1145/3447247. URL https://dl.acm.org/doi/10.1145/3447247

work page doi:10.1145/3447247 2021

[27] [27]

In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, ICSE ’20, pp

Henkel, J., Bird, C., Lahiri, S.K., Reps, T.: Learning from, understanding, and support- ing DevOps artifacts for docker. In: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, ICSE ’20, pp. 38–49. Association for Comput- ing Machinery, New York, NY, USA (2020). DOI 10.1145/3377811.3380406. URL https://dl.acm.org/doi/10.114...

work page doi:10.1145/3377811.3380406 2020

[28] [28]

Empirical Software Engineering23(1), 418–451 (2018)

Huang, Q., Shihab, E., Xia, X., Lo, D., Li, S.: Identifying self-admitted technical debt in open source projects using text mining. Empirical Software Engineering23(1), 418–451 (2018). DOI 10.1007/s10664-017-9522-4. URLhttp://link.springer.com/10.1007/ s10664-017-9522-4

work page doi:10.1007/s10664-017-9522-4 2018

[29] [29]

In: 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, pp

Jiang, Y., Adams, B.: Co-evolution of Infrastructure and Source Code - An Empirical Study. In: 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, pp. 45–55. IEEE, Florence, Italy (2015). DOI 10.1109/MSR.2015.12. URLhttp: //ieeexplore.ieee.org/document/7180066/

work page doi:10.1109/msr.2015.12 2015

[30] [30]

John Wiley & Sons (2002)

Kalbfleisch, J.D., Prentice, R.L.: The Statistical Analysis of Failure Time Data. John Wiley & Sons (2002). Remark: Time-to Google-Books-ID: 38C DwAAQBAJ 38 Wei Minn et al

work page 2002

[31] [31]

Journal of the American Statistical Association53(282), 457–481 (1958)

Kaplan, E.L., Meier, P.: Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association53(282), 457–481 (1958). DOI 10.1080/01621459.1958.10501452. URLhttps://www.tandfonline.com/doi/full /10.1080/01621459.1958.10501452. Remark: Kaplan-Meier

work page doi:10.1080/01621459.1958.10501452 1958

[32] [32]

In: Proceedings of the 21st International Conference on Mining Software Repositories, MSR ’24, pp

Ksontini, E., Abid, A., Khalsi, R., Kessentini, M.: DRMiner: A Tool For Identifying And Analyzing Refactorings In Dockerfile. In: Proceedings of the 21st International Conference on Mining Software Repositories, MSR ’24, pp. 584–594. Association for Computing Machinery, New York, NY, USA (2024). DOI 10.1145/3643991.3644921. URLhttps://dl.acm.org/doi/10.11...

work page doi:10.1145/3643991.3644921 2024

[33] [33]

Biometrics33(1), 159–174 (1977)

Landis, J.R., Koch, G.G.: The Measurement of Observer Agreement for Categorical Data. Biometrics33(1), 159–174 (1977). DOI 10.2307/2529310. URLhttps://www.js tor.org/stable/2529310

work page doi:10.2307/2529310 1977

[34] [34]

The Co-Evolution of Test Maintenance and Code Maintenance through the lens of Fine-Grained Semantic Changes

Levin, S., Yehudai, A.: The Co-Evolution of Test Maintenance and Code Maintenance through the lens of Fine-Grained Semantic Changes (2017). DOI 10.48550/arXiv.1709. 09029. URLhttp://arxiv.org/abs/1709.09029. Remark: ICSME’17 arXiv:1709.09029 [cs]

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1709 2017

[35] [35]

DOI 10.48550/arXiv.2502.10802

Li, K., Yuan, Y., Yu, H., Guo, T., Cao, S.: CoCoEvo: Co-Evolution of Programs and Test Cases to Enhance Code Generation (2025). DOI 10.48550/arXiv.2502.10802. URLhttp: //arxiv.org/abs/2502.10802. ArXiv:2502.10802 [cs] TLDR: CoEvo is introduced, a novel LLM-based co-evolution framework that simultaneously evolves programs and test cases and proposes optimi...

work page doi:10.48550/arxiv.2502.10802 2025

[36] [36]

ACM Trans

Li, Q., Yin, Z., Yang, Y., Li, C., Shen, Z., Ge, J., Zhong, W., Luo, B., Ng, V.: IMPACT: Identifying and Classifying Multiple Sourced and Categorized Self-Admitted Technical Debts. ACM Trans. Softw. Eng. Methodol. (2025). DOI 10.1145/3747180. URL https://dl.acm.org/doi/10.1145/3747180. Just Accepted

work page doi:10.1145/3747180 2025

[37] [37]

Empirical Software Engineering28(3), 65 (2023)

Li, Y., Soliman, M., Avgeriou, P.: Automatic identification of self-admitted technical debt from four different sources. Empirical Software Engineering28(3), 65 (2023). DOI 10.1007/s10664-023-10297-9. URLhttps://doi.org/10.1007/s10664-023-10297-9

work page doi:10.1007/s10664-023-10297-9 2023

[38] [38]

In: Pro- ceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Society, pp

Liu, J., Huang, Q., Xia, X., Shihab, E., Lo, D., Li, S.: Is using deep learning frame- works free?: characterizing technical debt in deep learning frameworks. In: Pro- ceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Society, pp. 1–10. ACM, Seoul South Korea (2020). DOI 10.1145/3377815.3381377. URLhtt...

work page doi:10.1145/3377815.3381377 2020

[39] [39]

Empirical Software Engineering26(2), 16 (2021)

Liu, J., Huang, Q., Xia, X., Shihab, E., Lo, D., Li, S.: An exploratory study on the in- troduction and removal of different types of technical debt in deep learning frameworks. Empirical Software Engineering26(2), 16 (2021). DOI 10.1007/s10664-020-09917-5. URLhttp://link.springer.com/10.1007/s10664-020-09917-5

work page doi:10.1007/s10664-020-09917-5 2021

[40] [40]

In: 2009 6th IEEE International Working Con- ference on Mining Software Repositories, pp

Lubsen, Z., Zaidman, A., Pinzger, M.: Using association rules to study the co-evolution of production & test code. In: 2009 6th IEEE International Working Con- ference on Mining Software Repositories, pp. 151–154. IEEE, Vancouver, BC, Canada (2009). DOI 10.1109/MSR.2009.5069493. URLhttp://ieeexplore.ieee.org/docume nt/5069493/

work page doi:10.1109/msr.2009.5069493 2009

[41] [41]

Empirical Software Engineering25(5), 3770–3798 (2020)

Maipradit, R., Treude, C., Hata, H., Matsumoto, K.: Wait for it: identifying “On-Hold” self-admitted technical debt. Empirical Software Engineering25(5), 3770–3798 (2020). DOI 10.1007/s10664-020-09854-3. URLhttps://doi.org/10.1007/s10664-020-09854 -3

work page doi:10.1007/s10664-020-09854-3 2020

[42] [42]

In: 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp

Maldonado, E.D.S., Abdalkareem, R., Shihab, E., Serebrenik, A.: An Empirical Study on the Removal of Self-Admitted Technical Debt. In: 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 238–248 (2017). DOI 10.1109/ ICSME.2017.8. URLhttps://ieeexplore.ieee.org/document/8094425

work page arXiv 2017

[43] [43]

IEEE Transactions on Software Engineering43(11), 1044–1062 (2017)

Maldonado, E.D.S., Shihab, E., Tsantalis, N.: Using Natural Language Processing to Automatically Detect Self-Admitted Technical Debt. IEEE Transactions on Software Engineering43(11), 1044–1062 (2017). DOI 10.1109/TSE.2017.2654244. URLhttp: //ieeexplore.ieee.org/document/7820211/

work page doi:10.1109/tse.2017.2654244 2017

[44] [44]

In: 2014 IEEE 14th International Working Conference Title Suppressed Due to Excessive Length 39 on Source Code Analysis and Manipulation, pp

Marsavina, C., Romano, D., Zaidman, A.: Studying Fine-Grained Co-evolution Patterns of Production and Test Code. In: 2014 IEEE 14th International Working Conference Title Suppressed Due to Excessive Length 39 on Source Code Analysis and Manipulation, pp. 195–204. IEEE, Victoria, BC, Canada (2014). DOI 10.1109/SCAM.2014.28. URLhttp://ieeexplore.ieee.org/do...

work page doi:10.1109/scam.2014.28 2014

[45] [45]

2015 12th Working IEEE/IFIP Conference on Software Architec- ture pp

Martini, A., Bosch, J.: The Danger of Architectural Technical Debt: Contagious Debt and Vicious Circles. 2015 12th Working IEEE/IFIP Conference on Software Architec- ture pp. 1–10 (2015). DOI 10.1109/WICSA.2015.31. URLhttp://ieeexplore.ieee. org/document/7158498/. Conference Name: 2015 12th Working IEEE/IFIP Conference on Software Architecture (WICSA) ISB...

work page doi:10.1109/wicsa.2015.31 2015

[46] [46]

Mastropaolo, A., Di Penta, M., Bavota, G.: Towards Automatically Addressing Self- Admitted Technical Debt: How Far Are We? In: Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, ASE ’23, pp. 585–597. IEEE Press, Echternach, Luxembourg (2024). DOI 10.1109/ASE56229.2023.00103. URLhttps://dl.acm.org/doi/10.1109/ASE56...

work page doi:10.1109/ase56229.2023.00103 2024

[47] [47]

In: 2014 IEEE International Conference on Software Maintenance and Evolution, pp

Mcintosh, S., Adams, B., Nagappan, M., Hassan, A.E.: Mining Co-change Informa- tion to Understand When Build Changes Are Necessary. In: 2014 IEEE International Conference on Software Maintenance and Evolution, pp. 241–250. IEEE, Victoria, BC, Canada (2014). DOI 10.1109/ICSME.2014.46. URLhttp://ieeexplore.ieee.org/do cument/6976090/

work page doi:10.1109/icsme.2014.46 2014

[48] [48]

In: Proceedings of the 33rd International Conference on Software Engineering, pp

McIntosh, S., Adams, B., Nguyen, T.H., Kamei, Y., Hassan, A.E.: An empirical study of build maintenance effort. In: Proceedings of the 33rd International Conference on Software Engineering, pp. 141–150. ACM, Waikiki, Honolulu HI USA (2011). DOI 10.1145/1985793.1985813. URLhttps://dl.acm.org/doi/10.1145/1985793.1985813

work page doi:10.1145/1985793.1985813 2011

[49] [49]

URLhttps://osf.io/sh5xd /overview?view_only=e06572d75ee54348807f3925c14b0371

Minn, W.: Dockerfile SATD-source co-evolution dataset. URLhttps://osf.io/sh5xd /overview?view_only=e06572d75ee54348807f3925c14b0371

work page

[50] [50]

Empirical Software Engineering27(6), 130 (2022)

Muse, B.A., Nagy, C., Cleve, A., Khomh, F., Antoniol, G.: FIXME: synchronize with database! An empirical study of data access self-admitted technical debt. Empirical Software Engineering27(6), 130 (2022). DOI 10.1007/s10664-022-10119-4. URL https://doi.org/10.1007/s10664-022-10119-4

work page doi:10.1007/s10664-022-10119-4 2022

[51] [51]

ACM Trans

Openja, M., Khomh, F., Foundjem, A., Jiang, Z.M.J., Abidi, M., Hassan, A.E.: An Empirical Study of Testing Machine Learning in the Wild. ACM Trans. Softw. Eng. Methodol.34(1), 7:1–7:63 (2024). DOI 10.1145/3680463. URLhttps://dl.acm.org/d oi/10.1145/3680463

work page doi:10.1145/3680463 2024

[52] [52]

Proceedings of the Royal Society of London 58, 240–242

Pearson, K.: VII. Note on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London58(347-352), 240–242 (1895). DOI 10.1098/rspl.1895.0041. URLhttps://royalsocietypublishing.org/rspl/articl e/58/347-352/240/43470/VII-Note-on-regression-and-inheritance-in-the-case. Remark: Pearson Correlation

work page doi:10.1098/rspl.1895.0041

[53] [53]

URLhttps://kubernetes.io/

Penfound, K., Dagger, Nickerson, C., Kubeshop: Production-Grade Container Orches- tration. URLhttps://kubernetes.io/

work page

[54] [54]

Journal of the Royal Statistical Society

Peto, R., Peto, J.: Asymptotically Efficient Rank Invariant Test Procedures. Journal of the Royal Statistical Society. Series A (General)135(2), 185 (1972). DOI 10.2307/ 2344317. URLhttps://www.jstor.org/stable/10.2307/2344317?origin=crossref. Remark: Log-Rank

work page doi:10.2307/2344317 1972

[55] [55]

In: 2014 IEEE International Conference on Software Maintenance and Evolution, pp

Potdar, A., Shihab, E.: An Exploratory Study on Self-Admitted Technical Debt. In: 2014 IEEE International Conference on Software Maintenance and Evolution, pp. 91–

work page 2014

[56] [56]

DOI 10.1109/ICSME.2014.31

IEEE, Victoria, BC, Canada (2014). DOI 10.1109/ICSME.2014.31. URLhttp: //ieeexplore.ieee.org/document/6976075/

work page doi:10.1109/icsme.2014.31 2014

[57] [57]

URLhttps://semver.org/

Preston-Werner, T.: Semantic Versioning 2.0.0. URLhttps://semver.org/

work page

[58] [58]

In: Proceedings of the 28th International Conference on Program Comprehension, pp

Ren, H., Li, Y., Chen, L.: An Empirical Study on Critical Blocking Bugs. In: Proceedings of the 28th International Conference on Program Comprehension, pp. 72–82. ACM, Seoul Republic of Korea (2020). DOI 10.1145/3387904.3389267. URLhttps://dl.acm .org/doi/10.1145/3387904.3389267

work page doi:10.1145/3387904.3389267 2020

[59] [59]

ACM Trans

Ren, X., Xing, Z., Xia, X., Lo, D., Wang, X., Grundy, J.: Neural Network-based De- tection of Self-Admitted Technical Debt: From Performance to Explainability. ACM Trans. Softw. Eng. Methodol.28(3), 15:1–15:45 (2019). DOI 10.1145/3324916. URL https://dl.acm.org/doi/10.1145/3324916

Rong, G., Yu, Y., Liu, S., Tan, X., Zhang, T., Shen, H., Hu, J.: Code Comment Inconsistency Detection and Rectification Using a Large Language Model. In: Proceedings of the IEEE/ACM 47th International Conference on Software Engineering, ICSE ’25, pp. 1832–1843. IEEE Press, Ottawa, Ontario, Canada (2025). DOI 10.1109/ICSE55347.2025.00...

Rosa, G., Scalabrino, S., Robles, G., Oliveto, R.: Not all Dockerfile Smells are the Same: An Empirical Evaluation of Hadolint Writing Practices by Experts. In: Proceedings of the 21st International Conference on Mining Software Repositories, pp. 231–241. ACM, Lisbon Portugal (2024). DOI 10.1145/3643991.3644905. URL https://dl.acm.org/doi/10.1145/3643991.3644905

Rosa, G., Zappone, F., Scalabrino, S., Oliveto, R.: Fixing Dockerfile smells: an empirical study. Empirical Software Engineering 29(5), 108 (2024). DOI 10.1007/s10664-024-10471-7. URL https://doi.org/10.1007/s10664-024-10471-7

Russo, B., Melegati, J., Mock, M.: Leveraging multi-task learning to improve the detection of SATD and vulnerability. In: 2025 IEEE/ACM 33rd International Conference on Program Comprehension (ICPC), pp. 01–12 (2025). DOI 10.1109/ICPC66645.2025.00017. URL http://arxiv.org/abs/2501.15934. ArXiv:2501.15934 [cs]

Shabani, T., Nashid, N., Alian, P., Mesbah, A.: Dockerfile Flakiness: Characterization and Repair (2025). DOI 10.48550/arXiv.2408.05379. URL http://arxiv.org/abs/2408.05379. ArXiv:2408.05379 [cs]

Sheikhaei, M.S., Tian, Y., Wang, S., Xu, B.: An Empirical Study on the Effectiveness of Large Language Models for SATD Identification and Classification (2024). DOI 10.48550/arXiv.2405.06806. URL http://arxiv.org/abs/2405.06806. ArXiv:2405.06806 [cs]

Shimmi, S., Rahimi, M.: Leveraging code-test co-evolution patterns for automated test case recommendation. In: Proceedings of the 3rd ACM/IEEE International Conference on Automation of Software Test, AST ’22, pp. 65–76. Association for Computing Machinery, New York, NY, USA (2022). DOI 10.1145/3524481.3527222. URL https://dl.acm.org/doi/10.1145/352448...

Sun, W., Yan, M., Liu, Z., Xia, X., Lei, Y., Lo, D.: Revisiting the Identification of the Co-evolution of Production and Test Code. ACM Trans. Softw. Eng. Methodol. 32(6), 152:1–152:37 (2023). DOI 10.1145/3607183. URL https://dl.acm.org/doi/10.1145/3607183. TLDR: An empirical study investigating the reasons for test code updates occurring after the associa...

Valdivia Garcia, H., Shihab, E.: Characterizing and predicting blocking bugs in open source projects. In: Proceedings of the 11th Working Conference on Mining Software Repositories, pp. 72–81. ACM, Hyderabad India (2014). DOI 10.1145/2597073.2597099. URL https://dl.acm.org/doi/10.1145/2597073.2597099

Vidacs, L., Pinzger, M.: Co-evolution analysis of production and test code by learning association rules of changes. In: 2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE), pp. 31–36. IEEE, Campobasso (2018). DOI 10.1109/MALTESQUE.2018.8368456. URL https://ieeexplore.ieee.org/document/8368456/

Wang, S., Wen, M., Liu, Y., Wang, Y., Wu, R.: Understanding and Facilitating the Co-Evolution of Production and Test Code. In: 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 272–283. IEEE, Honolulu, HI, USA (2021). DOI 10.1109/SANER50967.2021.00033. URL https://ieeexplore.ieee.org/document/9425945/

Watanabe, M., Li, H., Kashiwa, Y., Reid, B., Iida, H., Hassan, A.E.: On the Use of Agentic Coding: An Empirical Study of Pull Requests on GitHub. ACM Trans. Softw. Eng. Methodol. (2026). DOI 10.1145/3798166. URL https://dl.acm.org/doi/10.1145/3798166. Just Accepted

Wu, Y., Zhang, Y., Wang, T., Wang, H.: Characterizing the Occurrence of Dockerfile Smells in Open-Source Software: An Empirical Study. IEEE Access 8, 34127–34139 (2020). DOI 10.1109/ACCESS.2020.2973750. URL https://ieeexplore.ieee.org/document/8998208/

Wu, Y., Zhang, Y., Wang, T., Wang, H.: Dockerfile Changes in Practice: A Large-Scale Empirical Study of 4,110 Projects on GitHub. In: 2020 27th Asia-Pacific Software Engineering Conference (APSEC), pp. 247–256 (2020). DOI 10.1109/APSEC51365.2020.00033. URL https://ieeexplore.ieee.org/document/9359307. ISSN: 2640-0715

Xiao, T., Wang, D., McIntosh, S., Hata, H., Kula, R.G., Ishio, T., Matsumoto, K.: Characterizing and Mitigating Self-Admitted Technical Debt in Build Systems. IEEE Transactions on Software Engineering 48(10), 4214–4228 (2022). DOI 10.1109/TSE.2021.3115772. URL https://ieeexplore.ieee.org/document/9551792/. TLDR: A qualitative analysis of 500 SATD comments ...

Zaidman, A., Van Rompaey, B., Demeyer, S., Van Deursen, A.: Mining Software Repositories to Study Co-Evolution of Production & Test Code. In: 2008 1st International Conference on Software Testing, Verification, and Validation, pp. 220–229. IEEE, Lillehammer (2008). DOI 10.1109/ICST.2008.47. URL https://ieeexplore.ieee.org/document/4539549/

Zaidman, A., Van Rompaey, B., van Deursen, A., Demeyer, S.: Studying the co-evolution of production and test code in open source and industrial developer test processes through repository mining. Empirical Software Engineering 16(3), 325–364 (2011). DOI 10.1007/s10664-010-9143-7. URL https://doi.org/10.1007/s10664-010-9143-7. TLDR: This paper proposes three...

Zampetti, F., Serebrenik, A., Di Penta, M.: Was self-admitted technical debt removal a real removal?: an in-depth perspective. In: Proceedings of the 15th International Conference on Mining Software Repositories, pp. 526–536. ACM, Gothenburg Sweden (2018). DOI 10.1145/3196398.3196423. URL https://dl.acm.org/doi/10.1145/3196398.3196423

Zhou, Y., Zhan, W., Li, Z., Han, T., Chen, T., Gall, H.: DRIVE: Dockerfile Rule Mining and Violation Detection. ACM Trans. Softw. Eng. Methodol. 33(2), 30:1–30:23 (2023). DOI 10.1145/3617173. URL https://dl.acm.org/doi/10.1145/3617173
