arxiv: 2604.06363 · v1 · submitted 2026-04-07 · 💻 cs.SE

A Survey of Algorithm Debt in Machine and Deep Learning Systems: Definition, Smells, and Future Work

Pith reviewed 2026-05-10 18:28 UTC · model grok-4.3

classification 💻 cs.SE
keywords Algorithm Debt · Technical Debt · Machine Learning · Deep Learning · Software Maintenance · Survey · Smells

The pith

Review of 42 studies expands Algorithm Debt definition and catalogs its smells in ML systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper surveys 42 primary studies to clarify Algorithm Debt as a type of Technical Debt that harms performance and scalability in Machine and Deep Learning systems. The work broadens the definition of AD, shows how it often appears without being named, lists its recognizable smells, and maps out open research questions. These results supply a clearer starting point for later projects that aim to reduce AD and make ML and DL systems easier to maintain.

Core claim

Analysis of the 42 studies yields an expanded definition of Algorithm Debt, evidence of its frequent implicit presence, a set of associated smells, and a list of future research directions that together support more reliable ML and DL systems.

What carries the argument

Systematic review of 42 primary studies that extracts, redefines, and classifies Algorithm Debt instances and their indicators.

If this is right

  • Developers gain a shared vocabulary for spotting Algorithm Debt before it degrades system performance.
  • The listed smells supply concrete targets for tools that detect and measure AD in ML pipelines.
  • Future empirical work can test how addressing the identified smells affects scalability and maintenance cost.
  • The future directions section supplies a ready agenda for targeted experiments on AD remediation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Teams building production ML systems could add the reported smells to their code review checklists.
  • The survey links Algorithm Debt to other technical debt types, suggesting combined management strategies.
  • Re-running the review with papers published after the original cutoff would test whether the smells remain stable.

Load-bearing premise

The 42 selected studies give a complete and unbiased picture of Algorithm Debt research in ML and DL systems.

What would settle it

A new literature search that locates many relevant studies omitted from the set of 42 and that reports Algorithm Debt behaviors or smells not described in this survey.
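
As a rough illustration of that check, the sketch below compares the survey's study set with the results of a fresh search and reports what share of the newly found studies the survey omitted. The function name `coverage_report` and the study identifiers are hypothetical placeholders, not data from the paper, and the threshold for "many" omitted studies is deliberately left to the reader.

```python
def coverage_report(original_ids, new_search_ids):
    """Compare the survey's study set with a fresh search result.

    Both arguments are iterables of study identifiers (for example DOIs).
    """
    original = set(original_ids)
    found = set(new_search_ids)
    omitted = found - original  # studies in the new search that the survey lacks
    share = len(omitted) / len(found) if found else 0.0
    return {"omitted": sorted(omitted), "omitted_share": share}


# Hypothetical identifiers, for illustration only.
survey_set = ["study-a", "study-b", "study-c"]
fresh_search = ["study-b", "study-c", "study-d", "study-e"]

report = coverage_report(survey_set, fresh_search)
print(report)
```

A high omitted share alone would not settle the question; the omitted studies would also need to report Algorithm Debt behaviors or smells that the survey does not describe.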

Figures

Figures reproduced from arXiv: 2604.06363 by Alex Potanin, Chirath Hettiarachchi, Emmanuel Iko-Ojo Simon, Fatemeh Fard, Hanna Suominen.

Figure 1: Overall methodology for our survey process.
Figure 2: Overview of the search and selection process detailed in Section 3.3.
Figure 3: PRISMA diagram illustrating the selection process of the study, adapted from Moher et al. [54].
Figure 4: Distribution of selected papers categorised as control, primary and snowballed papers.
Figure 5: Distribution of selected papers based on the venue. The others refer to the venues of the papers that we found from ...
Figure 6: Coding framework for the SATD data. Overall, this integrated approach of using the literature and dataset ensured that the literature-based definition was supported with empirical components from the dataset, providing a comprehensive understanding of AD in ML/DL systems. The entire process is illustrated in ...
Figure 7: RQ1 Methodology, showing the literature review complemented by analysis of Liu et al. [49]’s dataset to derive the ...
Figure 9: Proportion of papers that explicitly mention AD compared to those that refer to it implicitly through related concepts.
Figure 10: Classification of ML/DL code smells as AD items based on their characteristics.
Figure 11: Conceptual framework linking causes, smells, effects, and the proposed mitigation strategies for AD.

Original abstract

The adoption of Machine and Deep Learning (ML/DL) technologies introduces maintenance challenges, leading to Technical Debt (TD). Algorithm Debt (AD) is a TD type that impacts the performance and scalability of ML/DL systems. A review of 42 primary studies expanded AD's definition, uncovered its implicit presence, identified its smells, and highlighted future directions. These findings will guide an AD-focused study, enhancing the reliability of ML/DL systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a survey of 42 primary studies on Algorithm Debt (AD) as a form of Technical Debt in Machine and Deep Learning (ML/DL) systems. It expands the definition of AD, identifies its implicit presence across the literature, catalogs associated smells, and outlines future research directions to enhance the reliability and maintainability of ML/DL systems.

Significance. If the review methodology is systematic and the 42 studies are representative, the work offers a useful synthesis of an under-explored TD subtype specific to ML/DL. Explicitly naming smells and future directions provides concrete guidance for subsequent empirical studies, which could improve system performance and scalability in practice.

major comments (2)
  1. [§3] §3 (Research Method): The search strategy, databases queried, keywords, and inclusion/exclusion criteria are not described in sufficient detail. Without this information it is impossible to assess whether the selection of exactly 42 primary studies is complete and unbiased, which directly underpins the claims of an expanded definition, implicit presence, and smell identification.
  2. [§4] §4 (Results): The mapping from individual primary studies to the newly identified smells and the expanded AD definition is not provided (e.g., no table or appendix listing which studies support each smell). This traceability gap prevents verification that the synthesis accurately reflects the reviewed literature rather than interpretive overreach.
minor comments (2)
  1. [Abstract] The abstract states the number of studies and high-level outcomes but omits any reference to the review protocol; adding one sentence on the systematic approach would improve transparency without lengthening the abstract excessively.
  2. [§2] Notation for 'smells' is introduced without an explicit definition or comparison to the well-known TD smell literature (e.g., Fowler's code smells); a short clarifying paragraph in §2 would aid readers unfamiliar with the TD subfield.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive review of our manuscript. We address each major comment below and outline the revisions we will make to strengthen the paper.

  1. Referee: [§3] §3 (Research Method): The search strategy, databases queried, keywords, and inclusion/exclusion criteria are not described in sufficient detail. Without this information it is impossible to assess whether the selection of exactly 42 primary studies is complete and unbiased, which directly underpins the claims of an expanded definition, implicit presence, and smell identification.

    Authors: We agree that §3 requires substantially more detail to enable evaluation of the selection process. In the revised manuscript we will expand the section to describe the full systematic review protocol, including the specific databases searched, the complete search strings and keywords, the inclusion/exclusion criteria with explicit justifications, and a PRISMA-style flow diagram showing how the final set of 42 studies was obtained. These additions will make the sample selection transparent and allow readers to assess completeness and potential bias. revision: yes

  2. Referee: [§4] §4 (Results): The mapping from individual primary studies to the newly identified smells and the expanded AD definition is not provided (e.g., no table or appendix listing which studies support each smell). This traceability gap prevents verification that the synthesis accurately reflects the reviewed literature rather than interpretive overreach.

    Authors: We accept this point on traceability. We will add a new table (or appendix) in the revised §4 that explicitly maps each of the 42 primary studies to the specific elements of the expanded Algorithm Debt definition and to the smells they support. This mapping will be derived directly from the data extracted during the review and will allow independent verification of the synthesis. revision: yes
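
One way such a traceability table could be derived from extraction records is sketched below. It inverts a study-to-smells mapping into a smell-to-studies table and flags any catalogued smell that no study supports. The study IDs (`S01`, ...), the smell names, and the helper functions are hypothetical placeholders; the paper's actual extraction data and smell catalogue are not reproduced here.

```python
from collections import defaultdict


def invert_mapping(study_to_smells):
    """Turn {study_id: [smell, ...]} into {smell: [study_id, ...]} (sorted)."""
    smell_to_studies = defaultdict(set)
    for study_id, smells in study_to_smells.items():
        for smell in smells:
            smell_to_studies[smell].add(study_id)
    return {smell: sorted(ids) for smell, ids in smell_to_studies.items()}


def unsupported(catalogued_smells, smell_to_studies):
    """Return catalogued smells that no extracted study supports."""
    return sorted(s for s in catalogued_smells if not smell_to_studies.get(s))


# Hypothetical extraction records; IDs and smell names are placeholders.
extraction = {
    "S01": ["smell_a", "smell_b"],
    "S02": ["smell_b"],
    "S03": [],
}
catalogue = ["smell_a", "smell_b", "smell_c"]

table = invert_mapping(extraction)
for smell in catalogue:
    print(smell, table.get(smell, []))
print("unsupported:", unsupported(catalogue, table))
```

A smell that appears in the unsupported list would be exactly the kind of interpretive overreach the referee asks the authors to rule out.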

Circularity Check

0 steps flagged

No significant circularity; survey of external studies

This paper is a literature survey that synthesizes findings from 42 external primary studies to expand the definition of Algorithm Debt, identify its implicit presence and smells, and outline future directions. No equations, fitted parameters, predictions, or derivations are present. The central claims rest on systematic review of outside literature rather than any self-referential reduction, self-citation chain, or renaming of results by construction. Standard survey methodology is followed with no load-bearing internal logic that collapses to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a survey paper, the work relies on standard literature review practices and the content of the 42 cited studies without introducing new free parameters, axioms, or invented entities.

Reference graph

Works this paper leans on

90 extracted references · 90 canonical work pages

  1. [1]

    Danyllo Albuquerque, Everton Guimaraes, Graziela Tonin, Mirko Perkusich, Hyggo Almeida, and Angelo Perkusich. 2022. Comprehending the use of intelligent techniques to support technical debt management. In Proceedings of the International Conference on Technical Debt. 21–30

  2. [2]

    Abdullah Aldaeej and Mohammad Alshayeb. 2024. Familiarity, Common Causes and Effects of Technical Debt: A Replicated Study in the Saudi Software Industry. Arabian Journal for Science and Engineering 49, 3 (2024), 4459–4477

  3. [3]

    Nicolli SR Alves, Thiago S Mendes, Manoel G de Mendonça, Rodrigo O Spínola, Forrest Shull, and Carolyn Seaman. 2016. Identification and management of technical debt: A systematic mapping study. Information and Software Technology 70 (2016), 100–121

  4. [4]

    Nicolli S.R. Alves, Leilane F. Ribeiro, Vivyane Caires, Thiago S. Mendes, and Rodrigo O. Spínola. 2014. Towards an Ontology of Terms on Technical Debt. In International Workshop on Managing Technical Debt. 1–7

  5. [5]

    Chintan Amrit and Ashwini Kolar Narayanappa. 2025. An analysis of the challenges in the adoption of MLOps. Journal of Innovation & Knowledge 10, 1 (2025), 100637

  6. [6]

    Paris Avgeriou, Philippe Kruchten, Ipek Ozkaya, and Carolyn Seaman. 2016. Managing technical debt in software engineering (dagstuhl seminar 16162). In Dagstuhl reports, Vol. 6. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik

  7. [7]

    Hideaki Azuma, Shinsuke Matsumoto, Yasutaka Kamei, and Shinji Kusumoto. 2022. An empirical study on self-admitted technical debt in Dockerfiles. Empirical Software Engineering 27, 2 (2022), 49

  8. [8]

    Tita Alissa Bach, Amna Khan, Harry Hallock, Gabriela Beltrão, and Sonia Sousa. 2022. A systematic literature review of user trust in AI-enabled systems: An HCI perspective. International Journal of Human–Computer Interaction (2022), 1–16

  9. [9]

    Mrwan BenIdris. 2018. Investigate, identify and estimate the technical debt: a systematic mapping study. Econometrics: Econometric & Statistical Methods - Special Topics eJournal (2018). https://api.semanticscholar.org/CorpusID:53070347

  10. [10]

    Terese Besker, Antonio Martini, and Jan Bosch. 2018. Managing architectural technical debt: A unified model and systematic literature review. Journal of Systems and Software 135 (2018), 1–16

  11. [11]

    Aaditya Bhatia, Foutse Khomh, Bram Adams, and Ahmed E Hassan. 2023. An Empirical Study of Self-Admitted Technical Debt in Machine Learning Software. arXiv preprint arXiv:2311.12019 (2023)

  12. [12]

    Aaditya Bhatia, Foutse Khomh, Bram Adams, and Ahmed E Hassan. 2024. An empirical study of self-admitted technical debt in machine learning software. ACM Transaction on Software Engineering Methodologies 1, 1 (2024)

  13. [13]

    João Paulo Biazotto, Daniel Feitosa, Paris Avgeriou, and Elisa Yumi Nakagawa. 2024. Technical debt management automation: State of the art and future perspectives. Information and Software Technology 167, C (March 2024), 20 pages. doi:10.1016/j.infsof.2023.107375

  14. [14]

    Sumon Biswas and Hridesh Rajan. 2020. Do the machine learning models on a crowd sourced platform exhibit bias? an empirical study on model fairness. In Proceedings of the 28th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering. 642–653. ACM Comput. Surv. A Survey of Algorithm Debt in Mac...

  15. [15]

    Justus Bogner, Roberto Verdecchia, and Ilias Gerostathopoulos. 2021. Characterizing technical debt and antipatterns in ai-based systems: A systematic mapping study. In 2021 IEEE/ACM International Conference on Technical Debt (TechDebt). IEEE, 64–73

  16. [16]

    Virginia Braun and Victoria Clarke. 2006. Using thematic analysis in psychology. Qualitative research in psychology 3, 2 (2006), 77–101

  17. [17]

    Eric Breck, Shanqing Cai, Eric Nielsen, Michael Salib, and D Sculley. 2017. The ML test score: A rubric for ML production readiness and technical debt reduction. In 2017 IEEE International Conference on Big Data (Big Data). IEEE, 1123–1132

  18. [18]

    Pearl Brereton, Barbara A Kitchenham, David Budgen, Mark Turner, and Mohamed Khalil. 2007. Lessons from applying the systematic literature review process within the software engineering domain. Journal of systems and software 80, 4 (2007), 571–583

  19. [19]

    Nicolás Cardozo, Ivana Dusparic, and Christian Cabrera. 2023. Prevalence of Code Smells in Reinforcement Learning Projects. arXiv preprint arXiv:2303.10236 (2023)

  20. [20]

    Bruno Cartaxo, Gustavo Pinto, and Sergio Soares. 2020. Rapid reviews in software engineering. In Contemporary Empirical Methods in Software Engineering. Springer, 357–384

  21. [21]

    Dev Kumar Chaudhary, Sandeep Srivastava, and Vikas Kumar. 2018. A review on hidden debts in machine learning systems. In 2018 Second International Conference on Green Computing and Internet of Things (ICGCIoT). IEEE, 619–624

  22. [22]

    Md Shain Shahid Chowdhury, Md Naseef Ur Rahman Chowdhury, Fariha Ferdous Neha, and Ahshanul Haque. 2024. AI-Powered Code Reviews: Leveraging Large Language Models. In 2024 International Conference on Signal Processing and Advance Research in Computing (SPARC), Vol. 1. IEEE, 1–6

  23. [23]

    Jacob Cohen. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20, 1 (1960), 37–46

  24. [24]

    Pierre-Olivier Côté, Amin Nikanjam, Rached Bouchoucha, Ilan Basta, Mouna Abidi, and Foutse Khomh. 2023. Quality issues in machine learning software systems. arXiv preprint arXiv:2306.15007 (2023)

  25. [25]

    MAJ Iain Cruickshank and MAJ Shane Kohtz. 2023. Acquiring Maintainable AI-Enabled Systems. Technical Report. Acquisition Research Program

  26. [26]

    Ward Cunningham. 1992. The WyCash portfolio management system. ACM SIGPLAN OOPS Messenger 4, 2 (1992), 29–30

  27. [27]

    Diego Augusto de Jesus Pacheco, Carla Schwengber ten Caten, Carlos Fernando Jung, Claudio Sassanelli, and Sergio Terzi. 2019. Overcoming barriers towards Sustainable Product-Service Systems in Small and Medium-sized enterprises: State of the art and a novel Decision Matrix. Journal of Cleaner Production 222 (2019), 903–921

  28. [28]

    J. L. Fleiss. 1971. Measuring nominal scale agreement among many raters. Psychological Bulletin 76, 5 (1971), 378–382

  29. [29]

    Francesca Arcelli Fontana, Vincenzo Ferme, and Stefano Spinelli. 2012. Investigating the impact of code smells debt on quality code evaluation. In 2012 Third International Workshop on Managing Technical Debt (MTD). IEEE, 15–22

  30. [30]

    Naiseh Ghasemi, Jon Alvarez Justo, Marco Celesti, Laurent Despoisse, and Jens Nieke. 2025. Onboard processing of hyperspectral imagery: Deep learning advancements, methodologies, challenges, and emerging trends. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2025)

  31. [31]

    Görkem Giray. 2021. A software engineering perspective on engineering machine learning systems: State of the art and challenges. Journal of Systems and Software 180 (2021), 111031

  32. [32]

    Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep learning. Vol. 1. MIT press Cambridge

  33. [33]

    Lorenz Graf-Vlachy and Stefan Wagner. 2024. Different Debt: An Addition to the technical debt dataset and a demonstration using developer personality. In Proceedings of the 7th ACM/IEEE International Conference on Technical Debt. 31–35

  34. [34]

    Khan Mohammad Habibullah, Hans-Martin Heyn, Gregory Gay, Jennifer Horkoff, Eric Knauss, Markus Borg, Alessia Knauss, Håkan Sivencrona, and Polly Jing Li. 2024. Requirements and software engineering for automotive perception systems: an interview study. Requirements Engineering (2024), 1–24

  35. [35]

    Nargiz Humbatova, Gunel Jahangirova, Gabriele Bavota, Vincenzo Riccio, Andrea Stocco, and Paolo Tonella. 2020. Taxonomy of real faults in deep learning systems. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering. 1110–1121

  36. [36]

    Clemente Izurieta, Ipek Ozkaya, Carolyn Seaman, Philippe Kruchten, Robert Nord, Will Snipes, and Paris Avgeriou. 2016. Perspectives on managing technical debt: A transition point and roadmap from Dagstuhl. In Joint of the 4th International Workshop on Quantitative Approaches to Software Quality, QuASoQ 2016 and 1st International Workshop on Technical Debt...

  37. [37]

    Li Jia, Hao Zhong, Xiaoyin Wang, Linpeng Huang, and Xuansheng Lu. 2021. The symptoms, causes, and repairs of bugs inside a deep learning library. Journal of Systems and Software 177 (2021), 110935

  38. [38]

    Wenxin Jiang, Vishnu Banna, Naveen Vivek, Abhinav Goel, Nicholas Synovic, George K Thiruvathukal, and James C Davis. 2024. Challenges and practices of deep learning model reengineering: A case study on computer vision. Empirical Software Engineering 29, 6 (2024), 142

  39. [39]

    AKM Kamruzzaman Khan. 2024. AI in Finance Disruptive Technologies and Emerging Opportunities. Journal of Artificial Intelligence General science (JAIGS) ISSN: 3006-4023 3, 1 (2024), 155–170

  40. [40]

    Barbara Kitchenham and Pearl Brereton. 2013. A systematic review of systematic review process research in software engineering. Information and software technology 55, 12 (2013), 2049–2075

  41. [41]

    B Kitchenham and S Charters. 2007. Guidelines for performing systematic literature reviews in software engineering (ebse technical report version 2.3, ebse-2007-01). Technical Report. Technical report, Keele University, University of Durham, Keele, United Kingdom. [43] B.A. Kitchenham, T. Dyba, and M. Jorgensen. 2004...

  42. [42]

    Barbara Kitchenham, Lech Madeyski, and David Budgen. 2022. How should software engineering secondary studies include grey material? IEEE Transactions on Software Engineering 49, 2 (2022), 872–882

  43. [43]

    Leanne M. Kmet, Robert C. Lee, and Linda S. Cook. 2004. Standard quality assessment criteria for evaluating primary research papers from a variety of fields. Technical Report HTA Initiative No. 13. Alberta Heritage Foundation for Medical Research, Edmonton, Alberta, Canada. https://ia601308.us.archive.org/33/items/standardqualitya00kmet_0/standardqualitya...

  44. [44]

    Valentina Lenarduzzi, Terese Besker, Davide Taibi, Antonio Martini, and Francesca Arcelli Fontana. 2021. A systematic literature review on technical debt prioritization: Strategies, processes, factors, and tools. Journal of Systems and Software 171 (2021), 110827

  45. [45]

    Xiang Li, Shuo Chen, Xiaolin Hu, and Jian Yang. 2019. Understanding the Disharmony Between Dropout and Batch Normalization by Variance Shift. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2677–2685. doi:10.1109/CVPR.2019.00279

  46. [46]

    Chao Liu, Runfeng Cai, Yiqun Zhou, Xin Chen, Haibo Hu, and Meng Yan. 2024. Understanding the implementation issues when using deep learning frameworks. Information and Software Technology 166 (2024), 107367

  47. [47]

    Jiakun Liu, Qiao Huang, Xin Xia, Emad Shihab, David Lo, and Shanping Li. 2020. Is using deep learning frameworks free? characterizing technical debt in deep learning frameworks. In ACM/IEEE 42nd International Conference on Software Engineering: Software Engineering in Society. 1–10

  48. [48]

    Jiakun Liu, Qiao Huang, Xin Xia, Emad Shihab, David Lo, and Shanping Li. 2021. An exploratory study on the introduction and removal of different types of technical debt in deep learning frameworks. Empirical Software Engineering 26 (2021), 1–36

  49. [49]

    Patruno Liugi. 2019. The ultimate guide to model retraining. https://mlinproduction.com/model-retraining/

  50. [50]

    Francesca Lonetti, Antonia Bertolino, and Felicita Di Giandomenico. 2023. Model-based security testing in IoT systems: A rapid review. Information and Software Technology (2023), 107326

  51. [51]

    Philipp Mayring. 2000. Qualitative content analysis. Forum: Qualitative Social Research 1, 2 (2000), 20. https://www.qualitative-research.net/index.php/fqs/article/view/1089

  52. [52]

    David Moher, Lesley Stewart, and Paul Shekelle. 2015. All in the family: systematic reviews, rapid reviews, scoping reviews, realist reviews, and more. Systematic reviews 4, 1 (2015), 1–2

  53. [53]

    Mohammad Mehdi Morovati, Amin Nikanjam, Foutse Khomh, and Zhen Ming Jiang. 2023. Bugs in machine learning-based systems: a faultload benchmark. Empirical Software Engineering 28, 3 (2023), 62

  54. [54]

    Erica Mourão, Marcos Kalinowski, Leonardo Murta, Emilia Mendes, and Claes Wohlin. 2017. Investigating the use of a hybrid search strategy for systematic reviews. In 2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). IEEE, 193–198

  55. [55]

    Erica Mourão, João Felipe Pimentel, Leonardo Murta, Marcos Kalinowski, Emilia Mendes, and Claes Wohlin. 2020. On the performance of hybrid search strategies for systematic literature reviews in software engineering. Information and software technology 123 (2020), 106294

  56. [56]

    Aiswarya Munappy, Jan Bosch, Helena Holmström Olsson, Anders Arpteg, and Björn Brinne. 2019. Data management challenges for deep learning. In 2019 45th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). IEEE, 140–147

  57. [57]

    Tolga Muratdağı. 2024. IDENTIFYING TECHNICAL DEBT AND TOOLS FOR TECHNICAL DEBT MANAGEMENT IN SOFTWARE DEVELOPMENT. (2024)

  58. [58]

    Nadia Nahar, Shurui Zhou, Grace Lewis, and Christian Kästner. 2022. Collaboration challenges in building ml-enabled systems: Communication, documentation, engineering, and process. In Proceedings of the 44th international conference on software engineering. 413–425

  59. [59]

    Roger Nazir, Alessio Bucaioni, and Patrizio Pelliccione. 2024. Architecting ML-enabled systems: Challenges, best practices, and design decisions. Journal of Systems and Software 207 (2024), 111860

  60. [60]

    Amin Nikanjam and Foutse Khomh. 2021. Design smells in Deep Learning programs: an empirical study. In 2021 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE, 332–342

  61. [61]

    Nikolaos Nikolaidis, Nikolaos Mittas, Apostolos Ampatzoglou, Elvira-Maria Arvanitou, and Alexander Chatzigeorgiou. 2023. Assessing TD Macro-Management: A Nested Modeling Statistical Approach. IEEE Transactions on Software Engineering 49, 4 (2023), 2996–3007

  62. [62]

    David OBrien, Sumon Biswas, Sayem Imtiaz, Rabe Abdalkareem, Emad Shihab, and Hridesh Rajan. 2022. 23 shades of self-admitted technical debt: an empirical study on machine learning software. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 734–746

  63. [63]

    Andrei Paleyes, Raoul-Gabriel Urma, and Neil D Lawrence. 2022. Challenges in deploying ML: a survey of case studies. ACM computing surveys 55, 6 (2022), 1–29

  64. [64]

    Fabio Palomba, Gabriele Bavota, Massimiliano Di Penta, Fausto Fasano, Rocco Oliveto, and Andrea De Lucia. 2018. On the diffuseness and the impact on maintainability of code smells: a large scale empirical investigation. In Proceedings of the 40th International Conference on Software Engineering. 482–482.

  65. [65]

    Federica Pepe, Fiorella Zampetti, Antonio Mastropaolo, Gabriele Bavota, and Massimiliano Di Penta. 2024. A Taxonomy of Self-Admitted Technical Debt in Deep Learning Systems. In 40th IEEE International Conference on Software Maintenance and Evolution (ICSME 2024). arXiv preprint arXiv:2409.11826

  66. [66]

    Jennie Popay, Helen Roberts, Amanda Sowden, Mark Petticrew, Lisa Arai, Mark Rodgers, Nicky Britten, Katrina Roen, Steven Duffy, et al. 2006. Guidance on the conduct of narrative synthesis in systematic reviews. A product from the ESRC methods programme Version 1, 1 (2006), b92

  67. [67]

    Aniket Potdar and Emad Shihab. 2014. An exploratory study on self-admitted technical debt. In International Conference on Software Maintenance and Evolution. IEEE, 91–100

  68. [68]

    Gilberto Recupito, Fabiano Pecorelli, Gemma Catolino, Valentina Lenarduzzi, Davide Taibi, Dario Di Nucci, and Fabio Palomba. 2024. Technical debt in AI-enabled systems: On the prevalence, severity, impact, and management strategies for code and architecture. Journal of Systems and Software (2024), 112151

  69. [69]

    Per Runeson and Martin Höst. 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical software engineering 14 (2009), 131–164

  70. [70]

    Hina Saeeda, Muhammad Ovais Ahamd, and Tomas Gustavsson. 2024. A Multivocal Literature Review on Non-Technical Debt in Software Development: An Insight into Process, Social, People, Organizational, and Culture Debt. e-Informatica Software Engineering Journal 18, 1 (2024), 240101

  71. [71]

    David Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. 2015. Hidden technical debt in machine learning systems. Advances in neural information processing systems 28 (2015)

  72. [72]

    Rishab Sharma, Ramin Shahbazi, Fatemeh H Fard, Zadia Codabux, and Melina Vidoni. 2022. Self-admitted technical debt in R: detection and causes. Automated Software Engineering 29, 2 (2022), 53

  73. [73]

    Karthik Shivashankar and Antonio Martini. 2022. Maintainability Challenges in ML: A Systematic Literature Review. In 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). IEEE, 60–67

  74. [74]

    Emmanuel Iko-Ojo Simon. 2025. Characterising Algorithm Debt in Machine and Deep Learning Systems. In 2025 IEEE/ACM 47th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion). 216–217. doi:10.1109/ICSE-Companion66252.2025.00066

  75. [75]

    Emmanuel Iko-Ojo Simon, Chirath Hettiarachchi, Alex Potanin, Hanna Suominen, and Fatemeh Fard. 2024. Automated Detection of Algorithm Debt in Deep Learning Frameworks: An Empirical Study. In Registered Report Track at the 40th IEEE International Conference on Software Maintenance and Evolution (ICSME 2024), October 6-11. arXiv:2408.10529 [cs.SE]

  76. [76]

    Emmanuel Iko-Ojo Simon, Melina Vidoni, and Fatemeh H Fard. 2023. Algorithm Debt: Challenges and Future Paths. In 2023 IEEE/ACM 2nd International Conference on AI Engineering–Software Engineering for AI (CAIN). IEEE, 90–91

  77. [77]

    Xiaobing Sun, Tianchi Zhou, Gengjie Li, Jiajun Hu, Hui Yang, and Bin Li. 2017. An empirical study on real bugs for machine learning programs. In 2017 24th Asia-Pacific Software Engineering Conference (APSEC). IEEE, 348–357

  78. [78]

    Yiming Tang, Raffi Khatchadourian, Mehdi Bagherzadeh, Rhia Singh, Ajani Stewart, and Anita Raja. 2021. An empirical study of refactorings and technical debt in Machine Learning systems. In 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 238–250

  79. [79]

    Bart Van Oort, Luís Cruz, Maurício Aniche, and Arie Van Deursen. 2021. The prevalence of code smells in machine learning projects. In 2021 IEEE/ACM 1st Workshop on AI Engineering-Software Engineering for AI (WAIN). IEEE, 1–8

  80. [80]

    Tatiana Castro Vélez, Raffi Khatchadourian, Mehdi Bagherzadeh, and Anita Raja. 2022. Challenges in migrating imperative deep learning programs to graph execution: an empirical study. In Proceedings of the 19th International Conference on Mining Software Repositories. 469–481

Showing first 80 references.