pith. machine review for the scientific record.

arxiv: 2604.16371 · v1 · submitted 2026-03-23 · 💻 cs.SE

Recognition: 1 theorem link · Lean Theorem

A Systematic Review of MLOps Tools: Tool Adoption, Lifecycle Coverage, and Critical Insights


Pith reviewed 2026-05-15 00:40 UTC · model grok-4.3

classification 💻 cs.SE
keywords: MLOps · tool adoption · lifecycle coverage · systematic review · orchestration frameworks · experiment tracking · data versioning · cloud platforms

The pith

No single MLOps tool covers the full development lifecycle, so practitioners combine multiple specialized tools into pipelines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This systematic review examines academic papers on MLOps tools to map their coverage across the machine learning lifecycle stages. It finds that orchestration frameworks, data versioning systems, experiment tracking platforms, and managed cloud services appear most often in reported workflows. The review shows that each tool addresses only a subset of needed functions, leading teams to stitch together combinations that introduce their own integration challenges. A sympathetic reader would care because the absence of a complete solution affects how quickly organizations can move models from experiment to reliable production use.

Core claim

The paper establishes that academic literature on MLOps consistently reports the use of multiple tools rather than any single integrated platform. Tools are mapped to lifecycle components such as data preparation, model training, deployment, monitoring, and governance. The most frequently cited categories are orchestration frameworks for pipeline management, data versioning for reproducibility, experiment tracking for comparing runs, and managed cloud platforms for scalable execution. Reported benefits include improved reproducibility and collaboration, while limitations center on interoperability gaps and the overhead of maintaining several distinct systems.

What carries the argument

The systematic mapping of individual MLOps tools onto the stages of the machine learning lifecycle, which reveals partial coverage and the resulting need for tool combinations.
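
To make the mapping concrete, the review's central move can be pictured as a relation from tools to lifecycle stages, from which partial coverage falls out as a set computation. A minimal Python sketch follows; the stage names track the lifecycle components named above, but the tool-to-stage assignments are illustrative placeholders, not the paper's extracted data.

    # Sketch of the lifecycle-mapping idea. Stage names follow the paper's
    # lifecycle components; the coverage map is an invented example in the
    # spirit of the Figure 2 heatmap, not the paper's data.
    LIFECYCLE = {"data_preparation", "model_training", "deployment",
                 "monitoring", "governance"}

    coverage = {
        "orchestration_framework": {"data_preparation", "model_training", "deployment"},
        "experiment_tracker": {"model_training"},
        "data_versioning": {"data_preparation"},
        "managed_cloud_platform": {"model_training", "deployment", "monitoring"},
    }

    # Every tool leaves some stage uncovered: the "no single tool covers
    # the entire lifecycle" observation, restated as a set difference.
    for tool, stages in coverage.items():
        print(tool, "misses", sorted(LIFECYCLE - stages))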

If this is right

  • Teams building MLOps pipelines must allocate time and expertise to integrate tools across stages instead of relying on one vendor solution.
  • Interoperability standards between tools become a practical requirement for reducing maintenance overhead; a sketch of such an interface follows this list.
  • Tool selection decisions should prioritize coverage gaps identified in the lifecycle mapping rather than feature lists alone.
  • Future tool development is likely to focus on filling the uncovered lifecycle stages or improving seamless hand-offs between existing components.
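
One hedged illustration of what such an interoperability seam might look like, assuming pipeline code programs against a shared interface rather than a vendor SDK. The ExperimentTracker interface, PrintTracker class, and train function below are invented for this sketch; the paper proposes no such standard.

    # Hypothetical plug-in interface for one lifecycle stage (experiment
    # tracking); invented for illustration, not proposed by the paper.
    from typing import Any, Dict, Protocol

    class ExperimentTracker(Protocol):
        def log_params(self, params: Dict[str, Any]) -> None: ...
        def log_metric(self, name: str, value: float) -> None: ...

    class PrintTracker:
        """Trivial stand-in; a real backend would conform the same way."""
        def log_params(self, params: Dict[str, Any]) -> None:
            print("params:", params)

        def log_metric(self, name: str, value: float) -> None:
            print(f"metric {name} = {value}")

    def train(tracker: ExperimentTracker) -> None:
        # The pipeline depends only on the interface, so trackers can be
        # swapped without custom glue code.
        tracker.log_params({"learning_rate": 0.01})
        tracker.log_metric("loss", 0.42)

    train(PrintTracker())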

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Organizations may benefit from internal audits that track which lifecycle stages remain unsupported by their current tool stack; a minimal audit sketch follows this list.
  • The pattern of tool combination could drive demand for open interfaces that let new specialized tools plug into existing pipelines without custom glue code.
  • If adoption patterns shift toward fewer but broader platforms, the review's findings would need updating through repeated literature scans.
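
As a minimal sketch of the internal-audit idea in the first bullet above, assuming a team can list the lifecycle stages each of its tools covers: union the covered stages and report what remains. The tool names and stage assignments are illustrative only, not findings of the paper.

    # Hypothetical stack audit: which lifecycle stages are unsupported?
    # Tool-to-stage assignments are illustrative, not drawn from the paper.
    from typing import Dict, Set

    LIFECYCLE: Set[str] = {"data_preparation", "model_training", "deployment",
                           "monitoring", "governance"}

    def uncovered_stages(stack: Dict[str, Set[str]]) -> Set[str]:
        """Return the lifecycle stages no tool in the stack covers."""
        covered = set().union(*stack.values()) if stack else set()
        return LIFECYCLE - covered

    stack = {
        "airflow": {"data_preparation", "model_training", "deployment"},
        "mlflow": {"model_training"},
        "dvc": {"data_preparation"},
    }
    print(sorted(uncovered_stages(stack)))  # ['governance', 'monitoring']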

Load-bearing premise

The academic papers reviewed accurately reflect the tools and challenges that real practitioners encounter in production MLOps work.

What would settle it

A large-scale practitioner survey or industry benchmark that shows most production pipelines rely on a single integrated platform rather than combinations of separate tools.

Figures

Figures reproduced from arXiv: 2604.16371 by Ilias Gerostathopoulos, Keerthiga Rajenthiram, Zakkarija Micallef.

Figure 1. Top 10 MLOps tools ranked by number of mentions.
Figure 2. Heatmap of MLOps tools mapped to their corresponding pipeline components, adapted from Najafabadi et al. [30].
original abstract

Machine Learning Operations (MLOps) has become increasingly critical as more organisations move ML models into production. However, the growing landscape of MLOps solutions has introduced complexity for practitioners trying to select appropriate tools. To investigate how and why these tools are adopted in practice, this paper conducts a systematic review of the academic literature focused on MLOps tools. We map tools to MLOps lifecycle components to reveal their function, scope, and the challenges they are designed to address. We identify usage trends and synthesise reported benefits and limitations. The most commonly used components, according to the findings, are orchestration frameworks, data versioning, experiment tracking, and managed cloud platforms. No single tool covers the entire lifecycle, so researchers often combine multiple tools to build complete pipelines. This highlights the importance of interoperability across MLOps tools in real-world MLOps pipelines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents a systematic review of academic literature on MLOps tools. It maps identified tools to MLOps lifecycle components, extracts usage trends, and synthesizes reported benefits and limitations. The central claims are that no single tool covers the full lifecycle (leading practitioners to combine multiple tools) and that the most commonly used components are orchestration frameworks, data versioning, experiment tracking, and managed cloud platforms.

Significance. If the synthesis is methodologically sound and representative, the review would provide a practical reference for tool selection and interoperability needs in MLOps pipelines. It could help researchers and practitioners identify coverage gaps and common integration patterns across the lifecycle.

major comments (3)
  1. [Abstract / Methods] The abstract (and methods section) provides no details on search strategy, databases, inclusion/exclusion criteria, screening process, or total number of papers reviewed. Without these, the extracted usage trends and the claim that specific components are 'most commonly used' cannot be evaluated for completeness or bias.
  2. [Results / Discussion] The central claim that academic literature reflects real-world tool adoption and lifecycle coverage rests on the unexamined assumption that papers are representative of production practice. Academic prototypes often favor research-oriented tools, creating selection bias that undermines the interoperability conclusion and the mapping of 'most common' components.
  3. [Mapping and Synthesis] The paper does not report how tools were classified into lifecycle components or how conflicts in reported benefits/limitations were resolved. This makes the synthesis of challenges and the 'no single tool covers the entire lifecycle' finding difficult to reproduce or verify.
minor comments (2)
  1. [Abstract] The abstract states findings without quantifying them (e.g., exact counts or percentages of papers mentioning each component). Adding these numbers would strengthen the usage-trend claims.
  2. [Methods] No mention of quality assessment of included studies or risk-of-bias evaluation, which is standard in systematic reviews and would help readers gauge the reliability of synthesized limitations.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that the methods section requires substantial expansion for transparency and reproducibility. We will revise the manuscript to include detailed search strategy, classification procedures, and a limitations discussion on academic versus industry representativeness while preserving the core findings.

point-by-point responses
  1. Referee: [Abstract / Methods] The abstract (and methods section) provides no details on search strategy, databases, inclusion/exclusion criteria, screening process, or total number of papers reviewed. Without these, the extracted usage trends and the claim that specific components are 'most commonly used' cannot be evaluated for completeness or bias.

    Authors: We agree that the current version lacks these details. In the revised manuscript we will add a full methods section describing the search strategy (databases including IEEE Xplore, ACM, Scopus, and Google Scholar), inclusion/exclusion criteria, PRISMA-based screening process, and the exact number of papers identified, screened, and included. This will allow readers to evaluate completeness and bias in the usage trends and identification of the most common components. revision: yes

  2. Referee: [Results / Discussion] The central claim that academic literature reflects real-world tool adoption and lifecycle coverage rests on the unexamined assumption that papers are representative of production practice. Academic prototypes often favor research-oriented tools, creating selection bias that undermines the interoperability conclusion and the mapping of 'most common' components.

    Authors: We acknowledge the risk of selection bias. While many included papers report production deployments, academic literature may over-represent research-oriented tools. We will add an explicit limitations subsection discussing this bias and its implications for generalising the 'most common' components and interoperability conclusions. We will qualify the claims but retain the core observation that academic reports show no single tool covering the full lifecycle. revision: partial

  3. Referee: [Mapping and Synthesis] The paper does not report how tools were classified into lifecycle components or how conflicts in reported benefits/limitations were resolved. This makes the synthesis of challenges and the 'no single tool covers the entire lifecycle' finding difficult to reproduce or verify.

    Authors: We agree that the classification and conflict-resolution procedures must be documented. The revised methods section will describe the lifecycle framework used for mapping (based on standard MLOps stages from the literature) and the process for resolving conflicts in benefits/limitations (author consensus on the most frequently reported aspects). This will make the synthesis, including the no-single-tool finding, reproducible. revision: yes

Circularity Check

0 steps flagged

No circularity: literature synthesis aggregates external sources without self-referential reduction

full rationale

The paper is a systematic review that maps MLOps tools to lifecycle components by aggregating usage trends, benefits, and limitations reported in external academic literature. No equations, fitted parameters, derivations, or self-citations appear in the provided text. The central claim (no single tool covers the full lifecycle; common components are orchestration, data versioning, experiment tracking, and managed cloud platforms) is presented as a synthesis of reviewed papers rather than a quantity defined by the present work's own inputs. This satisfies the default expectation of no significant circularity for non-derivational papers.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that the reviewed academic papers accurately capture real-world tool usage and that standard systematic review practices were applied to select and synthesize them. No free parameters or invented entities are introduced.

axioms (1)
  • domain assumption: Standard systematic review methodology was followed for literature selection, mapping, and synthesis of benefits and limitations.
    Invoked by the statement that the paper conducts a systematic review focused on MLOps tools and maps them to lifecycle components.

pith-pipeline@v0.9.0 · 5460 in / 1212 out tokens · 31689 ms · 2026-05-15T00:40:25.765508+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.


Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages

  [1] [n. d.]. The Winding Road to Better Machine Learning Infrastructure Through Tensorflow Extended and Kubeflow. https://engineering.atspotify.com/2019/12/the-winding-road-to-better-machine-learning-infrastructure-through-tensorflow-extended-and-kubeflow
  [2] William Inouye Almeida. 2023. Building an Automated MLOps Pipeline and Recommending an Open-Source Stack to Deploy a Machine Learning Application. Master's thesis. Universidade do Porto (Portugal).
  [3] Apostolos Ampatzoglou, Stamatia Bibi, Paris Avgeriou, Marijn Verbeek, and Alexander Chatzigeorgiou. 2019. Identifying, categorizing and mitigating threats to validity in software engineering secondary studies. Information and Software Technology 106 (2019), 201–230. doi:10.1016/j.infsof.2018.10.006
  [4] Vidushi Arora. 2024. Exploring real-world challenges in MLOps implementation: a case study approach to design effective data pipelines. (2024).
  [5] Michal Bacigál. 2024. Design and Implementation of Machine Learning Operations. (2024).
  [6] Rahul Bagai, Ankit Masrani, Piyush Ranjan, and Madhavi Najana. 2024. Implementing Continuous Integration and Deployment (CI/CD) for Machine Learning Models on AWS. International Journal of Global Innovations and Solutions (IJGIS) (2024). doi:10.21428/e90189c8.9cb39c55
  [7] Lisana Berberi, Valentin Kozlov, Giang Nguyen, Judith Sáinz-Pardo Díaz, Amanda Calatrava, Germán Moltó, Viet Tran, and Álvaro López García. 2025. Machine learning operations landscape: platforms and tools. Artificial Intelligence Review 58, 6 (March 2025), 167. doi:10.1007/s10462-025-11164-3
  [8] Ralph Bergmann, Felix Theusch, Paul Heisterkamp, and Narek Grigoryan. 2024. Comparative Analysis of Open-Source ML Pipeline Orchestration Platforms. (2024).
  [9] Anas Bodor, Meriem Hnida, and Daoudi Najima. 2023. From Development to Deployment: An Approach to MLOps Monitoring for Machine Learning Model Operationalization. In 2023 14th International Conference on Intelligent Systems: Theories and Applications (SITA). 1–7. doi:10.1109/SITA60746.2023.10373733
  [10] Anas Bodor, Meriem Hnida, and Daoudi Najima. 2023. MLOps: Overview of Current State and Future Directions. In Innovations in Smart Cities Applications Volume 6. Springer, Cham, 156–165. doi:10.1007/978-3-031-26852-6_14
  [11] Antonio M. Burgueño-Romero, Cristóbal Barba-González, and José F. Aldana-Montes. 2025. Big Data-driven MLOps workflow for annual high-resolution land cover classification models. Future Generation Computer Systems 163 (2025), 107499. doi:10.1016/j.future.2024.107499
  [12] Ji-hyun Cha, Heung-gyun Jeong, Seung-woo Han, Dong-chul Kim, Jung-hun Oh, Seok-hee Hwang, and Byeong-ju Park. 2023. Development of MLOps Platform Based on Power Source Analysis for Considering Manufacturing Environment Changes in Real-Time Processes. In Human-Computer Interaction. Springer, Cham, 224–236. doi:10.1007/978-3-031-35572-1_15
  [13] Swati Choudhary. 2021. Kubernetes-Based Architecture For An On-premises Machine Learning Platform. (2021).
  [14] Thomas Davenport and Katie Malone. 2021. Deployment as a Critical Business Data Science Discipline. Harvard Data Science Review 3, 1 (2021).
  [15] Daniel Deutsch. 2023. Machine learning operations – domain analysis, reference architecture, and example implementation.
  [16] Christof Ebert, Gorka Gallardo, Josune Hernantes, and Nicolas Serrano. 2016. DevOps. IEEE Software 33, 3 (2016), 94–100. doi:10.1109/MS.2016.68
  [17] Kanwarpartap Singh Gill, Vatsala Anand, Rahul Chauhan, Ruchira Rawat, and Pao-Ann Hsiung. 2023. Utilization of Kubeflow for Deploying Machine Learning Models Across Several Cloud Providers. In 2023 3rd International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON). 1–7. doi:10.1109/SMARTGENCON60755.2023.10442069
  [18] Google Cloud Tech. 2020. Introduction to Kubeflow. https://www.youtube.com/watch?v=cTZArDgbIWw
  [19] Kim Hazelwood, Sarah Bird, David Brooks, Soumith Chintala, Utku Diril, Dmytro Dzhulgakov, Mohamed Fawzy, Bill Jia, Yangqing Jia, Aditya Kalro, James Law, Kevin Lee, Jason Lu, Pieter Noordhuis, Misha Smelyanskiy, Liang Xiong, and Xiaodong Wang. 2018. Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective. In 2018 IEEE International Sy...
  [20] Hannes Jämtner and Stefan Brynielsson. 2022. An Empirical Study on AI Workflow Automation for Positioning. (2022).
  [21] Dominik Kreuzberger, Niklas Kühl, and Sebastian Hirschl. 2022. Machine Learning Operations (MLOps): Overview, Definition, and Architecture. doi:10.48550/arXiv.2205.02302
  [22] Anders Köhler. 2022. Evaluation of MLOps Tools for Kubernetes: A Rudimentary Comparison Between Open Source Kubeflow, Pachyderm and Polyaxon.
  [23] Yumo Luo. 2023. An Open-Source and Portable MLOps Pipeline for Continuous Training and Continuous Deployment. (2023).
  [24] Giulio Mallardi, Fabio Calefato, Luigi Quaranta, and Filippo Lanubile. 2024. An MLOps Approach for Deploying Machine Learning Models in Healthcare Systems. In 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 6832–6837.
  [25] Andres Felipe Varon Maya. [n. d.]. The State of MLOps. ([n. d.]).
  [26] Rick Merritt. 2020. What is MLOps? https://blogs.nvidia.com/blog/what-is-mlops/
  [27] Zakkarija Micallef, Keerthiga Rajenthiram, and Ilias Gerostathopoulos. 2025. Online appendix of the paper “A Systematic Review of MLOps Tools: Tool Adoption, Lifecycle Coverage, and Critical Insights”. doi:10.5281/zenodo.18319046
  [28] Widad El Moutaouakal and Karim Baïna. 2023. Comparative Experimentation of MLOps Power on Microsoft Azure, Amazon Web Services, and Google Cloud Platform. In 2023 IEEE 6th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech). 1–8. doi:10.1109/CloudTech58737.2023.10366138
  [29] Óscar A. Méndez, Jorge Camargo, and Hector Florez. 2025. Machine Learning Operations Applied to Development and Model Provisioning. In Applied Informatics. Vol. 2236. Springer Nature Switzerland, 73–88. doi:10.1007/978-3-031-75144-8_6
  [30] Faezeh Amou Najafabadi, Justus Bogner, Ilias Gerostathopoulos, and Patricia Lago. 2024. An Analysis of MLOps Architectures: A Systematic Mapping Study. Vol. 14889. 69–85. doi:10.1007/978-3-031-70797-1_5
  [31] Moses Openja, Forough Majidi, Foutse Khomh, Bhagya Chembakottu, and Heng Li. 2022. Studying the Practices of Deploying Machine Learning Projects on Docker. In Proceedings of the 26th International Conference on Evaluation and Assessment in Software Engineering. ACM, 190–200. doi:10.1145/3530019.3530039
  [32] Alessandro Palladini. 2022. Streamline machine learning projects to production using cutting-edge MLOps best practices on AWS. Ph.D. Dissertation. Politecnico di Torino.
  [33] Nataša Radaković, Ivana Šenk, and Nina Romanić. 2023. A Machine Learning Pipeline Implementation Using MLOps and GitOps Principles. In 19th International Scientific Conference on Industrial Systems. Faculty of Technical Sciences, 94–99. doi:10.24867/IS-2023-T2.1-6_08141
  [34] Katja-Mari Ratilainen. 2023. Adopting Machine Learning Pipeline in Existing Environment. (2023).
  [35] Gilberto Recupito, Fabiano Pecorelli, Gemma Catolino, Sergio Moreschini, Dario Di Nucci, Fabio Palomba, and Damian A. Tamburri. 2022. A Multivocal Literature Review of MLOps Tools and Features. In 2022 48th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). 84–91. doi:10.1109/SEAA56994.2022.00021
  [36] Philipp Ruf, Manav Madan, Christoph Reich, and Djaffar Ould-Abdeslam. 2021. Demystifying MLOps and Presenting a Recipe for the Selection of Open-Source Tools. Applied Sciences 11, 19 (2021), 8861. doi:10.3390/app11198861
  [37] Enrico Salvucci. 2021. MLOps - Standardizing the Machine Learning Workflow. (2021).
  [38] Luca Scotton. 2021. Engineering framework for scalable machine learning operations. (2021).
  [39] Ladson Gomes Silva. 2022. A Review on How Machine Learning Operations (MLOps) are Changing the Landscape of Machine Learning Development for Production. (2022).
  [40] Afonso Rafael Carvalho Sousa. 2022. Orchestrator selection process for cloud-native machine learning experimentation. (2022).
  [41] Jasper Stone, Raj Patel, Farbod Ghiasi, Sudip Mittal, and Shahram Rahimi. 2025. Navigating MLOps: Insights into Maturity, Lifecycle, Tools, and Careers. doi:10.48550/arXiv.2503.15577. arXiv:2503.15577 [cs]
  [42] Georgios Symeonidis, Evangelos Nerantzis, Apostolos Kazakis, and George A. Papakostas. 2022. MLOps - Definitions, Tools and Challenges. In 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC). 0453–0460. doi:10.1109/CCWC54503.2022.9720902
  [43] Matteo Testi. 2024. Machine Learning Operations (MLOps) in Healthcare. (2024).
  [44] Matteo Testi, Matteo Ballabio, Emanuele Frontoni, Giulio Iannello, Sara Moccia, Paolo Soda, and Gennaro Vessio. 2022. MLOps: A Taxonomy and a Methodology. IEEE Access 10 (2022), 63606–63618. doi:10.1109/ACCESS.2022.3181730
  [45] T. Vishwambari and Sonali Agrawal. 2023. Integration of Open-Source Machine Learning Operations Tools into a Single Framework. In 2023 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS). 335–340. doi:10.1109/ICCCIS60361.2023.10425558
  [46] Samar Wazir, Gautam Siddharth Kashyap, and Parag Saxena. 2023. MLOps: A Review. doi:10.48550/arXiv.2308.10908
  [47] Claes Wohlin. 2014. Guidelines for snowballing in systematic literature studies and a replication in software engineering. In Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering. ACM, 1–10. doi:10.1145/2601248.2601268
  [48] Ting Chun Yau. 2023. Investigate the challenges and opportunities of MLOps.
  [49] Mohammad Zarour, Hamza Alzabut, and Khalid T. Al-Sarayreh. 2025. MLOps best practices, challenges and maturity models: A systematic literature review. Information and Software Technology 183 (2025), 107733. doi:10.1016/j.infsof.2025.107733
  [50] Yue Zhou, Yue Yu, and Bo Ding. 2020. Towards MLOps: A case study of ML pipeline platform. In 2020 International Conference on Artificial Intelligence and Computer Engineering (ICAICE). IEEE, 494–500.