arxiv: 2604.11445 · v1 · submitted 2026-04-13 · 💻 cs.DC

OpenDT: Exploring Datacenter Performance and Sustainability with a Self-Calibrating Digital Twin

Pith reviewed 2026-05-10 15:46 UTC · model grok-4.3

classification 💻 cs.DC
keywords: datacenter, digital twin, self-calibration, telemetry, discrete-event simulation, performance modeling, energy efficiency, open source

The pith

OpenDT shows that a self-calibrating digital twin can mirror datacenter performance and energy use with a mean absolute percentage error (MAPE) of 4.39% in trace-driven experiments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Datacenters require ongoing monitoring to balance performance demands with energy costs and reliability, yet no open digital twin has previously been shown to operate continuously with live data. The paper builds OpenDT to close this gap by ingesting real-time telemetry, running a discrete-event simulation that recalibrates itself, and returning human-approved adjustments. Experiments using traces demonstrate that the system reproduces earlier peer-reviewed findings while adding performance and energy-efficiency results, and that the recalibration step lowers prediction error relative to prior published work. A reader would care because accurate, open twins could support safer live decisions in facilities that underpin much of modern computing. The authors release the code to allow others to test and extend the approach.

Core claim

OpenDT implements a continuous cycle of live telemetry collection from the physical datacenter, self-calibrating discrete-event simulation, and SLO-aware feedback approved by a human operator. In trace-driven prototype tests focused on the first two stages, the system reproduces results from earlier peer-reviewed studies while extending the analysis to include performance and energy-efficiency metrics. The central measured outcome is that online recalibration improves digital-twinning accuracy to a mean absolute percentage error of 4.39 percent, compared with 7.86 percent reported in the referenced prior work.
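For readers unfamiliar with the metric, the sketch below shows how MAPE is conventionally computed and what the two reported figures imply in relative terms. The function name `mape` and the sample values are illustrative assumptions; they are not code or data from the paper.

```python
from typing import Sequence


def mape(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(m == 0 for m in measured):
        raise ValueError("MAPE is undefined when a measured value is zero")
    total = sum(abs((m - p) / m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Hypothetical power-draw samples in watts (illustrative only).
measured = [200.0, 250.0, 400.0]
predicted = [190.0, 260.0, 380.0]
error = mape(measured, predicted)

# Relative error reduction implied by the two reported MAPE figures.
relative_reduction = (7.86 - 4.39) / 7.86  # roughly 0.44
```

Read this way, moving from 7.86 percent to 4.39 percent is a relative error reduction of roughly 44 percent, assuming both figures are computed over comparable traces.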

What carries the argument

Self-calibrating discrete-event simulation that continuously updates its parameters from live telemetry to keep the model aligned with the running physical system.
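The summary does not describe the paper's calibration algorithm, so the following is only a toy sketch of the general idea: a model parameter is nudged toward the value that the latest telemetry sample implies. The linear power model, the parameter names (`idle_w`, `max_w`, `rate`), and the synthetic telemetry are all assumptions made for illustration, not OpenDT's implementation.

```python
from typing import List, Tuple


def simulate_power(util: float, idle_w: float, max_w: float) -> float:
    """Assumed linear power model: idle power plus a utilization-scaled range."""
    return idle_w + (max_w - idle_w) * util


def run(telemetry: List[Tuple[float, float]], idle_w: float, max_w: float,
        calibrate: bool, rate: float = 0.5) -> float:
    """Replay (utilization, measured watts) samples and return MAPE in percent."""
    errors = []
    for util, measured_w in telemetry:
        predicted_w = simulate_power(util, idle_w, max_w)
        errors.append(abs((measured_w - predicted_w) / measured_w))
        if calibrate and util > 0:
            # Value of max_w that would have reproduced this sample exactly.
            implied_max_w = idle_w + (measured_w - idle_w) / util
            # Move part of the way toward it; rate < 1 damps the update.
            max_w += rate * (implied_max_w - max_w)
    return 100.0 * sum(errors) / len(errors)


# Synthetic ground truth: idle 100 W, peak 300 W. The model starts at 240 W.
utils = [0.2, 0.5, 0.8, 0.6, 0.9, 0.4, 0.7, 0.3]
telemetry = [(u, 100.0 + 200.0 * u) for u in utils]

static_mape = run(telemetry, idle_w=100.0, max_w=240.0, calibrate=False)
calibrated_mape = run(telemetry, idle_w=100.0, max_w=240.0, calibrate=True)
```

On this noise-free synthetic trace the calibrated run ends with a lower MAPE than the static one. Real telemetry is noisy and delayed, which is exactly where the damping factor and the stability of such a loop would need scrutiny.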

If this is right

  • OpenDT reproduces peer-reviewed datacenter experiments and extends them with added performance and energy-efficiency results.
  • Online recalibration of the simulation lowers digital-twinning error to 4.39 percent MAPE, against 7.86 percent in the referenced prior work.
  • The system architecture supports a human-in-the-loop cycle for SLO-aware operational changes.
  • The implementation follows FAIR and FOSS principles and is released publicly for community use.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The self-calibration loop could be tested for stability when the feedback stage is fully closed in production settings.
  • The same telemetry-plus-simulation pattern may apply to monitoring other large computing installations beyond datacenters.
  • Extending the model to include additional sustainability indicators such as carbon intensity would be a direct next step.

Load-bearing premise

Live telemetry data is rich and timely enough for the simulation to recalibrate itself accurately without creating modeling errors, delays, or instability in any future feedback to the physical datacenter.

What would settle it

Deploy OpenDT on a datacenter whose telemetry is sparse or delayed and measure whether the mean absolute percentage error stays at or below 5 percent or whether the simulated and measured performance begin to diverge over time.
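One way such a test could be scored is to compute MAPE over consecutive time windows and report the first window that exceeds the 5 percent threshold. The sketch below assumes non-overlapping windows and uses made-up values; the helper names `windowed_mape` and `first_breach` are hypothetical.

```python
from typing import List, Optional, Sequence


def windowed_mape(measured: Sequence[float], predicted: Sequence[float],
                  window: int) -> List[float]:
    """MAPE in percent over consecutive, non-overlapping windows."""
    results = []
    for start in range(0, len(measured) - window + 1, window):
        m = measured[start:start + window]
        p = predicted[start:start + window]
        total = sum(abs((a - b) / a) for a, b in zip(m, p))
        results.append(100.0 * total / window)
    return results


def first_breach(window_errors: Sequence[float],
                 threshold_pct: float = 5.0) -> Optional[int]:
    """Index of the first window whose MAPE exceeds the threshold, if any."""
    for index, err in enumerate(window_errors):
        if err > threshold_pct:
            return index
    return None


# Illustrative drift: measurements stay flat while predictions creep upward.
measured = [100.0] * 12
predicted = [100.0 + 1.0 * i for i in range(12)]

errors = windowed_mape(measured, predicted, window=4)
breach = first_breach(errors)
```

A single aggregate MAPE can hide this kind of slow divergence, which is why a windowed view is a reasonable complement to the headline figure.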

Figures

Figures reproduced from arXiv: 2604.11445 by Alexandru Iosup, Houcen Liu, Jules van der Toorn, Radu Nicolae, Stavriana Kraniti.

Figure 1: High-level overview of datacenter digital twinning: …
Figure 2: High-level overview of OpenDT.
Adjacent requirements text: (FR3) Simulator real-time re-calibration: OpenDT should re-calibrate predictions in real-time, based on quantified differences between predictions and real-world measurements. (NFR1) Accurate, ground-truth adjusted predictions: OpenDT should stay within 10% error rate (community-accepted [22, 23, 28, 30]), at ≥90% of the operational time, with dynamic simulation re-calibratio…
Figure 3: Synchronization between simulator (service …
Figure 4: Section 3.3 experiment, adapted and re-run from …
Figure 5: Operational metrics for the compute cluster over …
Figure 6: Evolution of error rate in power-draw estimation of …
Original abstract

Datacenters are the backbone of our digital society, but raise numerous operational challenges. We envision digital twins becoming primary instruments in datacenter operations, continuously and autonomously helping with major operational decisions and with adapting ICT infrastructure, live, with a human-in-the-loop. Although fields such as aviation and autonomous driving successfully employ digital twins, an open-source digital twin for datacenters has not been demonstrated to the community. Addressing this challenge, we design, implement, and experiment using OpenDT, an Open-source, Digital Twin for monitoring and operating datacenters through a continuous integration cycle that includes: (1) live and continuous telemetry data; (2) discrete-event simulation using live telemetry from the physical ICT, with self-calibration; and (3) SLO-aware and human-approved feedback to physical ICT. Through trace-driven experiments with a prototype mainly covering stages 1 and 2 of the cycle, we show that (i) OpenDT can be used to reproduce peer-reviewed experiments and extend the analysis with performance and energy-efficiency results; (ii) OpenDT's online re-calibration can increase digital-twinning accuracy, quantified to a MAPE of 4.39% vs. 7.86% in peer-reviewed work. OpenDT adheres to FAIR/FOSS principles and is available at: https://github.com/atlarge-research/opendt/tree/hcp.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents OpenDT, an open-source digital twin for datacenters that implements a continuous integration cycle consisting of (1) live telemetry ingestion, (2) discrete-event simulation with online self-calibration, and (3) SLO-aware human-approved actuation back to the physical ICT infrastructure. Through trace-driven prototype experiments that primarily exercise stages 1 and 2, the authors show that OpenDT reproduces prior peer-reviewed datacenter experiments while extending them with performance and energy-efficiency results, and that its re-calibration reduces mean absolute percentage error (MAPE) from 7.86% in the baseline to 4.39%. The implementation is released on GitHub under FAIR/FOSS principles.

Significance. If the self-calibration and closed-loop claims hold under live conditions, the work would be significant as the first openly available digital-twin platform for datacenters, filling a gap where such tools exist in aviation and autonomous driving but have not been demonstrated for ICT infrastructure. The reproduction of external baselines plus the GitHub release provide concrete reproducibility value that could accelerate research on datacenter performance and sustainability.

major comments (2)
  1. Abstract and evaluation description: the central vision is a 'continuous integration cycle' that includes stage 3 (SLO-aware and human-approved feedback to physical ICT). However, the reported experiments are explicitly 'trace-driven' and 'mainly covering stages 1 and 2'; no results exercise the closed-loop path under live telemetry noise, timing jitter, or actuator delays. This leaves the stability of the self-calibration procedure and the overall claim of autonomous operation with human-in-the-loop untested.
  2. Evaluation of accuracy improvement: the MAPE reduction to 4.39% versus 7.86% is presented as a key result, yet the manuscript does not report statistical significance, confidence intervals, or sensitivity to trace selection and post-hoc parameter choices. Without these, it is unclear whether the improvement is robust or could be an artifact of the specific traces and calibration tuning.
minor comments (2)
  1. The abstract states that OpenDT 'can be used to reproduce peer-reviewed experiments and extend the analysis,' but the manuscript should include an explicit table or section listing which prior experiments were reproduced, the exact metrics matched, and the extensions added, to allow readers to verify fidelity.
  2. Notation and terminology around 'self-calibration' and 'online re-calibration' should be defined once with a clear algorithmic outline (inputs, update rule, convergence criterion) rather than scattered across the text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and insightful report. We address the major comments point by point in the following responses and outline the changes we will make in the revised manuscript.

Point-by-point responses
  1. Referee: Abstract and evaluation description: the central vision is a 'continuous integration cycle' that includes stage 3 (SLO-aware and human-approved feedback to physical ICT). However, the reported experiments are explicitly 'trace-driven' and 'mainly covering stages 1 and 2'; no results exercise the closed-loop path under live telemetry noise, timing jitter, or actuator delays. This leaves the stability of the self-calibration procedure and the overall claim of autonomous operation with human-in-the-loop untested.

    Authors: We agree that the current prototype experiments are trace-driven and focus primarily on stages 1 and 2, as already noted in the manuscript. The full closed-loop actuation in stage 3 under live conditions is part of the long-term vision but has not been evaluated here, owing to the practical and safety constraints of deploying human-approved actuators in a live datacenter setting. We will revise the abstract and evaluation sections to more explicitly delimit the scope of the presented results, state that live closed-loop testing is future work, and note the untested aspects of stability under real telemetry noise and delays as a limitation.

    Revision: yes

  2. Referee: Evaluation of accuracy improvement: the MAPE reduction to 4.39% versus 7.86% is presented as a key result, yet the manuscript does not report statistical significance, confidence intervals, or sensitivity to trace selection and post-hoc parameter choices. Without these, it is unclear whether the improvement is robust or could be an artifact of the specific traces and calibration tuning.

    Authors: The MAPE values are computed directly from the traces used to reproduce the prior peer-reviewed experiments. We acknowledge that statistical significance, confidence intervals, and sensitivity analyses were not reported. In the revised manuscript we will add bootstrap confidence intervals for the MAPE figures and include sensitivity checks on trace subsets and calibration parameter ranges to assess robustness.

    Revision: yes
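As an illustration of what a bootstrap confidence interval for MAPE could look like, the sketch below resamples per-interval absolute percentage errors with replacement and takes percentiles of the resampled means. The error values, the function name `bootstrap_mape_ci`, and the choice of a plain percentile bootstrap are assumptions; errors from a time series are usually autocorrelated, so a block bootstrap may be the more defensible choice in practice.

```python
import random
from typing import Sequence, Tuple


def bootstrap_mape_ci(ape_pct: Sequence[float], n_resamples: int = 2000,
                      confidence: float = 0.95, seed: int = 0) -> Tuple[float, float]:
    """Percentile-bootstrap interval for the mean of absolute percentage errors."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(ape_pct)
    means = sorted(
        sum(rng.choices(ape_pct, k=n)) / n for _ in range(n_resamples)
    )
    tail = (1.0 - confidence) / 2.0
    lo_idx = int(tail * n_resamples)
    hi_idx = int((1.0 - tail) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical per-interval absolute percentage errors (not from the paper).
ape_pct = [3.1, 5.2, 4.0, 6.3, 2.8, 4.9, 3.7, 5.5, 4.4, 4.1]

point = sum(ape_pct) / len(ape_pct)  # the MAPE point estimate
low, high = bootstrap_mape_ci(ape_pct)
```

If the intervals for the recalibrated and baseline runs overlapped substantially, the headline improvement would be harder to defend; if they were well separated, the referee's concern would be largely answered.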

Circularity Check

0 steps flagged

No significant circularity; central claims rest on prototype experiments against external baselines

The paper's key results derive from trace-driven experiments with an implemented prototype that reproduces peer-reviewed baselines and reports improved MAPE (4.39% vs. 7.86%) via online re-calibration. No load-bearing equations, self-definitions, or fitted parameters are shown to reduce by construction to the same inputs; the evaluation uses external comparisons and implementation details rather than renaming or self-referential derivations. Any self-citations are incidental and not required to justify the accuracy claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on standard domain assumptions about telemetry fidelity and simulation representativeness rather than new free parameters or invented entities; no explicit fitting constants are described in the abstract.

axioms (1)
  • domain assumption Live telemetry from the physical ICT infrastructure is sufficiently complete and low-latency to support continuous self-calibration of the discrete-event simulation.
    This assumption underpins the claimed accuracy improvement and closed-loop operation.

pith-pipeline@v0.9.0 · 5562 in / 1233 out tokens · 43141 ms · 2026-05-10T15:46:10.573539+00:00 · methodology

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages

  1. [1]

    2023. Overheating datacenter stopped 2.5 million bank transactions. Laura Dobberstein, The Register, https://www.theregister.com/2023/11/07/overheating_datacenter_singapore/

  2. [2]

    B Danette Allen. 2021. Digital twins and living models at NASA. In Digital Twin Summit

  3. [3]

    Georgios Andreadis, Laurens Versluis, Fabian Mastenbroek, and Alexandru Iosup

  4. [4]

    A reference architecture for datacenter scheduling: design, validation, and experiments. In SC

  5. [5]

    Hakan Aydemir et al. 2020. The Digital Twin Paradigm for Aircraft – Review and Outlook. In AIAA SciTech Forum. AIAA

  6. [6]

    Rodrigo N. Calheiros et al. 2011. CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. SPE (2011)

  7. [7]

    Henri Casanova. 2001. Simgrid: A Toolkit for the Simulation of Application Scheduling. In CCGrid. IEEE Computer Society

  8. [8]

    Nuria de Lama Sanchez, Peter Haase, Dumitru Roman, and Radu Prodan. 2023. Boosting the Impact of Extreme and Sustainable Graph Processing for Urgent Societal Challenges in Europe Graph-Massivizer. In ICPE

  9. [9]

    edX Delft University of Technology (DelftX). 2026. Modern Distributed Systems

  10. [10]

    Xiaobo Fan et al. 2007. Power provisioning for a warehouse-sized computer. In ISCA

  11. [11]

    Manoel C. Silva Filho et al. 2017. CloudSim Plus: A cloud computing simulation framework pursuing software engineering principles for improved modularity, extensibility and correctness. In IFIP/IEEE

  12. [12]

    6G FNS. 2025. Future Network Services: 6G for and by the Netherlands. https://futurenetworkservices.nl/en/

  13. [13]

    Sandeep K. S. Gupta, Rose Robin Gilbert, Ayan Banerjee, Zahra Abbasi, Tridib Mukherjee, and Georgios Varsamopoulos. 2011. GDCSim: A tool for analyzing Green Data Center design and resource management techniques. In IGCC

  14. [14]

    Nikolas Herbst, André Bauer, Samuel Kounev, Giorgos Oikonomou, Erwin Van Eyk, George Kousiouris, Athanasia Evangelinou, Rouven Krebs, Tim Brecht, Cristina L. Abad, and Alexandru Iosup. 2018. Quantifying Cloud Performance and Dependability: Taxonomy, Metric Design and Emerging Challenges. ToMPECS (2018)

  15. [15]

    Tharindu B. Hewage, Shashikant Ilager, Maria Alejandra Rodriguez, and Rajkumar Buyya. 2024. CloudSim express: A novel framework for rapid low code simulation of cloud computing environments. Softw. Pract. Exp. (2024)

  16. [16]

    IDC. 2024. AI Datacenter Capacity, Energy Consumption, and Carbon Emission Projections. https://www.idc.com/getdoc.jsp?containerId=US52131624

  17. [17]

    Alexandru Iosup. 2024. A VU on Digital Twins to Improve the Performance and Technological Sustainability of Datacenters in the Continuum. In MODSIM. Seattle, USA

  18. [18]

    Alexandru Iosup, Fernando Kuipers, Ana Lucia Varbanescu, Paola Grosso, Animesh Trivedi, Jan S. Rellermeyer, Lin Wang, Alexandru Uta, and Francesco Regazzoni. 2022. Future Computer Systems and Networking Research in the Netherlands: A Manifesto. CoRR abs/2206.03259 (2022). arXiv:2206.03259 https://doi.org/10.48550/arXiv.2206.03259

  19. [19]

    Alexandru Iosup, Radu Prodan, Ana Lucia Varbanescu, Sacheendra Talluri, Gilles Magalhaes, Kailhan Hokstam, Hugo Zwaan, Vincent van Beek, Reza Farahani, and Dragi Kimovski. 2023. Graph Greenifier: Towards Sustainable and Energy-Aware Massive Graph Processing in the Computing Continuum. In ICPE

  20. [20]

    Alexandru Iosup, Alexandru Uta, Laurens Versluis, Georgios Andreadis, Erwin Van Eyk, Tim Hegeman, Sacheendra Talluri, Vincent van Beek, and Lucian Toader. 2018. Massivizing Computer Systems. In ICDCS

  21. [21]

    Alexandru Iosup, Laurens Versluis, Animesh Trivedi, Erwin Van Eyk, Lucian Toader, Vincent Van Beek, Giulia Frascaria, Ahmed Musaafir, and Sacheendra Talluri. 2019. The AtLarge vision on the design of distributed systems and ecosystems. In ICDCS. IEEE

  22. [22]

    Jay Kreps. 2011. Kafka: a Distributed Messaging System for Log Processing. https://api.semanticscholar.org/CorpusID:18534081

  23. [23]

    Fabian Mastenbroek, Georgios Andreadis, Soufiane Jounaid, Wenchen Lai, Jacob Burley, Jaro Bosch, Erwin Van Eyk, Laurens Versluis, Vincent van Beek, and Alexandru Iosup. 2021. OpenDC 2.0: Convenient Modeling and Simulation of Emerging Technologies in Cloud Datacenters. In CCGrid

  24. [24]

    Fabian Mastenbroek, Tiziano De Matteis, Vincent van Beek, and Alexandru Iosup

  25. [25]

    RADiCe: A Risk Analysis Framework for Data Centers. FGCS (2025)

  26. [26]

    Martin Molan, Junaid Ahmed Khan, Andrea Bartolini, Roberta Turra, Giorgio Pedrazzi, Michael Cochez, Alexandru Iosup, Dumitru Roman, Joze M. Rozanec, Ana Lucia Varbanescu, and Radu Prodan. 2023. The Graph-Massivizer Approach Toward a European Sustainable Data Center Digital Twin. In COMPSAC

  27. [27]

    Juan José Montaño Moreno et al. 2013. Using the R-MAPE index as a resistant measure of forecast accuracy. Psicothema (2013)

  28. [28]

    Sebastian Moss. 2023. Surgeries and procedures paused at Wichita hospitals due to data center outage. Data Center Dynamics, https://www.datacenterdynamics.com/en/news/surgeries-and-procedures-paused-at-wichita-hospitals-due-to-data-center-outage/

  29. [29]

    Neil McAllister. 2013. Google goes dark for 2 minutes, kills 40% of world’s net traffic. https://www.theregister.com/2013/08/17/google_outage/

  30. [30]

    Radu Nicolae, Dante Niewenhuis, Sacheendra Talluri, and Alexandru Iosup. 2026. M3SA: Exploring Datacenter Performance and Climate-Impact with Multi- and Meta-Model Simulation and Analysis. Available at SSRN 5377101 (2026)

  31. [31]

    Radu Nicolae, Jules van der Toorn, Stavriana Kraniti, Houcen Liu, and Alexandru Iosup. 2026. OpenDT: Exploring Datacenter Performance and Sustainability with a Self-Calibrating Digital Twin. Technical Report, https://atlarge-research.com/pdfs/2026-hcp-opendt-techrep.pdf

  32. [32]

    Dante Niewenhuis, Sacheendra Talluri, Alexandru Iosup, and Tiziano De Matteis

  33. [33]

    FootPrinter: Quantifying Data Center Carbon Footprint. In ICPE

  34. [34]

    OpenDC Team. 2025. Input: Workload — OpenDC Documentation. https://atlarge-research.github.io/opendc/docs/documentation/Input/Workload

  35. [35]

    Oracle. 2024. MAPE. https://docs.oracle.com/en/cloud/saas/planning-budgeting-cloud/pfusu/insights_metrics_MAPE.html

  36. [36]

    Kaan Sel et al. 2024. Building digital twins for cardiovascular health: From principles to clinical impact. Journal of the American Heart Association (2024)

  37. [37]

    Laurens Versluis, Mehmet Çetin, Caspar Greeven, Kristian Laursen, Damian Podareanu, Valeriu Codreanu, Alexandru Uta, and Alexandru Iosup. 2023. Less is not more: We need rich datasets to explore. FGCS (2023)