DataCenterGym: A Physics-Grounded Simulator for Multi-Objective Data Center Scheduling
Pith reviewed 2026-05-10 08:30 UTC · model grok-4.3
The pith
DataCenterGym is a Gymnasium-compatible simulator that integrates compute queueing, building thermal dynamics, localized HVAC behavior, and temperature-dependent service degradation for multi-objective scheduling across geo-distributed data centers. The authors demonstrate it with a Hierarchical Model Predictive Control (H-MPC) scheduler that improves on baseline schedulers in simulation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present DataCenterGym, a physics-grounded simulation environment for job scheduling in geo-distributed data centers... We also develop a Hierarchical Model Predictive Control (H-MPC) scheduling algorithm that performs distributed job placement while explicitly accounting for thermal and power dynamics. Through experiments on nominal operation and workload sensitivity, we demonstrate how H-MPC improves scheduling performance relative to baseline schedulers.
Load-bearing premise
The integrated models of compute queueing, building thermal dynamics, localized HVAC behavior, and temperature-dependent service degradation are sufficiently accurate representations of real geo-distributed data center physics to make simulation results transferable to practice.
Original abstract
Modern datacenters schedule heterogeneous workloads across geo-distributed sites with diverse compute capacities, electricity prices, and thermal conditions. Compute utilization, heat generation, cooling demand, and energy consumption are tightly coupled, yet most existing schedulers abstract these effects and treat them independently. We present DataCenterGym, a physics-grounded simulation environment for job scheduling in geo-distributed data centers, designed as a reusable testbed for future research. The simulator integrates compute queueing, building thermal dynamics, localized HVAC behavior, and temperature-dependent service degradation within a Gymnasium-compatible interface. We also develop a Hierarchical Model Predictive Control (H-MPC) scheduling algorithm that performs distributed job placement while explicitly accounting for thermal and power dynamics. Through experiments on nominal operation and workload sensitivity, we demonstrate how H-MPC improves scheduling performance relative to baseline schedulers.
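To make the coupling described in the abstract concrete, here is a minimal toy sketch of how queueing, heat, cooling, and temperature-dependent slowdown can interact behind a reset/step interface. It is not the paper's model or code: the class names ToySite and ToyGeoEnv, every constant, the linear degradation rule, and the proportional HVAC rule are all illustrative assumptions. The sketch does not import Gymnasium; it only mirrors Gymnasium's return convention, where reset gives (observation, info) and step gives (observation, reward, terminated, truncated, info).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToySite:
    """One hypothetical site; all parameter values are made up for illustration."""
    capacity: float = 10.0   # jobs served per step at nominal temperature
    price: float = 0.10      # electricity price per unit of energy
    ambient: float = 25.0    # outdoor temperature (deg C)
    setpoint: float = 24.0   # HVAC target temperature (deg C)
    temp: float = 24.0       # current indoor temperature (deg C)
    queue: float = 0.0       # jobs waiting

    def service_rate(self) -> float:
        # Assumed degradation rule: full rate up to 27 C, then 5% slower
        # per extra degree, never below 20% of capacity.
        excess = max(0.0, self.temp - 27.0)
        return self.capacity * max(0.2, 1.0 - 0.05 * excess)

    def step(self, arrivals: float) -> float:
        """Advance one step and return the energy cost for this site."""
        # Compute queueing: serve as much of the backlog as the rate allows.
        self.queue += arrivals
        served = min(self.queue, self.service_rate())
        self.queue -= served
        it_energy = served  # assume one unit of energy per served job
        # Assumed HVAC rule: proportional to the overshoot above the
        # setpoint (measured before this step's heat), capped at 4 degrees.
        cooling = min(4.0, max(0.0, 0.8 * (self.temp - self.setpoint)))
        # First-order thermal update: IT heat, envelope leakage, cooling.
        self.temp += 0.3 * it_energy + 0.1 * (self.ambient - self.temp) - cooling
        return self.price * (it_energy + 0.5 * cooling)


class ToyGeoEnv:
    """Routes a fixed arrival stream across sites; Gymnasium-style API only."""

    def __init__(self, sites: List[ToySite], arrivals: float = 12.0,
                 queue_weight: float = 0.05, horizon: int = 24):
        self.sites = sites
        self.arrivals = arrivals
        self.queue_weight = queue_weight
        self.horizon = horizon
        self.t = 0

    def _obs(self) -> List[Tuple[float, float]]:
        return [(s.queue, s.temp) for s in self.sites]

    def reset(self):
        for s in self.sites:
            s.queue, s.temp = 0.0, s.setpoint
        self.t = 0
        return self._obs(), {}

    def step(self, action: List[float]):
        # The action is the fraction of arriving jobs sent to each site.
        if (len(action) != len(self.sites) or min(action) < 0.0
                or abs(sum(action) - 1.0) > 1e-9):
            raise ValueError("action must be non-negative fractions summing to 1")
        cost = sum(s.step(f * self.arrivals) for s, f in zip(self.sites, action))
        backlog = sum(s.queue for s in self.sites)
        # Multi-objective reward: energy cost plus a weighted backlog penalty.
        reward = -(cost + self.queue_weight * backlog)
        self.t += 1
        truncated = self.t >= self.horizon
        return self._obs(), reward, False, truncated, {"energy_cost": cost}
```

Sending all load to one site in this toy heats that site past the degradation threshold, which lowers its service rate on the next step. That feedback loop is the kind of coupling a thermally aware scheduler has to anticipate.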
Editorial analysis
A structured set of objections, weighed in public.
Circularity Check
No circularity in derivation or prediction chain
The paper introduces DataCenterGym as a new Gymnasium-compatible simulator that integrates standard literature models for compute queueing, thermal dynamics, HVAC, and temperature-dependent degradation, plus a new H-MPC algorithm. No equations, first-principles derivations, or predictions are shown that reduce by construction to fitted parameters, self-definitions, or self-citation chains. Performance claims are simulator-internal comparisons under nominal and sensitivity workloads; the contribution is the reusable testbed and algorithm, not a tautological result. This is self-contained engineering work with no load-bearing circular steps.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Standard models of compute queueing, building thermal dynamics, localized HVAC, and temperature-dependent service degradation are adequate for the simulation.