Recognition: 2 theorem links
· Lean TheoremHotspot-Aware Scheduling of Virtual Machines with Overcommitment for Ultimate Utilization in Cloud Datacenters
Pith reviewed 2026-05-13 22:14 UTC · model grok-4.3
The pith
A probabilistic bin-packing method lets cloud schedulers overcommit VMs while bounding hotspot risk to a chosen probability.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors formulate online VM scheduling as the Probabilistic k-Bins Packing problem and solve it with CloseRadiusFit, which applies Γ-robustness to dynamic CPU utilization so that the probability of any physical machine forming a hotspot stays below a chosen threshold α; cold-start AI predictions supply the utilization estimates needed for online decisions, while an offline MILP model and bounding techniques confirm that the online solutions remain within 1.6 percent of the lower bound and 3.1 percent of the upper bound.
What carries the argument
Probabilistic k-Bins Packing (PkBP), a bin-packing variant that folds Γ-robustness sets around predicted CPU demands to enforce a probability-α hotspot guard on each physical machine.
If this is right
- Schedulers can pack more VMs onto each physical machine while keeping the hotspot probability under explicit control.
- Datacenter operators obtain higher average utilization without raising the chance of performance degradation.
- Online decisions become practical once cold-start predictions replace static flavor sizes.
- The same robustness construction supplies a tunable safety knob instead of fixed conservative margins.
Where Pith is reading between the lines
- The robustness wrapper could be applied to memory or network bandwidth to create multi-resource versions of the same guarantee.
- Energy savings would follow if the denser packing reduces the number of active physical machines.
- Live traces from existing clouds could be used to calibrate the Γ parameter and test whether the predicted α matches observed hotspot rates.
- The MILP formulation for the offline case offers a benchmark that other heuristic or learning-based schedulers could be measured against.
Load-bearing premise
Dynamic CPU utilization patterns of VMs can be reliably captured by Γ-robustness theory and cold-start AI predictions so that the probability-α hotspot protection holds in practice.
What would settle it
Deploy CloseRadiusFit on a production cluster and record that the measured fraction of time any physical machine exceeds its hotspot threshold exceeds the target probability α.
Figures
read the original abstract
We address the problem of under-utilization of resources in datacenters during cloud operations, specifically focusing on the challenge of online virtual machine (VM) scheduling. Rather than following the traditional approach of scheduling VMs based solely on their static flavors, we take into account their dynamic CPU utilization. We employ $\Gamma$-robustness theory to manage the dynamic nature and introduce a novel variant of bin packing - Probabilistic k-Bins Packing (PkBP), which theoretically protects the Physical Machines (PMs) from hotspots formation within a specified probability $\alpha$. We develop a scheduling algroithm named CloseRadiusFit and cold-start AI based prediction algorithms for the online version of PkBP. To verify the quality of our approach towards the optimal solutions, we solve the Offline PkBP problem by designing a novel Mixed Integer Linear Programming (MILP) model and a combination of numerical upper and lower bounds. Our experimental results demonstrate that CloseRadiusFit achieves narrow gaps of 1.6% and 3.1% when compared to the lower and upper bounds, respectively.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Probabilistic k-Bins Packing (PkBP), a variant of bin packing that incorporates Γ-robustness to schedule VMs online while protecting physical machines from hotspots with probability α. It proposes the CloseRadiusFit online algorithm augmented by cold-start AI predictions for dynamic CPU utilization, solves the offline PkBP exactly via a new MILP formulation, and reports that CloseRadiusFit achieves 1.6% and 3.1% gaps relative to derived lower and upper bounds.
Significance. If the claimed gaps hold against the relevant online optimum and the Γ-robustness guarantee is empirically verified under realistic prediction error, the work could meaningfully improve datacenter utilization under overcommitment. However, the current evidence base is too thin to support that assessment.
major comments (3)
- [Abstract, §5] Abstract and experimental section: the headline optimality gaps (1.6% to lower bound, 3.1% to upper bound) are measured exclusively against an offline MILP relaxation of PkBP. Because CloseRadiusFit is an online algorithm that must commit without future arrivals or exact utilization traces, these gaps do not establish proximity to the online optimum; a direct comparison to an online lower bound or to realized cost under the same arrival sequence is required.
- [§3, §4] §3 (PkBP formulation) and §4 (robustness claim): the paper asserts that Γ-robustness plus the AI predictor delivers hotspot protection at probability α, yet no experiment reports the realized frequency of hotspots (or any other α-related metric) on the test traces. Without this measurement, the central robustness guarantee remains unverified.
- [§4] §4 (prediction model): the cold-start AI predictor is described as providing the inputs to the Γ-robustness model, but the manuscript supplies neither the training/validation split, the prediction-error distribution, nor any sensitivity analysis showing how prediction error propagates into the α-protection guarantee.
minor comments (2)
- [Abstract] Abstract: typo 'algroithm' should be 'algorithm'.
- [§2] Notation: the relationship between the free parameter α and the Γ-robustness budget is never made explicit; a short clarifying paragraph or equation would help.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major point below, indicating where revisions will be made to strengthen the evaluation of the online algorithm and the empirical verification of the robustness guarantees.
read point-by-point responses
-
Referee: [Abstract, §5] the headline optimality gaps (1.6% to lower bound, 3.1% to upper bound) are measured exclusively against an offline MILP relaxation of PkBP. Because CloseRadiusFit is an online algorithm that must commit without future arrivals or exact utilization traces, these gaps do not establish proximity to the online optimum; a direct comparison to an online lower bound or to realized cost under the same arrival sequence is required.
Authors: We agree that the reported gaps are relative to the offline MILP relaxation, which provides a valid lower bound but does not directly quantify distance to the online optimum. In the revision we will add comparisons against an online lower bound constructed from the same arrival sequences (e.g., via a greedy online relaxation) and will also report realized packing costs under the exact test traces to demonstrate how CloseRadiusFit performs relative to what an online algorithm can achieve. revision: yes
-
Referee: [§3, §4] the paper asserts that Γ-robustness plus the AI predictor delivers hotspot protection at probability α, yet no experiment reports the realized frequency of hotspots (or any other α-related metric) on the test traces. Without this measurement, the central robustness guarantee remains unverified.
Authors: We acknowledge that the manuscript currently lacks direct empirical measurement of realized hotspot frequency. In the revised version we will add a dedicated subsection in §5 that computes and reports the observed hotspot occurrence rate on the test traces, comparing it against the target probability α under the Γ-robustness model and the AI predictions. revision: yes
-
Referee: [§4] the cold-start AI predictor is described as providing the inputs to the Γ-robustness model, but the manuscript supplies neither the training/validation split, the prediction-error distribution, nor any sensitivity analysis showing how prediction error propagates into the α-protection guarantee.
Authors: We will expand §4 to include the training/validation split used for the cold-start predictor, the empirical distribution of prediction errors on held-out data, and a sensitivity analysis that varies the error magnitude to show its effect on the resulting α-protection level. These additions will make the link between prediction quality and the robustness guarantee explicit. revision: yes
Circularity Check
No significant circularity in derivation chain
full rationale
The paper introduces PkBP as a novel variant of bin packing that incorporates Γ-robustness to enforce probabilistic hotspot protection at level α. It then presents a new MILP model to compute offline bounds for PkBP and develops the CloseRadiusFit online algorithm plus cold-start AI predictors as separate contributions. None of these steps reduce by construction to their own inputs, self-citations, or fitted parameters renamed as predictions; the MILP bounds are solved independently of the online algorithm, and the experimental gaps are reported against those external bounds rather than being forced by the formulation itself. The derivation therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- α
axioms (1)
- domain assumption Γ-robustness theory applies to VM CPU utilization time series
invented entities (1)
-
PkBP
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We employ Γ-robustness theory to manage the dynamic nature and introduce a novel variant of bin packing - Probabilistic k-Bins Packing (PkBP), which theoretically protects the Physical Machines (PMs) from hotspots formation within a specified probability α.
-
IndisputableMonolith/Foundation/AlexanderDuality.leanalexander_duality_circle_linking unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
CloseRadiusFit achieves narrow gaps of 1.6% and 3.1% when compared to the lower and upper bounds, respectively.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Bin packing and cutting stock problems: Mathematical models and exact algorithms,
M. Delorme, M. Iori, and S. Martello, “Bin packing and cutting stock problems: Mathematical models and exact algorithms,”European Journal of Operational Research, vol. 255, no. 1, pp. 1–20, 2016
work page 2016
-
[2]
Approximation and online algorithms for multidimen- sional bin packing: A survey,
H. I. Christensen, A. Khan, S. Pokutta, and P. Tetali, “Approximation and online algorithms for multidimen- sional bin packing: A survey,”Computer Science Review, vol. 24, pp. 63–79, 2017
work page 2017
-
[3]
Protean: Vm allocation service at scale,
O. Hadary, L. Marshall, I. Menache, A. Pan, E. E. Greeff, D. Dion, S. Dorminey, S. Joshi, Y . Chen, M. Russinovich et al., “Protean: Vm allocation service at scale,” inPro- ceedings of the 14th USENIX Conference on Operating Systems Design and Implementation, 2020, pp. 845–861
work page 2020
-
[4]
Bal- con—resource balancing algorithm for vm consolida- tion,
A. Gudkov, P. Popov, and S. Romanov, “Bal- con—resource balancing algorithm for vm consolida- tion,”Future Generation Computer Systems, vol. 147, pp. 265–274, 2023
work page 2023
-
[5]
S.-Y . Hsieh, C.-S. Liu, R. Buyya, and A. Y . Zomaya, “Utilization-prediction-aware virtual machine consolida- tion approach for energy-efficient cloud data centers,” Journal of Parallel and Distributed Computing, vol. 139, pp. 99–109, 2020
work page 2020
-
[6]
A. Beloglazov, J. Abawajy, and R. Buyya, “Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing,”Future generation computer systems, vol. 28, no. 5, pp. 755–768, 2012
work page 2012
-
[7]
D. Saxena, A. K. Singh, and R. Buyya, “Op-mlb: an online vm prediction-based multi-objective load balanc- ing framework for resource management at cloud data center,”IEEE Transactions on Cloud Computing, vol. 10, no. 4, pp. 2804–2816, 2021. IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING 12
work page 2021
-
[8]
An energy-efficient vm prediction and migration frame- work for overcommitted clouds,
M. Dabbagh, B. Hamdaoui, M. Guizani, and A. Rayes, “An energy-efficient vm prediction and migration frame- work for overcommitted clouds,”IEEE Transactions on Cloud Computing, vol. 6, no. 4, pp. 955–966, 2016
work page 2016
-
[9]
A. Beloglazov, R. Buyyaet al., “Adaptive threshold- based approach for energy-efficient consolidation of vir- tual machines in cloud data centers.”MGC@ Middle- ware, vol. 4, no. 10.1145, pp. 1 890 799–803, 2010
work page 2010
-
[10]
A comprehensive survey for scheduling techniques in cloud computing,
M. Kumar, S. C. Sharma, A. Goel, and S. P. Singh, “A comprehensive survey for scheduling techniques in cloud computing,”Journal of Network and Computer Applications, vol. 143, pp. 1–33, 2019
work page 2019
-
[11]
E. Cortez, A. Bonde, A. Muzio, M. Russinovich, M. Fon- toura, and R. Bianchini, “Resource central: Understand- ing and predicting workloads for improved resource management in large cloud platforms,” inProceedings of the 26th Symposium on Operating Systems Principles, 2017, pp. 153–167
work page 2017
-
[12]
Take it to the limit: peak prediction-driven resource overcommitment in datacenters,
N. Bashir, N. Deng, K. Rzadca, D. Irwin, S. Kodak, and R. Jnagal, “Take it to the limit: peak prediction-driven resource overcommitment in datacenters,” inProceedings of the Sixteenth European Conference on Computer Sys- tems, 2021, pp. 556–573
work page 2021
-
[13]
Deterministic equiva- lents for optimizing and satisficing under chance con- straints,
A. Charnes and W. W. Cooper, “Deterministic equiva- lents for optimizing and satisficing under chance con- straints,”Operations research, vol. 11, no. 1, pp. 18–39, 1963
work page 1963
-
[14]
Convex approximations of chance constrained programs,
A. Nemirovski and A. Shapiro, “Convex approximations of chance constrained programs,”SIAM Journal on Op- timization, vol. 17, no. 4, pp. 969–996, 2007
work page 2007
-
[15]
G. Calafiore and F. Dabbene,Probabilistic and random- ized methods for design under uncertainty. Springer, 2006
work page 2006
-
[16]
S. K ¨uc ¸¨ukyavuz and R. Jiang, “Chance-constrained op- timization under limited distributional information: A review of reformulations based on sampling and distri- butional robustness,”EURO Journal on Computational Optimization, vol. 10, p. 100030, 2022
work page 2022
-
[17]
Ambiguous chance- constrained binary programs under mean-covariance in- formation,
Y . Zhang, R. Jiang, and S. Shen, “Ambiguous chance- constrained binary programs under mean-covariance in- formation,”SIAM Journal on Optimization, vol. 28, no. 4, pp. 2922–2944, 2018
work page 2018
-
[18]
Chance- constrained binary packing problems,
Y . Song, J. R. Luedtke, and S. K ¨uc ¸¨ukyavuz, “Chance- constrained binary packing problems,”INFORMS Jour- nal on Computing, vol. 26, no. 4, pp. 735–747, 2014
work page 2014
-
[19]
Allocating bandwidth for bursty connections,
J. Kleinberg, Y . Rabani, and ´E. Tardos, “Allocating bandwidth for bursty connections,” inProceedings of the twenty-ninth annual ACM symposium on Theory of computing, 1997, pp. 664–673
work page 1997
-
[20]
Consolidating vir- tual machines with dynamic bandwidth demand in data centers,
M. Wang, X. Meng, and L. Zhang, “Consolidating vir- tual machines with dynamic bandwidth demand in data centers,” in2011 Proceedings IEEE INFOCOM. IEEE, 2011, pp. 71–75
work page 2011
-
[21]
Overcommitment in cloud services: Bin packing with chance constraints,
M. C. Cohen, P. W. Keller, V . Mirrokni, and M. Zadi- moghaddam, “Overcommitment in cloud services: Bin packing with chance constraints,”Management Science, vol. 65, no. 7, pp. 3255–3271, 2019
work page 2019
-
[22]
Mathematical models and approximate solution approaches for the stochastic bin packing problem,
J. Martinovic and M. Selch, “Mathematical models and approximate solution approaches for the stochastic bin packing problem,”Computers & Operations Research, vol. 135, p. 105439, 2021
work page 2021
-
[23]
J. Yan, Y . Lu, L. Chen, S. Qin, Y . Fang, Q. Lin, T. Moscibroda, S. Rajmohan, and D. Zhang, “Solving the batch stochastic bin packing problem in cloud: a chance- constrained optimization approach,” inProceedings of the 28th ACM SIGKDD Conference on Knowledge Dis- covery and Data Mining, 2022, pp. 2169–2179
work page 2022
-
[24]
Optimal allocation of surgery blocks to operating rooms under uncertainty,
B. T. Denton, A. J. Miller, H. J. Balasubramanian, and T. R. Huschka, “Optimal allocation of surgery blocks to operating rooms under uncertainty,”Operations research, vol. 58, no. 4-part-1, pp. 802–816, 2010
work page 2010
-
[25]
Stochastic operating room scheduling for high-volume specialties under block booking,
O. V . Shylo, O. A. Prokopyev, and A. J. Schaefer, “Stochastic operating room scheduling for high-volume specialties under block booking,”INFORMS Journal on Computing, vol. 25, no. 4, pp. 682–692, 2013
work page 2013
-
[26]
Branch and price for chance-constrained bin packing,
Z. Zhang, B. T. Denton, and X. Xie, “Branch and price for chance-constrained bin packing,”INFORMS Journal on Computing, vol. 32, no. 3, pp. 547–564, 2020
work page 2020
-
[27]
J. Sheng, L. Wang, F. Yang, B. Qiao, H. Dong, X. Wang, B. Jin, J. Wang, S. Qin, S. Rajmohanet al., “Learn- ing cooperative oversubscription for cloud by chance- constrained multi-agent reinforcement learning,” inPro- ceedings of the ACM Web Conference 2023, 2023, pp. 2927–2936
work page 2023
-
[28]
P. Guo, J. Xiong, Y . Wang, X. Meng, and L. Qian, “Intelligent scheduling for group distributed manufactur- ing systems: Harnessing deep reinforcement learning in cloud-edge cooperation,”IEEE Transactions on Emerg- ing Topics in Computational Intelligence, 2024
work page 2024
-
[29]
J. Xiong, P. Guo, Y . Wang, X. Meng, J. Zhang, L. Qian, and Z. Yu, “Multi-agent deep reinforcement learning for task offloading in group distributed manufacturing sys- tems,”Engineering Applications of Artificial Intelligence, vol. 118, p. 105710, 2023
work page 2023
-
[30]
Branch and price for submodular bin packing,
L. Xu, C. d’Ambrosio, S. Haddad-Vanier, and E. Traversi, “Branch and price for submodular bin packing,”EURO Journal on Computational Optimization, vol. 11, p. 100074, 2023
work page 2023
-
[31]
D. Bertsimas and M. Sim, “The price of robustness,” Operations Research, vol. 52, pp. 35–53, 02 2004
work page 2004
-
[32]
The online knapsack problem with departures,
B. Sun, L. Yang, M. Hajiesmaili, A. Wierman, J. C. Lui, D. Towsley, and D. H. Tsang, “The online knapsack problem with departures,”Proceedings of the ACM on Measurement and Analysis of Computing Systems, vol. 6, no. 3, pp. 1–32, 2022
work page 2022
-
[33]
Upper bounds and algorithms for the maximum cardinality bin packing problem,
M. Labb ´e, G. Laporte, and S. Martello, “Upper bounds and algorithms for the maximum cardinality bin packing problem,”European Journal of Operational Research, vol. 149, no. 3, pp. 490–498, 2003
work page 2003
-
[34]
Probabilistic bounds for dual bin-packing,
J. L. Bruno and P. J. Downey, “Probabilistic bounds for dual bin-packing,”Acta Informatica, vol. 22, no. 3, pp. 333–345, 1985
work page 1985
-
[35]
G. H. Hardy, J. E. Littlewood, and G. P ´olya,Inequalities. Cambridge university press, 1952
work page 1952
-
[36]
Gurobi Optimizer Refer- ence Manual,
Gurobi Optimization, LLC, “Gurobi Optimizer Refer- ence Manual,” 2023. IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING 13 Jiaxi Wureceived his B.S. degree in Management Science from Tianjin University, Tianjin, China, in 2016, and the Ph.D degree in Industrial Engineering and Management from Peking University, Beijing, China, in 2021. He then work...
work page 2023
-
[37]
He held multiple engineering positions in hi- tech companies specializing in the areas of full-text search engines, distributed computing and big data. Currently he is employed at Mathematical Modeling Lab of Huawei Moscow Research Center, focusing on resource usage optimization in Huawei Cloud. Possessing deep experience in computing, he models cloud env...
work page 2016
-
[38]
For any symmetricˆv⪰uwith centerˆvc=uc+s v/2 it holds thatF u(x)≥F sym(u)(x−s v)
-
[39]
To prove fact 1 we start with Def
For anys v s.t.F u(x)≥F sym(u)(x−s v)there exists a symmetricˆv⪰uwith centerˆvc=uc+s v/2 From these two facts and requirement on minimal value of ss.t.F u(x)≥F sym(u)(x−s)the theorem is immediately proven. To prove fact 1 we start with Def. 4 of stochastic domination ofˆv Fˆv(x)≤F u(x).(9) Applying reflection of both sides around point(2·ˆvc,0.5)we have F...
-
[40]
IfΓ lb(Pi+1)=Γ lb(Pi) thenur lb(Pi+1) =ur lb(Pi)
-
[41]
To prove that iterative procedure satisfies Eq
IfΓ lb(Pi+1)≥Γ lb(Pi)+1 thenur lb(Pi+1)=ur lb(Pi)+ur i+1. To prove that iterative procedure satisfies Eq. 14 let us introduce a few definitions that we will use throughout the proof. Definition 6.LetVandWbe two sequences of VMs sorted by radiuses in non-increasing order. Denote byvr i, wri radiuses of the corresponding VMs in the sets. We say thatVdominat...
-
[42]
14 we search forSthat is dominated by M axSet(µ),i.e
P v∈V urv ≥ P w∈W urw 2)V∪v⪰W∀v 3)Ifur v ≥ur w, thenV∪v⪰W∪w To satisfy Eq. 14 we search forSthat is dominated by M axSet(µ),i.e. P v∈M axSet(µ) urv ≥P v∈S urv. Our itera- tive procedure for calculation ofur lb(Pn)convert to procedure ofSconstruction: 1)Γ lb(P0) = 0. SetS(P 0) =∅
- [43]
-
[44]
IfΓ lb(Pi+1)≥Γ lb(Pi)+1. SetS(P i+1) =S(P i)∪v i+1. The sum of radiuses ofS(P i)is equal tor lb(Pi). The number of elements|S(P i)|is not greater than the lower bound Γlb(Pi). To finish the proof of the theorem, it is remained to prove: Lemma 3.M axSet(µ)⪰S(P i)for anyP i and mappingµ. Proof.The proof is by induction. The base is obvious. Sup- pose thatS(...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.