pith. sign in

arxiv: 2604.15033 · v1 · submitted 2026-04-16 · 💻 cs.DC · cs.DS

Efficient calculation of available space for multi-NUMA virtual machines

Pith reviewed 2026-05-10 10:02 UTC · model grok-4.3

classification 💻 cs.DC cs.DS
keywords multi-NUMAvirtual machinescloud schedulingcapacity calculationclosed-form expressionsNUMA topology mappingresource allocationcombinatorial optimization
0
0 comments X

The pith

Closed-form expressions calculate how many multi-NUMA VMs of a given flavor fit on a physical server.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives direct formulas that replace combinatorial search when deciding how many additional multi-NUMA virtual machines can be placed on a host. It focuses on symmetric cases: 2-NUMA and 4-NUMA VMs mapped onto 4-NUMA and 8-NUMA physical servers. The expressions give the exact maximum count under the assumption that chosen mappings keep performance optimal and interference low. This matters because cloud schedulers and capacity dashboards need fast answers at scale rather than solving packing problems repeatedly. The results apply directly to real-time allocation checks and large-scale reorganization tools.

Core claim

For symmetric 2-NUMA and 4-NUMA VM flavors placed on 4-NUMA and 8-NUMA hosts, the maximum number of additional VMs that can be allocated is given by closed-form expressions that enumerate only the feasible NUMA mappings and then apply simple arithmetic counts of available cores, memory, and NUMA nodes. These expressions avoid brute-force enumeration of all possible topology assignments while still respecting the constraint that each VM must receive a contiguous, performance-optimal mapping.

What carries the argument

Closed-form expressions for maximum VM count under fixed symmetric NUMA mappings, obtained by enumerating only the optimal topology assignments and then computing remaining capacity.

If this is right

  • Schedulers can answer capacity queries for multi-NUMA flavors in constant time instead of solving an NP-hard packing problem on each host.
  • Real-time cluster dashboards can display exact remaining slots per flavor without periodic recomputation of full allocation graphs.
  • Reorganization tools can evaluate the benefit of migrating or repacking VMs by plugging current host states into the same formulas.
  • Capacity planning for new VM flavors reduces to substituting the flavor's NUMA requirements into the derived expressions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same algebraic approach could be extended to asymmetric or non-contiguous NUMA mappings if the interference model is updated.
  • These formulas might serve as fast lower bounds inside larger integer-linear-programming solvers for mixed NUMA and non-NUMA workloads.
  • Operators could use the expressions to set admission-control thresholds that guarantee a minimum number of co-located VMs without runtime measurement.
  • Testing the formulas on production traces would reveal how often real workloads deviate from the symmetry assumption.

Load-bearing premise

The derivations assume that both the VMs and the host have symmetric NUMA topologies and that the chosen mappings are always the ones that guarantee optimal performance with minimal interference.

What would settle it

Run an exhaustive enumeration of all possible NUMA mappings for a specific 4-NUMA VM flavor on an 8-NUMA host and compare the resulting maximum VM count against the number produced by the closed-form expression; any mismatch falsifies the formula.

Figures

Figures reproduced from arXiv: 2604.15033 by Alexis Pospelov, Andrei Gudkov, Elizaveta Ponomareva.

Figure 1
Figure 1. Figure 1: Possible (A) 4-NUMA and (B) 8-NUMA topologies [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Optimal solution for placing 2-vNUMA VMs (A) into partially filled [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: (A) Illustration for Lemma 1; (B) the sequence of reductions used to [PITH_FULL_IMAGE:figures/full_fig_p003_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Ladder graph L4 Proof. The value is clearly an upper bound on VMCAP(L4, K2, b) since L4 is bipartite and normalization of vertex capacities doesn’t lead to the loss of generality according to Lemma 3. Now let us construct a matching that achieves this upper bound. After normalizing vertices 1 and 2 match them together as many times as possible and, if one of them still has non-zero capacity left, match it … view at source ↗
Figure 5
Figure 5. Figure 5: Sequence of steps to derive optimization problem 12 [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Illustration of deriving x−y value that optimizes expression 12 (A) (B) 2 1 3 4 8 7 5 6 min(b1,b2) (b2,b1,b4,b3) (b5,b6,b7,b8) ( b 8 , b 7 , b 1 , b 2 ) ( b 3 , b 4 , b 5 , b 6 ) min(b3,b4) min(b7,b8) min(b5,b6) [PITH_FULL_IMAGE:figures/full_fig_p006_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: (A) Cycles in CQ3 and (B) corresponding equivalent transformation [PITH_FULL_IMAGE:figures/full_fig_p006_7.png] view at source ↗
read the original abstract

Increasing demand for computational power has led cloud providers to employ multi-NUMA servers and offer multi-NUMA virtual machines to their customers. However, multi-NUMA VMs introduce additional complexity to scheduling algorithms. Beyond merely selecting a host for a VM, the scheduler has to map virtual NUMA topology onto the physical NUMA topology of the server to ensure optimal VM performance and minimize interference with co-located VMs. Under these constraints, maximizing the number of allocated multi-NUMA VMs on a host becomes a combinatorial optimization problem. In this paper, we derive closed-form expressions to compute the maximum number of VMs for a given flavor that can be additionally allocated onto a physical server. We consider nontrivial scenarios of mapping 2- and 4-NUMA symmetric VMs to 4- and 8-NUMA physical topologies. Our results have broad applicability, ranging from real-time dashboards (displaying available cluster capacity per VM flavor) to optimization tools for large-scale cloud resource reorganization.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript derives closed-form expressions for the maximum number of symmetric 2-NUMA and 4-NUMA virtual machines that can be additionally allocated on 4-NUMA and 8-NUMA physical servers. It frames the problem as a combinatorial optimization task arising from the need to map virtual NUMA topologies onto physical ones while ensuring optimal VM performance and minimizing co-location interference, with results intended for real-time capacity dashboards and large-scale resource reorganization.

Significance. If the derivations are sound and the mappings provably achieve the claimed performance guarantees, the closed forms would eliminate the need for runtime combinatorial search in cloud schedulers, enabling efficient per-flavor capacity queries. This has clear practical value for multi-NUMA cloud environments. The work receives credit for focusing on nontrivial symmetric topologies rather than trivial cases, though the absence of machine-checked proofs or exhaustive enumeration results limits the strength of the combinatorial claims.

major comments (2)
  1. [§4] §4 (Derivations for 2-NUMA VMs on 4-NUMA hosts): the closed-form expression counts feasible assignments under the assumption that the chosen symmetric mappings simultaneously maximize density and guarantee minimal interference with no performance loss. No formal proof, exhaustive enumeration, or sensitivity analysis is provided to establish that these mappings remain optimal when workloads deviate from perfect symmetry or when co-location effects are modeled differently.
  2. [§5] §5 (4-NUMA VMs on 8-NUMA hosts): the closed form appears to rely on the same interference-minimizing mapping assumption without quantifying the penalty if symmetry is broken or if the mapping is not feasible under all topology constraints. This directly affects whether the expressions overestimate available capacity, as noted in the central claim.
minor comments (2)
  1. [Abstract] Abstract: states that closed forms are derived but provides no equations, key results, or validation metrics, reducing immediate readability.
  2. [§2] Notation: the definitions of symmetric topologies and the precise interference model could be stated more explicitly with a small example table or diagram in §2 to help readers follow the combinatorial counting.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive review and for recognizing the practical value of closed-form expressions in multi-NUMA cloud scheduling. We address each major comment point by point below, clarifying the scope of our symmetric-topology derivations while indicating targeted revisions to strengthen the presentation.

read point-by-point responses
  1. Referee: [§4] §4 (Derivations for 2-NUMA VMs on 4-NUMA hosts): the closed-form expression counts feasible assignments under the assumption that the chosen symmetric mappings simultaneously maximize density and guarantee minimal interference with no performance loss. No formal proof, exhaustive enumeration, or sensitivity analysis is provided to establish that these mappings remain optimal when workloads deviate from perfect symmetry or when co-location effects are modeled differently.

    Authors: The derivations in §4 compute the exact maximum number of symmetric 2-NUMA VMs that fit on a 4-NUMA host when each VM is mapped to a distinct pair of physical NUMA nodes, thereby enforcing locality and eliminating node sharing by construction. This yields the highest density achievable under the stated symmetric constraints and standard NUMA-affinity requirements. We acknowledge that the manuscript does not supply a machine-checked proof or exhaustive enumeration across all workload deviations. In the revision we will add a short verification subsection that exhaustively enumerates all feasible mappings for host sizes up to 8 VMs, confirming that the closed form matches the optimum under symmetry. Sensitivity to arbitrary co-location models lies outside the combinatorial scope of the paper, which focuses on topology-constrained packing rather than workload-specific interference; we will insert an explicit limitations paragraph noting this boundary. revision: partial

  2. Referee: [§5] §5 (4-NUMA VMs on 8-NUMA hosts): the closed form appears to rely on the same interference-minimizing mapping assumption without quantifying the penalty if symmetry is broken or if the mapping is not feasible under all topology constraints. This directly affects whether the expressions overestimate available capacity, as noted in the central claim.

    Authors: Section 5 likewise derives the closed form for the maximum number of symmetric 4-NUMA VMs on an 8-NUMA host under the mapping that assigns each VM to four distinct physical nodes. Because the formula enumerates only the feasible non-overlapping assignments, it reports the precise capacity ceiling for that symmetric case rather than an overestimate. We agree that the text should more explicitly bound the result to symmetric flavors and note the potential reduction in capacity if non-symmetric placements are forced. The revision will add a clarifying paragraph in §5 together with a small-scale exhaustive check (analogous to the one planned for §4) that verifies the formula against all admissible 8-NUMA configurations. No general penalty quantification for broken symmetry is provided because the manuscript restricts attention to the symmetric VM flavors stated in the abstract and title. revision: partial

standing simulated objections not resolved
  • Supplying machine-checked proofs or exhaustive enumeration for arbitrarily large topologies, which would require formal verification tooling beyond the scope of the current combinatorial derivations.

Circularity Check

0 steps flagged

No circularity: closed-form expressions derived from topology constraints

full rationale

The paper derives closed-form expressions for maximum VM allocation by enumerating feasible mappings under symmetric 2-/4-NUMA VM topologies onto 4-/8-NUMA hosts. No equations, parameters, or premises reduce to self-definition, fitted inputs renamed as predictions, or load-bearing self-citations. The combinatorial counting step is independent of the target result and rests on explicit symmetry assumptions rather than circular reduction. This is a standard self-contained mathematical derivation with no evidence of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are identifiable from the abstract alone; the work rests on standard assumptions of NUMA topology mapping and symmetric VM flavors.

pith-pipeline@v0.9.0 · 5468 in / 1123 out tokens · 36599 ms · 2026-05-10T10:02:27.337804+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages

  1. [1]

    Evaluation of virtual machine performance on numa multicore systems,

    Y . Cheng and W. Chen, “Evaluation of virtual machine performance on numa multicore systems,” in2013 Eighth International Conference on P2P , Parallel, Grid, Cloud and Internet Computing, pp. 136–143, 2013

  2. [2]

    Virtual- ization overhead of multithreading in x86 state-of-the-art & remaining challenges,

    S. Schildermans, J. Shan, K. Aerts, J. Jackrel, and X. Ding, “Virtual- ization overhead of multithreading in x86 state-of-the-art & remaining challenges,”IEEE Transactions on Parallel and Distributed Systems, vol. 32, no. 10, pp. 2557–2570, 2021

  3. [3]

    Evaluating job packing in warehouse-scale computing,

    A. Verma, M. Korupolu, and J. Wilkes, “Evaluating job packing in warehouse-scale computing,” in2014 IEEE International Conference on Cluster Computing (CLUSTER), pp. 48–56, IEEE, 2014

  4. [4]

    Phecon: Fine-grained vm consolidation with nimble resource defragmentation in public cloud platforms,

    J. Zhu, W. Tang, X. Meng, N. Gong, T. Ai, G. Li, B. Yu, and X. Yang, “Phecon: Fine-grained vm consolidation with nimble resource defragmentation in public cloud platforms,” inProceedings of the 53rd International Conference on Parallel Processing, pp. 712–721, 2024

  5. [5]

    Numa-aware virtual machine placement: New mmmk model and column generation-based decomposition approach,

    X. Sun, X. Cao, Q. Zhai, H. Tan, J. Hu, L. Zhu, L. Su, W. Zhou, F. Gao, and X. Guan, “Numa-aware virtual machine placement: New mmmk model and column generation-based decomposition approach,” IEEE Transactions on Automation Science and Engineering, 2024

  6. [6]

    Provisioning differentiated last-level cache allocations to vms in public clouds,

    M. Shahrad, S. Elnikety, and R. Bianchini, “Provisioning differentiated last-level cache allocations to vms in public clouds,” inProceedings of the ACM Symposium on Cloud Computing, pp. 319–334, 2021

  7. [7]

    Korte and J

    B. Korte and J. Vygen,Combinatorial Optimization. Springer Berlin Heidelberg, 2018

  8. [8]

    Ferdous,Algorithms for Degree-Constrained Subgraphs and Appli- cations

    S. Ferdous,Algorithms for Degree-Constrained Subgraphs and Appli- cations. PhD thesis, Purdue University, 2021

  9. [9]

    An efficient reduction technique for degree-constrained subgraph and bidirected network flow problems,

    H. N. Gabow, “An efficient reduction technique for degree-constrained subgraph and bidirected network flow problems,” inProceedings of the fifteenth annual ACM symposium on Theory of computing, pp. 448–456, 1983

  10. [10]

    Kunpeng 920: The first 7-nm chiplet-based 64-core arm soc for cloud services,

    J. Xia, C. Cheng, X. Zhou, Y . Hu, and P. Chun, “Kunpeng 920: The first 7-nm chiplet-based 64-core arm soc for cloud services,”IEEE Micro, vol. 41, no. 5, pp. 67–75, 2021

  11. [11]

    Enhanced hypercubes,

    N.-F. Tzeng, “Enhanced hypercubes,”IEEE Transactions on Computers, vol. 40, no. 03, pp. 284–294, 1991

  12. [12]

    The crossed cube architecture for parallel computation,

    K. Efe, “The crossed cube architecture for parallel computation,”IEEE Transactions on Parallel & Distributed Systems, vol. 3, no. 05, pp. 513– 524, 1992. 9

  13. [13]

    Topological properties of the crossed cube architecture,

    K. Efe, P. Blackwell, W. Slough, and T. Shiau, “Topological properties of the crossed cube architecture,”Parallel Computing, vol. 20, no. 12, pp. 1763–1775, 1994

  14. [14]

    New intel® xeon® scalable processor family (skylake- sp)

    A. Kumar, “New intel® xeon® scalable processor family (skylake- sp).” https://old.hotchips.org/wp-content/uploads/hc archives/ hc29/HC29.22-Tuesday-Pub/HC29.22.90-Server-Pub/HC29.22. 930-Xeon-Skylake-sp-Kumar-Intel.pdf, 2017. [Online; accessed 2-March-2026]

  15. [15]

    Socket sp3 platform numa topology for amd family 19h models 00h–0fh

    I. Advanced Micro Devices, “Socket sp3 platform numa topology for amd family 19h models 00h–0fh.” https://www.amd.com/content/ dam/amd/en/documents/processor-tech-docs/design-guides/56795 1 00-PUB.pdf, 2021. [Online; accessed 2-October-2025]

  16. [16]

    A short proof of the factor theorem for finite graphs,

    W. T. Tutte, “A short proof of the factor theorem for finite graphs,” Canadian Journal of mathematics, vol. 6, pp. 347–352, 1954

  17. [17]

    Learning to schedule multi-numa virtual machines via reinforcement learning,

    J. Sheng, Y . Hu, W. Zhou, L. Zhu, B. Jin, J. Wang, and X. Wang, “Learning to schedule multi-numa virtual machines via reinforcement learning,”Pattern Recognition, vol. 121, p. 108254, 2022

  18. [18]

    Topology-aware virtual machine placement for improving cloud servers resource utilization,

    D. Ma, X. Cao, J. Hu, T. Xia, Y . Zhou, K. Liu, L. Zhu, L. Su, and F. Gao, “Topology-aware virtual machine placement for improving cloud servers resource utilization,”Future Generation Computer Systems, p. 108361, 2026

  19. [19]

    Virtual machine consol- idation for numa systems: A hybrid heuristic grey wolf approach,

    K. Hu, W. Lin, T. Huang, K. Li, and L. Ma, “Virtual machine consol- idation for numa systems: A hybrid heuristic grey wolf approach,” in 2020 IEEE 26th International Conference on Parallel and Distributed Systems (ICPADS), pp. 569–576, IEEE, 2020

  20. [20]

    Generalized resource allocation for the cloud,

    A. Rai, R. Bhagwan, and S. Guha, “Generalized resource allocation for the cloud,” inproceedings of the Third ACM Symposium on Cloud Computing, pp. 1–12, 2012

  21. [21]

    Aws instance price guide

    “Aws instance price guide.” https://instaguide.io/. [Accessed 11-03- 2026]

  22. [22]

    Implementing weighted b- matching algorithms: insights from a computational study,

    M. M ¨uller-Hannemann and A. Schwartz, “Implementing weighted b- matching algorithms: insights from a computational study,”Journal of Experimental Algorithmics (JEA), vol. 5, pp. 8–es, 2000

  23. [23]

    Multiple hungarian method for k-assignment problem,

    B. Gabrov ˇsek, T. Novak, J. Povh, D. Rupnik Poklukar, and J. ˇZerovnik, “Multiple hungarian method for k-assignment problem,”Mathematics, vol. 8, no. 11, p. 2050, 2020

  24. [24]

    On realizability of a set of integers as degrees of the vertices of a linear graph. i,

    S. L. Hakimi, “On realizability of a set of integers as degrees of the vertices of a linear graph. i,”Journal of the society for industrial and applied mathematics, vol. 10, no. 3, pp. 496–506, 1962