AI-Driven Multi-Region Provisioning for Cloud Services Using Spot Fleets

Enrique Molina-Giménez; Javier Fabra; Pedro García-López

arxiv: 2605.22778 · v1 · pith:GVCK33NPnew · submitted 2026-05-21 · 💻 cs.DC

Pith reviewed 2026-05-22 03:08 UTC · model grok-4.3

classification 💻 cs.DC

keywords: spot fleets, multi-region provisioning, cloud cost optimization, predictive models, EC2 Spot Service, fleet configuration, price variability

The pith

An AI-driven service estimates multi-region spot fleet configurations and prices before launch to enable cost-aware decisions while preserving EC2 Spot Service behavior.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an AI-driven provisioning service for spot fleets that span multiple cloud regions. It monitors existing provisioning plans and feeds the data into predictive models that forecast fleet sizes, prices, and configurations ahead of actual deployment. This setup supports choosing cheaper regions without changing how the underlying EC2 Spot Service operates. Validation experiments used fleets as large as 1500 vCPUs and reported 99.79 percent prediction accuracy together with potential savings reaching 64 percent by taking advantage of price differences between regions.
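The cost-aware choice described above can be illustrated with a minimal sketch. It assumes each region already has a pre-launch estimate of reachable capacity and hourly cost; the `FleetEstimate` type, the `cheapest_region` helper, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FleetEstimate:
    """Hypothetical pre-launch estimate for one region (not a type from the paper)."""
    region: str
    predicted_vcpus: int          # capacity the model expects the fleet to reach
    predicted_hourly_cost: float  # USD per hour for the whole fleet


def cheapest_region(estimates: Iterable[FleetEstimate], target_vcpus: int) -> FleetEstimate:
    """Return the lowest-cost estimate that still meets the target capacity."""
    feasible = [e for e in estimates if e.predicted_vcpus >= target_vcpus]
    if not feasible:
        raise ValueError(f"no region is predicted to supply {target_vcpus} vCPUs")
    return min(feasible, key=lambda e: e.predicted_hourly_cost)


# Illustrative numbers only; they are not measurements from the paper.
estimates = [
    FleetEstimate("us-east-1", 1500, 21.40),
    FleetEstimate("eu-west-1", 1500, 17.95),
    FleetEstimate("ap-south-1", 1200, 12.10),  # cheapest, but short of the target
]
best = cheapest_region(estimates, target_vcpus=1500)
print(best.region)  # eu-west-1
```

Filtering on predicted capacity before comparing cost is a design choice of this sketch: a region that looks cheap but cannot supply the requested vCPUs would otherwise win the comparison.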

Core claim

The central claim is that combining monitoring of provisioning plans with predictive models produces accurate pre-launch estimates of fleet configurations and prices, allowing cost-aware multi-region deployment decisions while leaving the operational behavior of the EC2 Spot Service unchanged.

What carries the argument

The AI-driven provisioning service that monitors provisioning plans and applies predictive models to estimate fleet configurations and prices before launch.

If this is right

Provisioning decisions can now incorporate regional price variability instead of being limited to a single region.
Pre-launch cost estimates reach 99.79 percent accuracy relative to the actual EC2 Spot Service outcome.
Fleets up to 1500 vCPUs can be provisioned with potential cost reductions of up to 64 percent.
The original allocation strategies and interruption handling of the EC2 Spot Service remain unchanged.
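The two headline percentages can be read through simple ratios. The abstract does not state how accuracy or savings are computed, so the definitions below (accuracy as one minus relative error, savings relative to a single-region baseline) are assumptions for illustration, and the inputs are chosen only to land near the reported figures.

```python
def prediction_accuracy(predicted_cost: float, actual_cost: float) -> float:
    """Accuracy as 1 minus relative error, in percent (an assumed definition)."""
    if actual_cost <= 0:
        raise ValueError("actual_cost must be positive")
    return (1.0 - abs(predicted_cost - actual_cost) / actual_cost) * 100.0


def cost_savings(baseline_cost: float, chosen_cost: float) -> float:
    """Savings of the chosen region against a single-region baseline, in percent."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return (baseline_cost - chosen_cost) / baseline_cost * 100.0


# Illustrative inputs, not measured data from the paper.
accuracy = prediction_accuracy(predicted_cost=99.79, actual_cost=100.0)
savings = cost_savings(baseline_cost=100.0, chosen_cost=36.0)
print(round(accuracy, 2), round(savings, 2))
```

Under these assumed definitions, a predicted fleet cost within 0.21 percent of the actual cost corresponds to the reported accuracy, and a fleet costing 36 percent of the baseline corresponds to the reported upper bound on savings.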

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar monitoring-plus-prediction loops could be applied to other variable-price cloud resources such as preemptible VMs on different providers.
Real-time price feeds could be added to the models to further reduce exposure during sudden regional spikes.
The approach might be combined with existing orchestration platforms so that users gain the savings without altering their deployment scripts.

Load-bearing premise

The predictive models trained on monitored provisioning data will generalize accurately to unseen fleet sizes, regions, and price fluctuations without post-hoc adjustments or overfitting to the specific validation workloads of up to 1500 vCPUs.

What would settle it

A deployment test on a fleet whose size, region set, or price pattern lies outside the monitored training data that shows prediction accuracy falling well below 99.79 percent or that fails to deliver the claimed cost savings.

Figures

Figures reproduced from arXiv: 2605.22778 by Enrique Molina-Giménez, Javier Fabra, Pedro García-López.

**Figure 1.** Maximum price difference between AWS availability zones. Plot labels recovered from the extraction: x-axis "Fleet size (#vCPU)", y-axis "Deny of fleet (%)", series "HF" and "EC2 Spot Service".

**Figure 3.** Deployment overview of the proposed AI-driven provisioning service on ...

**Figure 4.** Estimated fleet cost across regions for different target capacities.

Original abstract

Cloud service platforms increasingly rely on elastic infrastructures to support dynamic workloads. Spot instances provide discounted computing resources but introduce uncertainty due to dynamic pricing, resource availability, and interruption risks that vary across geographical regions. In Amazon Web Services, the EC2 Spot Service simplifies fleet provisioning through allocation strategies, but it cannot estimate fleet costs before deployment and restricts provisioning to a single region. This paper presents an AI-driven provisioning service for multi-region spot fleets. The proposed approach combines monitoring of provisioning plans with predictive models to estimate fleet configurations and prices before launch, enabling cost-aware deployment decisions across regions while preserving the operational behavior of the EC2 Spot Service. The system was validated with fleets of up to 1500 vCPUs. Experimental results show a prediction accuracy of 99.79% compared to the EC2 Spot Service and potential cost savings of up to 64% by exploiting regional price variability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes an AI-driven provisioning service for multi-region AWS EC2 Spot fleets. It combines monitoring of provisioning plans with predictive models to estimate fleet configurations and prices before launch, enabling cost-aware deployment decisions across regions while preserving the operational behavior and semantics of the EC2 Spot Service. Validation is performed on fleets of up to 1500 vCPUs, with reported results of 99.79% prediction accuracy relative to the EC2 Spot Service and potential cost savings of up to 64% from exploiting regional price variability.

Significance. If the predictive models can be shown to generalize reliably, the work addresses a practical gap in the EC2 Spot Service by supporting pre-deployment cost estimation and multi-region decisions without changing allocation or interruption behavior. This could enable measurable cost reductions for elastic cloud workloads. The manuscript does not report machine-checked proofs, open reproducible code, or parameter-free derivations.

major comments (2)

[Experimental Evaluation] Experimental Evaluation section: The 99.79% prediction accuracy and 64% savings claims rest on validation limited to fleets of up to 1500 vCPUs using monitored provisioning data. The text provides no description of the train/test split, hold-out regions, AZs, or price regimes outside the training distribution, nor any cross-validation or temporal hold-out procedure. Because spot prices and availability are region-specific and non-stationary, this detail is load-bearing for the generalization claim.
[System Architecture] System Architecture section: The claim that the approach 'preserves the operational behavior of the EC2 Spot Service' is central but lacks concrete mechanisms showing how the predictive models are inserted without altering allocation strategies, interruption handling, or fleet semantics. No pseudocode, interface specification, or equivalence argument is supplied.
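The hold-out procedures requested in the first major comment can be sketched as follows. This is a minimal illustration, assuming monitored data arrives as timestamped per-region price observations; the `Observation` type and both helpers are hypothetical and do not describe the authors' pipeline.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Observation:
    """Hypothetical monitored sample: when, where, and the observed spot price."""
    timestamp: datetime
    region: str
    price: float


def temporal_split(
    data: Sequence[Observation], cutoff: datetime
) -> Tuple[List[Observation], List[Observation]]:
    """Train on samples before the cutoff, test on samples at or after it."""
    train = [o for o in data if o.timestamp < cutoff]
    test = [o for o in data if o.timestamp >= cutoff]
    return train, test


def region_holdout(
    data: Sequence[Observation], held_out: str
) -> Tuple[List[Observation], List[Observation]]:
    """Train on every region except one, test on the unseen region."""
    train = [o for o in data if o.region != held_out]
    test = [o for o in data if o.region == held_out]
    return train, test


# Illustrative samples only.
data = [
    Observation(datetime(2025, 1, 1), "us-east-1", 0.031),
    Observation(datetime(2025, 2, 1), "eu-west-1", 0.027),
    Observation(datetime(2025, 3, 1), "us-east-1", 0.035),
    Observation(datetime(2025, 4, 1), "eu-west-1", 0.029),
]
train_t, test_t = temporal_split(data, cutoff=datetime(2025, 3, 1))
train_r, test_r = region_holdout(data, held_out="eu-west-1")
```

Splitting by time rather than at random matters here because spot prices are non-stationary: a random split would let the model see samples from the same period it is later tested on.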

minor comments (2)

[Abstract] The abstract and introduction should explicitly state the machine-learning algorithms, feature sets, and training data sources used for the predictive models.
[Results] Cost-savings calculations in the results should include the exact baseline (single-region Spot Service) and the regional price traces or time windows used to obtain the 64% figure.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We respond to each major comment below and will revise the manuscript to address the identified gaps.

Referee: Experimental Evaluation section: The 99.79% prediction accuracy and 64% savings claims rest on validation limited to fleets of up to 1500 vCPUs using monitored provisioning data. The text provides no description of the train/test split, hold-out regions, AZs, or price regimes outside the training distribution, nor any cross-validation or temporal hold-out procedure. Because spot prices and availability are region-specific and non-stationary, this detail is load-bearing for the generalization claim.

Authors: We agree that the Experimental Evaluation section requires additional detail on validation procedures to support the generalization claims. The manuscript currently omits explicit descriptions of these aspects. In the revision we will add a dedicated subsection that specifies the temporal train/test split (earlier historical periods for training, later periods for testing), k-fold cross-validation, and evaluations on hold-out regions, AZs, and price regimes outside the training distribution. These additions will directly address concerns about non-stationarity and region-specific behavior. revision: yes
Referee: System Architecture section: The claim that the approach 'preserves the operational behavior of the EC2 Spot Service' is central but lacks concrete mechanisms showing how the predictive models are inserted without altering allocation strategies, interruption handling, or fleet semantics. No pseudocode, interface specification, or equivalence argument is supplied.

Authors: We acknowledge that the System Architecture section would benefit from more explicit mechanisms and artifacts. The predictive models operate exclusively in a pre-deployment monitoring and estimation phase; actual fleet creation, allocation, and interruption handling are delegated to the unmodified EC2 Spot Service. In the revised manuscript we will insert pseudocode for the overall workflow, an interface specification between the prediction layer and the EC2 APIs, and a brief equivalence argument demonstrating that allocation strategies and fleet semantics remain unchanged. revision: yes
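The separation the rebuttal describes, with prediction before launch and provisioning delegated unchanged, can be sketched as a thin wrapper. This is a sketch under stated assumptions: `predict_cost` and `launch_fleet` are hypothetical callables standing in for the prediction layer and the cloud provider's fleet API, and the request fields are illustrative.

```python
from typing import Callable, Dict, Mapping, Sequence, Tuple

FleetRequest = Mapping[str, object]  # hypothetical request shape


def provision(
    request: FleetRequest,
    regions: Sequence[str],
    predict_cost: Callable[[str, FleetRequest], float],
    launch_fleet: Callable[[str, FleetRequest], str],
) -> Tuple[str, str, Dict[str, float]]:
    """Estimate the cost in every region, then launch in the cheapest one.

    The request is handed to launch_fleet as received, so allocation strategy
    and interruption handling stay with the underlying service.
    """
    if not regions:
        raise ValueError("at least one candidate region is required")
    estimates = {region: predict_cost(region, request) for region in regions}
    chosen = min(estimates, key=lambda region: estimates[region])
    fleet_id = launch_fleet(chosen, request)
    return chosen, fleet_id, estimates


# Stubs for illustration; a real deployment would call the provider's API here.
launched = []


def fake_predict(region: str, request: FleetRequest) -> float:
    return {"us-east-1": 21.40, "eu-west-1": 17.95}[region]


def fake_launch(region: str, request: FleetRequest) -> str:
    launched.append((region, dict(request)))
    return f"fleet-in-{region}"


request = {"target_vcpus": 1500, "allocation_strategy": "price-capacity-optimized"}
chosen, fleet_id, estimates = provision(
    request, ["us-east-1", "eu-west-1"], fake_predict, fake_launch
)
print(chosen, fleet_id)  # eu-west-1 fleet-in-eu-west-1
```

Because the wrapper only selects a region and never rewrites the request, an equivalence argument of the kind the referee asks for reduces to showing that the launch call receives the same parameters it would have received without the prediction layer.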

Circularity Check

0 steps flagged

No significant circularity; results benchmarked against external EC2 Spot Service

The paper describes monitoring provisioning plans then training predictive models to estimate multi-region fleet configurations and prices in advance. Experimental validation reports 99.79% accuracy and up to 64% savings on fleets up to 1500 vCPUs by direct comparison to the actual EC2 Spot Service allocation and pricing behavior. No equations, self-citations, or derivation steps in the provided text reduce the claimed predictions to fitted inputs by construction or to author-only uniqueness theorems. The central results rest on external service behavior rather than internal redefinition, satisfying the criteria for a self-contained, externally falsifiable evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no identifiable free parameters, axioms, or invented entities; all technical details remain opaque.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear

Paper passage: "multi-region provisioning... exploiting regional price variability... up to 64% cost savings"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages

[1] N. Chohan, C. Castillo, M. Spreitzer, M. Steinder, A. Tantawi, and C. Krintz, “See spot run: Using spot instances for MapReduce workflows,” in 2nd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 10), 2010.
[2] K. Lee and M. Son, “Deepspotcloud: Leveraging cross-region gpu spot instances for deep learning,” in 2017 IEEE 10th International Conference on Cloud Computing (CLOUD), 2017, pp. 98–105.
[3] P. Sharma, T. Guo, X. He, D. Irwin, and P. Shenoy, “Flint: Batch-interactive data-intensive processing on transient servers,” in Proceedings of the Eleventh European Conference on Computer Systems, 2016, pp. 1–15.
[4] J. Fabra, J. Ezpeleta, and P. Álvarez, “Reducing the price of resource provisioning using ec2 spot instances with prediction models,” Future Generation Computer Systems, vol. 96, pp. 348–367, 2019. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0167739X1831166X
[5] A. Baldominos Gómez, Y. Saez, D. Quintana, and P. Isasi, “Aws predspot: Machine learning for predicting the price of spot instances in aws cloud,” 2022.
[6] X. Song, R. Lin, and H. Zou, “Amazon ec2 spot price prediction using temporal convolution network,” in ICETIS 2022; 7th International Conference on Electronic Technology and Information Science. VDE, 2022, pp. 1–6.
[7] K. Kim and K. Lee, “Making cloud spot instance interruption events visible,” in Proceedings of the ACM Web Conference 2024, ser. WWW ’24. New York, NY, USA: Association for Computing Machinery, 2024, pp. 2998–3009. [Online]. Available: https://doi.org/10.1145/3589334.3645548
[8] S. Lee, J. Hwang, and K. Lee, “Spotlake: Diverse spot instance dataset archive service,” in 2022 IEEE International Symposium on Workload Characterization (IISWC), 2022, pp. 242–255.
[9] P. Varshney and Y. Simmhan, “Autobot: Resilient and cost-effective scheduling of a bag of tasks on spot vms,” IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 7, pp. 1512–1527, 2018.
[10] M. Son, G. G. Akbulut, and M. T. Kandemir, “Spotverse: Optimizing bioinformatics workflows with multi-region spot instances in galaxy and beyond,” in Proceedings of the 25th International Middleware Conference, ser. Middleware ’24. New York, NY, USA: Association for Computing Machinery, 2024, pp. 74–87. [Online]. Available: https://doi.org/10.1145/3652892.3... (doi:10.1145/3652892.3700750)
[11] Z. Yang, Z. Wu, M. Luo, W.-L. Chiang, R. Bhardwaj, W. Kwon, S. Zhuang, F. S. Luan, G. Mittal, S. Shenker et al., “SkyPilot: An intercloud broker for sky computing,” in 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23), 2023, pp. 437–455.
[12] Z. Mao, T. Xia, Z. Wu, W.-L. Chiang, T. Griggs, R. Bhardwaj, Z. Yang, S. Shenker, and I. Stoica, “Skyserve: Serving ai models across regions and clouds with spot instances,” 2024. [Online]. Available: https://arxiv.org/abs/2411.01438
