Making cloud spot instance interruption events visible
3 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.DC (3)
Years: 2026 (3)
Representative citing papers:
An AI-driven multi-region spot fleet provisioning system predicts costs with 99.79% accuracy and delivers up to 64% savings by exploiting regional price differences.
citing papers explorer
- ShuntServe: Cost-Efficient LLM Serving on Heterogeneous Spot GPU Clusters
  ShuntServe reports 1.42x and 1.35x higher throughput than baselines, plus 31.9 percent and 31.2 percent cost-efficiency gains over on-demand instances, for Llama-3.1-70B and Qwen3-32B on heterogeneous AWS spot clusters.
- AI-Driven Multi-Region Provisioning for Cloud Services Using Spot Fleets
  An AI-driven multi-region spot fleet provisioning system predicts costs with 99.79% accuracy and delivers up to 64% savings by exploiting regional price differences.
- Ding-Dong Ditch: Peeking Into Spot Instance Availability