BACC: Budget-Aware Calibration and Control for Horizontal Autoscaling
Pith reviewed 2026-07-01 07:45 UTC · model grok-4.3
The pith
BACC uses a PI controller, driven by the observed pace of budget consumption, to adjust autoscaling aggressiveness, and it tracks violation targets with a mean absolute gap below 0.5 percentage points.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BACC separates workload prediction, online uncertainty calibration with Adaptive Conformal Inference (ACI), and budget-paced capacity control. A proportional-integral controller continuously modulates how aggressively replicas are added or removed according to the observed pace of budget consumption. Across five Azure Functions traces, three compliance levels, and two forecasters, the mean absolute compliance gap is 0.44 percentage points with ARIMA and 0.42 percentage points with Chronos. The same controller also raises CPU-threshold compliance over native HPA in cluster experiments that include deployment effects.
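The calibration step can be illustrated with the standard ACI update from Gibbs and Candès (2021), which lowers the working miscoverage level after a miss and raises it after a covered step. The sketch below is a minimal illustration, not the paper's implementation: the function names, the step size `gamma`, the use of signed residuals as the nonconformity score, and the clipping of the quantile level are all assumptions made here.

```python
import math


def aci_step(alpha_t, covered, target_alpha, gamma=0.01):
    """One ACI update: alpha_{t+1} = alpha_t + gamma * (target_alpha - err_t).

    err_t is 1 when the last observation fell outside the predicted bound
    and 0 otherwise. gamma is an assumed step size.
    """
    err = 0.0 if covered else 1.0
    return alpha_t + gamma * (target_alpha - err)


def upper_bound(forecast, residuals, alpha_t):
    """Point forecast plus the empirical (1 - alpha_t) quantile of past residuals.

    Residuals are assumed to be (actual - forecast) from earlier steps.
    The quantile level is clipped to [0, 1] because alpha_t may drift
    outside that range under the ACI update.
    """
    if not residuals:
        return forecast
    level = min(max(1.0 - alpha_t, 0.0), 1.0)
    ordered = sorted(residuals)
    idx = min(len(ordered) - 1, max(0, math.ceil(level * len(ordered)) - 1))
    return forecast + ordered[idx]
```

Because the update only needs a covered-or-missed flag, it can wrap any forecaster that emits a point prediction, which is consistent with the model-agnostic framing above.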
What carries the argument
The proportional-integral controller that raises or lowers provisioning aggressiveness in proportion to the observed rate of violation-budget consumption, layered on top of an ACI-calibrated forecaster.
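A minimal sketch of such a budget-paced PI loop follows. It is an illustration under assumptions rather than the paper's controller: the class name, the gains `kp` and `ki`, the error definition (observed violation rate minus target rate), and the clipping bounds are hypothetical. The output is a per-step risk level that a calibrated forecaster could consume; a higher risk level means leaner provisioning.

```python
from dataclasses import dataclass


@dataclass
class BudgetPacedPI:
    """Hypothetical PI controller that maps budget pace to a risk level."""

    target_rate: float       # allowed violation fraction for the period
    kp: float = 0.5          # assumed proportional gain
    ki: float = 0.1          # assumed integral gain
    integral: float = 0.0    # accumulated pace error

    def step(self, violations, elapsed_steps):
        """Return the risk level to use for the next provisioning decision."""
        if elapsed_steps <= 0:
            return self.target_rate
        observed_rate = violations / elapsed_steps
        # Positive error means the budget is being consumed too fast.
        error = observed_rate - self.target_rate
        self.integral += error
        risk = self.target_rate - self.kp * error - self.ki * self.integral
        # Keep the risk level in a sane range (bounds are assumptions).
        return min(max(risk, 0.001), 0.5)
```

With a 5% target, a period that has seen no violations after 10 steps yields a risk level above 5%, while one that has already seen 2 violations in 10 steps is pushed down to the lower bound.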
If this is right
- When budget consumption is slow, the controller provisions less conservatively, which reduces unnecessary replicas.
- When consumption accelerates, the controller provisions more conservatively to protect the remaining budget.
- The same controller logic applies unchanged to any forecaster because calibration and control are kept separate.
- In real Kubernetes deployments the controller achieves better CPU-threshold compliance than native HPA despite measurement delay and replica-readiness lag.
Where Pith is reading between the lines
- The separation of calibration and control could be reused for other resource types if a comparable period-level budget metric is defined.
- The approach may lower average over-provisioning in services whose SLOs are expressed as period violation budgets rather than instantaneous thresholds.
- Extending the controller with an explicit model of replica spin-up time could further reduce the compliance gap under high churn.
Load-bearing premise
That the observed pace of budget consumption supplies enough information for the proportional-integral controller to set the right aggressiveness level for any forecaster and any deployment dynamics without further system modeling.
What would settle it
A controlled experiment in which the controller's aggressiveness adjustments produce realized violation rates more than one percentage point away from the target despite accurate real-time budget tracking would falsify the central claim.
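The check described above reduces to comparing a realized violation rate with its target. The sketch below shows one way to express it; the function names and the boolean-per-step input format are assumptions, and the one-percentage-point tolerance is taken from the criterion stated above. It does not verify the side condition that budget tracking was accurate.

```python
def compliance_gap_pp(violation_flags, target_violation_rate):
    """Absolute gap, in percentage points, between realized and target rates."""
    realized = sum(violation_flags) / len(violation_flags)
    return abs(realized - target_violation_rate) * 100.0


def exceeds_tolerance(violation_flags, target_violation_rate, tolerance_pp=1.0):
    """True when the realized rate misses the target by more than tolerance_pp."""
    return compliance_gap_pp(violation_flags, target_violation_rate) > tolerance_pp
```

For example, 3 violations in 100 steps against a 5% target gives a gap of about 2 percentage points, which would exceed the tolerance.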
Original abstract
Cloud services must continuously adapt replica counts to fluctuating demand while respecting fixed-period reliability budgets. Many horizontal autoscalers either react to instantaneous utilization or provision against a fixed predictive risk target. These policies do not explicitly account for how much of the period-level violation budget has already been consumed, so they can be overly conservative when the budget is healthy and insufficiently conservative when the budget is being depleted. We present BACC, a model-agnostic framework for budget-aware horizontal autoscaling. BACC separates three concerns that are often entangled in prior systems: workload prediction, online uncertainty calibration, and budget-paced capacity control. It wraps an arbitrary forecaster with Adaptive Conformal Inference (ACI) to calibrate workload uncertainty online, then uses a proportional-integral controller to adjust provisioning aggressiveness based on the observed pace of budget consumption. We instantiate BACC for CPU-threshold-based horizontal autoscaling in Kubernetes and evaluate it through trace-driven simulation and cluster replay experiments. Across five Azure Functions traces, three compliance levels, and two forecasting backends, BACC tracks the requested violation target closely, achieving mean absolute compliance gaps of 0.44 and 0.42 percentage points with ARIMA and Chronos, respectively. The Kubernetes experiments further show that the same controller improves CPU-threshold compliance over native HPA under deployment effects such as measurement delay and replica readiness.
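For the CPU-threshold instantiation, a calibrated upper bound on demand has to be turned into a replica count. The sketch below shows one plausible conversion, assuming demand is expressed in CPU millicores and each pod should stay at or below a target utilization of its CPU request. The function name, parameters, and replica limits are hypothetical and do not come from the paper.

```python
import math


def replicas_for_demand(cpu_upper_millicores, pod_request_millicores,
                        target_utilization, min_replicas=1, max_replicas=100):
    """Smallest replica count that keeps per-pod CPU at or below the threshold.

    cpu_upper_millicores is an assumed upper bound on total CPU demand,
    for example a calibrated forecast quantile.
    """
    per_pod_capacity = pod_request_millicores * target_utilization
    needed = math.ceil(cpu_upper_millicores / per_pod_capacity)
    return max(min_replicas, min(max_replicas, needed))
```

With a 1800-millicore bound, 500-millicore pod requests, and a 50% target, each pod can absorb 250 millicores, so 8 replicas are needed.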
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents BACC, a model-agnostic framework for budget-aware horizontal autoscaling in cloud systems. It decouples workload forecasting from online uncertainty calibration via Adaptive Conformal Inference (ACI) and from capacity control via a proportional-integral (PI) controller that modulates provisioning aggressiveness according to the observed pace of violation-budget consumption. Trace-driven simulations on five Azure Functions traces across three compliance targets and two forecasters (ARIMA, Chronos) report mean absolute compliance gaps of 0.44 and 0.42 percentage points; Kubernetes replay experiments show improved CPU-threshold compliance relative to native HPA under measurement delay and replica-readiness effects.
Significance. If the empirical results hold, the work supplies a practical, forecaster-agnostic mechanism for explicitly managing fixed-period reliability budgets, which prior reactive or fixed-risk autoscalers do not address. The clean separation of prediction, ACI calibration, and budget-paced PI control is a coherent architectural contribution, and the combination of trace-driven simulation with real-cluster replay provides relevant evidence for deployment relevance.
major comments (2)
- [Evaluation section] The central claim of close target tracking (mean absolute gaps of 0.44 and 0.42 pp) is presented without error bars, per-trace standard deviations, or statistical significance tests across the five traces and three compliance levels; this weakens the ability to judge the consistency of the result.
- [Design / Controller subsection] The PI controller's reliance on observed budget-consumption pace as the sole feedback signal is load-bearing for the claim of robustness across forecasters and deployment dynamics, yet no sensitivity analysis or ablation on controller gains or alternative feedback signals is reported.
minor comments (3)
- [§3] Notation for the ACI nonconformity score and the budget-consumption rate should be defined once in a single table or equation block rather than reintroduced in multiple sections.
- [Evaluation figures] Figure captions for the Kubernetes replay results should explicitly state the number of independent runs and the exact compliance target used in each panel.
- [Introduction / Related Work] The manuscript would benefit from a short related-work paragraph contrasting BACC with prior budget-aware or risk-aware autoscalers that also employ conformal methods.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and recommendation for minor revision. We address each major comment below.
Point-by-point responses
- Referee: [Evaluation section] The central claim of close target tracking (mean absolute gaps of 0.44 and 0.42 pp) is presented without error bars, per-trace standard deviations, or statistical significance tests across the five traces and three compliance levels; this weakens the ability to judge the consistency of the result.
  Authors: We agree that additional statistical detail would strengthen the evaluation. In the revised manuscript we will report per-trace compliance gaps, standard deviations across the five traces and three compliance targets, and error bars on the aggregate means. We will also include statistical significance tests (e.g., Wilcoxon signed-rank) with the explicit caveat that the small number of traces limits statistical power. Revision: yes.
- Referee: [Design / Controller subsection] The PI controller's reliance on observed budget-consumption pace as the sole feedback signal is load-bearing for the claim of robustness across forecasters and deployment dynamics, yet no sensitivity analysis or ablation on controller gains or alternative feedback signals is reported.
  Authors: We acknowledge the value of sensitivity analysis for the PI gains. The revised manuscript will add a dedicated subsection that varies the proportional and integral coefficients over a reasonable range and reports the resulting compliance gaps for both forecasters. We retain the position that budget-consumption pace is the natural feedback signal because it directly encodes the remaining reliability budget, but the new analysis will quantify robustness to gain choices. Revision: yes.
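The per-trace reporting promised in the first response could be produced along the following lines. This is a sketch only: the function name and the input layout (one list of compliance gaps per trace, one entry per compliance target) are assumptions, and any values passed in would have to come from the paper's own experiments.

```python
from statistics import mean, stdev


def summarize_gaps(gaps_by_trace):
    """Per-trace mean gap, overall mean, and standard deviation across traces.

    gaps_by_trace maps a trace name to its compliance gaps in percentage
    points. At least two traces are required for the standard deviation.
    """
    per_trace = {name: mean(gaps) for name, gaps in gaps_by_trace.items()}
    values = list(per_trace.values())
    return {
        "per_trace_mean": per_trace,
        "overall_mean": mean(values),
        "across_trace_std": stdev(values),
    }
```

Reporting the across-trace standard deviation next to the aggregate mean makes it visible whether a small average gap hides one poorly tracked trace.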
Circularity Check
No significant circularity; empirical controller design is self-contained
The paper describes BACC as a model-agnostic framework that applies Adaptive Conformal Inference (ACI) for online uncertainty calibration around an arbitrary forecaster, followed by a standard proportional-integral controller driven by observed budget-consumption pace. No derivation chain reduces a claimed prediction or result to a fitted quantity defined from the same evaluation data; the reported compliance gaps (0.44/0.42 pp) are measured outcomes from trace-driven and Kubernetes experiments rather than outputs forced by construction. ACI and PI control are standard techniques invoked without load-bearing self-citation chains or uniqueness theorems from the authors' prior work. The separation of prediction, calibration, and budget-paced control is presented as a design choice whose performance is externally validated against native HPA and multiple forecasters, leaving the central claims independent of the inputs used for evaluation.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Adaptive Conformal Inference can be wrapped around an arbitrary forecaster to produce valid online uncertainty sets for workload prediction.