Genetic Programming for Self-Adaptive Auto-Scaling of Microservices
Pith reviewed 2026-05-09 14:05 UTC · model grok-4.3
The pith
AutoSLO applies genetic programming inside a monitoring feedback loop to evolve scaling policies that proactively meet SLOs in microservices while lowering overall resource consumption.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AutoSLO is a learning-based, self-adaptive scaling framework that dynamically adjusts microservice replicas to meet SLOs while minimizing resource usage. It uses a continuous monitoring-adaptation feedback loop and leverages genetic programming to learn and evolve scaling logic, enabling the deployed microservice system to proactively prevent SLO violations rather than repeatedly searching for one-off scaling actions.
What carries the argument
The continuous monitoring-adaptation feedback loop that supplies runtime data to genetic programming for evolving scaling policies.
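As a rough illustration of what such a loop looks like, the sketch below alternates a monitoring step and an adaptation step over a recorded load trace. Everything in it is an assumption made for illustration: the toy latency model, the 80 ms and 200 ms thresholds, the load trace, and the hand-written `threshold_policy` that stands in for whatever policy the paper's genetic programming would evolve. It is not the authors' implementation.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]
# A policy maps (observed metrics, current replicas) to a desired replica count.
Policy = Callable[[Metrics, int], int]


def simulate_latency(load_rps: float, replicas: int) -> float:
    """Toy latency model (assumption): latency grows with per-replica load."""
    return 50.0 + 2.0 * (load_rps / replicas)


def threshold_policy(metrics: Metrics, replicas: int) -> int:
    """Hypothetical stand-in for an evolved policy: react to p95 latency."""
    if metrics["p95_latency_ms"] > 200.0:
        return replicas + 1
    if metrics["p95_latency_ms"] < 80.0 and replicas > 1:
        return replicas - 1
    return replicas


def run_loop(policy: Policy, load_trace: List[float], replicas: int = 1) -> List[int]:
    """Alternate monitoring and adaptation, returning the replica history."""
    history = []
    for load in load_trace:
        # Monitor: collect runtime metrics for the current deployment.
        metrics = {"p95_latency_ms": simulate_latency(load, replicas), "load_rps": load}
        # Adapt: let the policy choose the next replica count (never below 1).
        replicas = max(1, policy(metrics, replicas))
        history.append(replicas)
    return history


trace = [20, 40, 80, 160, 160, 80, 20, 20]  # requests/s, made-up workload
history = run_loop(threshold_policy, trace)
```

In a learning-based variant, the policy passed to `run_loop` would be replaced over time by newly evolved candidates, while the metrics gathered in the monitoring step feed the search.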
If this is right
- Resource usage drops substantially compared with standard reactive scaling.
- SLO violations occur at low frequency and are resolved within a short time window.
- Scaling decisions shift from reactive fixes to proactive policy evolution.
- The same framework applies to both conventional web platforms and LLM-based chatbots.
Where Pith is reading between the lines
- The evolved policies might transfer to other container orchestration settings such as serverless workloads.
- Cloud operators could reduce manual threshold tuning if the genetic-programming approach proves stable over long periods.
- Extending the loop to include cost metrics alongside SLOs could further optimize spending.
Load-bearing premise
Genetic programming can reliably evolve scaling policies that generalize across workloads and prevent SLO violations in live systems without introducing instability or excessive adaptation overhead.
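To make the premise concrete, here is a minimal genetic-programming sketch in plain Python. It borrows the settings named in the simulated rebuttal further down this page (expression-tree policies, a weighted fitness over SLO violations, latency, and resource cost, subtree crossover and mutation, population 50, up to 100 generations or convergence). Everything else is an assumption for illustration: the function set, the terminals `load` and `replicas`, the toy latency model, the workload trace, the weights, tournament selection, elitism, the depth cap, and the patience-based convergence test.

```python
import math
import operator
import random

FUNCS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul,
         "div": lambda a, b: a / b if abs(b) > 1e-9 else 1.0}  # protected division
TERMINALS = ["load", "replicas"]
TRACE = [20, 40, 80, 160, 160, 80, 20, 20]  # requests/s, made-up workload
SLO_MS, MAX_REPLICAS, MAX_DEPTH = 200.0, 20, 5  # assumed limits
W_VIOL, W_LAT, W_COST = 10.0, 0.01, 1.0  # assumed fitness weights

def random_tree(rng, depth):
    """Grow a random expression tree as nested tuples (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS) if rng.random() < 0.6 else round(rng.uniform(0.0, 50.0), 2)
    return (rng.choice(list(FUNCS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, env):
    if isinstance(tree, tuple):
        return FUNCS[tree[0]](evaluate(tree[1], env), evaluate(tree[2], env))
    return env[tree] if isinstance(tree, str) else tree

def depth(tree):
    return 1 + max(depth(tree[1]), depth(tree[2])) if isinstance(tree, tuple) else 0

def paths(tree, prefix=()):
    """Yield every node position as a tuple of child indices."""
    yield prefix
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from paths(tree[i], prefix + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(node)

def crossover(rng, a, b):
    """Subtree crossover: graft a random subtree of b onto a random point of a."""
    return replace(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))

def mutate(rng, tree):
    """Subtree mutation: replace a random node with a fresh random subtree."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))

def decide(tree, load, replicas):
    raw = evaluate(tree, {"load": float(load), "replicas": float(replicas)})
    if not math.isfinite(raw):
        return 1
    return int(round(min(max(raw, 1.0), float(MAX_REPLICAS))))

def fitness(tree):
    """Weighted sum of violations, mean latency, mean replicas (lower is better)."""
    replicas, violations, latency_sum, replica_sum = 1, 0, 0.0, 0
    for load in TRACE:
        replicas = decide(tree, load, replicas)
        latency = 50.0 + 2.0 * load / replicas  # toy latency model
        violations += latency > SLO_MS
        latency_sum += latency
        replica_sum += replicas
    n = len(TRACE)
    return W_VIOL * violations + W_LAT * latency_sum / n + W_COST * replica_sum / n

def evolve(pop_size=50, generations=100, patience=15, seed=0):
    rng = random.Random(seed)
    pop = [random_tree(rng, 3) for _ in range(pop_size)]
    best, best_fit, stale = pop[0], float("inf"), 0
    for _ in range(generations):
        scored = sorted((fitness(t), i, t) for i, t in enumerate(pop))
        if scored[0][0] < best_fit:
            best_fit, best, stale = scored[0][0], scored[0][2], 0
        else:
            stale += 1
        if stale >= patience:  # crude convergence test
            break
        pop = [best]  # elitism: carry the best tree forward unchanged
        while len(pop) < pop_size:
            parent = min(rng.sample(scored, 3))[2]  # tournament selection
            child = parent
            if rng.random() < 0.9:
                child = crossover(rng, child, min(rng.sample(scored, 3))[2])
            if rng.random() < 0.2:
                child = mutate(rng, child)
            pop.append(child if depth(child) <= MAX_DEPTH else parent)
    return best, best_fit
```

Calling `evolve()` returns the best tree found and its fitness. Because this sketch scores every tree on a single fixed trace, it tends to produce exactly the workload-specific tuning the premise has to rule out; testing generalization would require scoring on traces held out from the search.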
What would settle it
Running the two case-study systems under AutoSLO and recording any of the following would count against the claim: sustained high resource consumption, frequent unresolved SLO violations, or repeated policy changes that destabilize the deployment.
Original abstract
Microservice architecture is widely adopted in modern systems, where auto-scaling is critical for satisfying service-level objectives (SLOs). However, determining optimal scaling for microservices is difficult, and reactive resource allocation often leads to costly over- or under-provisioning. We propose AutoSLO, a learning-based, self-adaptive scaling framework that dynamically adjusts microservice replicas to meet SLOs while minimizing resource usage. AutoSLO uses a continuous monitoring-adaptation feedback loop and leverages genetic programming to learn and evolve scaling logic, enabling the deployed microservice system to proactively prevent SLO violations rather than repeatedly searching for one-off scaling actions. We evaluate AutoSLO on two case-study systems -- an online shopping platform and a chatbot based on large language models -- and show that this framework substantially reduces resource usage while maintaining a low frequency of SLO violations, all of which are resolved within a short time window.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes AutoSLO, a self-adaptive auto-scaling framework for microservice systems. It employs a continuous monitoring-adaptation feedback loop combined with genetic programming to evolve scaling policies that proactively prevent SLO violations while minimizing resource consumption. The approach is evaluated on two case-study systems—an online shopping platform and an LLM-based chatbot—where it is claimed to achieve substantial resource savings alongside low frequencies of SLO violations that are resolved quickly.
Significance. If the central claims are supported by detailed, reproducible evidence, the work could contribute to self-adaptive systems research by showing how evolutionary computation can be integrated into live microservice scaling loops. This addresses a practical gap between reactive threshold-based scaling and more proactive, learned policies in cloud environments.
major comments (3)
- [Abstract and §3 (Approach)] The genetic programming component is described only at a high level, with no information on policy representation (e.g., tree or rule encoding), the fitness function (how SLO violation count, latency, and resource cost are combined), evolutionary operators, population size, or termination criteria. These details are load-bearing for the claim that GP evolves generalizable, proactive scaling logic rather than workload-specific tuning.
- [§5 (Evaluation)] The results on the two case studies are reported only qualitatively ('substantially reduces resource usage' and 'low frequency of SLO violations resolved within a short time window'). No quantitative metrics, baseline comparisons (e.g., Kubernetes HPA or other ML-based scalers), statistical significance tests, workload traces, or evaluation durations are supplied. This leaves the central empirical claim without sufficient evidential support.
- [§4 (Monitoring-Adaptation Loop)] The mechanism for safely inserting a newly evolved policy into a running cluster is not described. Without details on validation, rollback, or overhead measurement, it is unclear how the continuous loop avoids instability or excessive adaptation cost, which directly affects the practicality of the proactive-prevention claim.
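For context on the baseline named in the §5 comment, the sketch below implements the core replica rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas equal the ceiling of current replicas times the ratio of the observed metric to its target. It is a simplification. The real controller also applies a tolerance band, stabilization windows, and pod-readiness handling, none of which are modeled here, and the `min_replicas` and `max_replicas` defaults are hypothetical.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float,
                         min_replicas: int = 1,
                         max_replicas: int = 10) -> int:
    """Core HPA rule: ceil(current * observed / target), clamped to bounds."""
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))


# Observed metric at twice the target: the replica count doubles.
scaled_up = hpa_desired_replicas(2, 200.0, 100.0)
# Observed metric at half the target: the replica count halves.
scaled_down = hpa_desired_replicas(4, 50.0, 100.0)
```

The rule reacts only to the current metric ratio, which is the reactive behavior that a learned, proactive policy would need to outperform in a quantitative comparison.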
minor comments (2)
- [Abstract and §1] The abstract and introduction would benefit from explicit definitions of the SLOs used in each case study and the precise resource metrics being minimized.
- [§5] Figure captions and axis labels in the evaluation section should include units and exact numerical values rather than qualitative descriptions.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major point below and will revise the manuscript accordingly to provide the requested details and strengthen the empirical support.
- Referee: [Abstract and §3 (Approach)] The genetic programming component is described only at a high level, with no information on policy representation (e.g., tree or rule encoding), the fitness function (how SLO violation count, latency, and resource cost are combined), evolutionary operators, population size, or termination criteria. These details are load-bearing for the claim that GP evolves generalizable, proactive scaling logic rather than workload-specific tuning.
  Authors: We accept the referee's observation that §3 currently provides only a high-level description. In the revised manuscript we will expand this section with the missing technical details: scaling policies are encoded as expression trees, the fitness function is a weighted combination of SLO violation count, latency, and resource cost, standard GP operators (subtree crossover and mutation) are used, population size is 50, and evolution terminates after 100 generations or upon fitness convergence. These additions will clarify the mechanism for evolving proactive policies.
  Revision: yes
- Referee: [§5 (Evaluation)] The results on the two case studies are reported only qualitatively ('substantially reduces resource usage' and 'low frequency of SLO violations resolved within a short time window'). No quantitative metrics, baseline comparisons (e.g., Kubernetes HPA or other ML-based scalers), statistical significance tests, workload traces, or evaluation durations are supplied. This leaves the central empirical claim without sufficient evidential support.
  Authors: The referee correctly notes that the evaluation results are presented qualitatively. In the revised §5 we will add quantitative metrics (resource reduction percentages and violation frequencies), direct comparisons against Kubernetes HPA and other baselines, statistical significance tests, descriptions of the workload traces, and the duration of each evaluation run. This will provide the necessary evidential support for the central claims.
  Revision: yes
- Referee: [§4 (Monitoring-Adaptation Loop)] The mechanism for safely inserting a newly evolved policy into a running cluster is not described. Without details on validation, rollback, or overhead measurement, it is unclear how the continuous loop avoids instability or excessive adaptation cost, which directly affects the practicality of the proactive-prevention claim.
  Authors: We agree that the safety and overhead aspects of policy insertion require explicit description. In the revised §4 we will detail the validation step (shadow-mode testing on recent traces), the rollback trigger (SLO violation threshold), and measured adaptation overhead. These additions will demonstrate how the loop maintains stability and low cost.
  Revision: yes
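The validation and rollback steps named in the last response could take roughly the following shape. This is a sketch under stated assumptions, not the authors' mechanism: the `PolicyManager` class, the promotion rule (a candidate must do no worse than the active policy on a recent trace), the window size, the 0.4 rollback threshold, the toy latency model, and the three example policies are all hypothetical.

```python
import math
from collections import deque
from typing import Callable, List

Policy = Callable[[float, int], int]  # (load, current replicas) -> desired replicas
SLO_MS = 200.0  # assumed latency objective


def latency_ms(load: float, replicas: int) -> float:
    """Toy latency model (assumption): latency grows with per-replica load."""
    return 50.0 + 2.0 * load / max(1, replicas)


def shadow_violations(policy: Policy, trace: List[float]) -> int:
    """Replay recorded loads against a policy without touching the cluster."""
    replicas, violations = 1, 0
    for load in trace:
        replicas = max(1, policy(load, replicas))
        violations += latency_ms(load, replicas) > SLO_MS
    return violations


class PolicyManager:
    def __init__(self, incumbent: Policy, window: int = 5, rollback_threshold: float = 0.4):
        self.active = incumbent
        self.previous = incumbent
        self.recent = deque(maxlen=window)  # most recent live SLO checks
        self.rollback_threshold = rollback_threshold

    def propose(self, candidate: Policy, recent_trace: List[float]) -> bool:
        """Promote the candidate only if it does no worse in shadow mode."""
        if shadow_violations(candidate, recent_trace) <= shadow_violations(self.active, recent_trace):
            self.previous, self.active = self.active, candidate
            self.recent.clear()
            return True
        return False

    def observe(self, violated: bool) -> bool:
        """Record a live SLO check; roll back if the window's violation rate is too high."""
        self.recent.append(violated)
        full = len(self.recent) == self.recent.maxlen
        if full and sum(self.recent) / len(self.recent) > self.rollback_threshold:
            self.active = self.previous
            self.recent.clear()
            return True
        return False


# Hypothetical policies used only to exercise the manager.
def conservative(load: float, replicas: int) -> int:
    return 4


def reckless(load: float, replicas: int) -> int:
    return 1


def lean(load: float, replicas: int) -> int:
    return max(1, math.ceil(load / 60))
```

With this shape, a candidate that violates the SLO on replayed traffic never reaches the cluster, and a promoted policy that misbehaves on live traffic is replaced by its predecessor once the window fills. Measuring the time spent in `shadow_violations` would be one way to report the adaptation overhead the referee asks about.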
Circularity Check
No circularity: empirical framework without derivation chain
The paper presents AutoSLO as an empirical, learning-based framework that applies genetic programming within a monitoring-adaptation loop to evolve scaling policies for microservices. No mathematical derivation, equations, or first-principles results are described that could reduce to the inputs by construction. Claims rest on case-study evaluations showing reduced resource usage and resolved SLO violations, with no fitted parameters renamed as predictions, no load-bearing uniqueness theorems resting on self-citation, and no ansatz smuggled in via prior work. The approach is self-contained as a practical method whose validity is assessed externally via the reported experiments rather than by internal redefinition.