arxiv: 2605.01533 · v1 · submitted 2026-05-02 · 💻 cs.SE

Genetic Programming for Self-Adaptive Auto-Scaling of Microservices

Pith reviewed 2026-05-09 14:05 UTC · model grok-4.3

classification 💻 cs.SE
keywords auto-scaling · microservices · genetic programming · service-level objectives · self-adaptive systems · resource optimization · feedback loop

The pith

AutoSLO applies genetic programming inside a monitoring feedback loop to evolve scaling policies that proactively meet SLOs in microservices while lowering overall resource consumption.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces AutoSLO as a self-adaptive framework that continuously monitors microservice performance and uses genetic programming to learn and refine scaling decisions. This replaces reactive one-off adjustments with evolved policies that adjust replica counts ahead of potential violations. The approach targets the common problem of over- or under-provisioning that drives up costs in cloud deployments. Evaluation on an online shopping platform and an LLM-based chatbot shows lower resource use alongside few SLO violations that resolve quickly when they occur. The central premise is that the evolved logic can generalize enough to keep systems stable under live workloads.

Core claim

AutoSLO is a learning-based, self-adaptive scaling framework that dynamically adjusts microservice replicas to meet SLOs while minimizing resource usage. It uses a continuous monitoring-adaptation feedback loop and leverages genetic programming to learn and evolve scaling logic, enabling the deployed microservice system to proactively prevent SLO violations rather than repeatedly searching for one-off scaling actions.

What carries the argument

The continuous monitoring-adaptation feedback loop that supplies runtime data to genetic programming for evolving scaling policies.
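A minimal sketch of such a loop may make the division of labour clearer. It is not the authors' implementation: `run_loop`, `observe`, `evolve`, and `evolve_every` are hypothetical names, and the policy is simply assumed to be a function from observed metrics to a replica count.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]         # e.g. {"rps": 40.0, "latency_ms": 180.0}
Policy = Callable[[Metrics], int]  # maps observed metrics to a replica count


def run_loop(
    observe: Callable[[int], Metrics],
    evolve: Callable[[List[Metrics], Policy], Policy],
    policy: Policy,
    steps: int,
    evolve_every: int = 5,
) -> List[int]:
    """Monitor, apply the current policy, and periodically refine the policy.

    All names here are illustrative assumptions, not AutoSLO's API.
    """
    replicas = 1
    history: List[Metrics] = []
    decisions: List[int] = []
    for step in range(1, steps + 1):
        metrics = observe(replicas)         # monitoring
        history.append(metrics)
        replicas = max(1, policy(metrics))  # adaptation with the current policy
        decisions.append(replicas)
        if step % evolve_every == 0:        # learning: replace the policy itself
            policy = evolve(history, policy)
    return decisions
```

The point of the structure is that `evolve` returns a new policy rather than a single scaling action, which is the distinction the pith draws between evolved logic and one-off adjustments.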

If this is right

  • Resource usage drops substantially compared with standard reactive scaling.
  • SLO violations occur at low frequency and resolve inside short time windows.
  • Scaling decisions shift from reactive fixes to proactive policy evolution.
  • The same framework applies to both conventional web platforms and LLM-based chatbots.
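For contrast, the reactive baseline named in the figure captions, Kubernetes HPA, scales from a ratio of the current metric to its target. The sketch below follows the rule documented for HPA (ceil of current replicas times the ratio, with a default tolerance of 0.1 around a ratio of 1.0); the function name and the replica bounds are illustrative assumptions, and real HPA behaviour includes stabilization windows and other details omitted here.

```python
import math


def hpa_desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,    # illustrative bound
    max_replicas: int = 10,   # illustrative bound
    tolerance: float = 0.1,
) -> int:
    """Reactive scaling: react to the metric ratio observed right now."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no scaling action
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

A rule of this shape only responds after the metric has already moved away from its target, which is the behaviour the evolved policies are meant to improve on.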

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The evolved policies might transfer to other container orchestration settings such as serverless workloads.
  • Cloud operators could reduce manual threshold tuning if the genetic-programming approach proves stable over long periods.
  • Extending the loop to include cost metrics alongside SLOs could further optimize spending.

Load-bearing premise

Genetic programming can reliably evolve scaling policies that generalize across workloads and prevent SLO violations in live systems without introducing instability or excessive adaptation overhead.

What would settle it

Run the two case-study systems under AutoSLO. The claim fails if the runs show sustained high resource consumption, frequent unresolved SLO violations, or repeated policy changes that destabilize the deployment.
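One way to quantify such a comparison across repeated runs is the Vargha-Delaney effect size that appears in the reference list [37]. The sketch below is a plain implementation of that statistic; whether and how the paper applies it is not stated on this page, so treat the pairing with AutoSLO and HPA runs as an assumption.

```python
from typing import Sequence


def a12(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Vargha-Delaney A12: probability that a value drawn from xs exceeds
    one drawn from ys, counting ties as half."""
    greater = sum(1 for x in xs for y in ys if x > y)
    equal = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * equal) / (len(xs) * len(ys))
```

With per-run pod counts for two scalers as inputs, a value near 0.5 means no systematic difference, while values near 0 or 1 mean one scaler almost always uses fewer or more pods than the other.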

Figures

Figures reproduced from arXiv: 2605.01533 by Jia Li, Mehrdad Sabetzadeh, Shiva Nejati.

Figure 1
Figure 1: Self-adaptation control loop of AutoSLO. … strategies. To identify bottlenecks, we adopt Performance Bottleneck Analysis (PBA) [39], which detects performance degradation and resource wastage under varying workloads. Accordingly, AutoSLO adjusts the replica counts of the identified bottleneck microservices to address both under- and over-provisioning. Prerequisite 2: A Surrogate Model for SLO Prediction. Ev…
Figure 2
Figure 2: A GP individual with formula trees for two bottleneck microservices. To illustrate, consider a system with two bottleneck microservices, microservice1 and microservice2. A GP individual for this system is a two-element list, where each element is a formula – generated using the grammar described above – and represented as a parse tree
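The caption's description of a GP individual can be sketched as follows. The paper's grammar is not reproduced on this page, so the operator set, the metric names (`rps`, `cpu_util`), the protected division, and the clamping to `max_replicas` are all assumptions made for illustration.

```python
import math
import operator
from typing import Dict, List, Tuple, Union

# A formula tree is a constant, a metric name, or (operator, left, right).
Tree = Union[float, str, Tuple[str, "Tree", "Tree"]]

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    # Protected division, a common GP convention (assumed, not from the paper).
    "/": lambda a, b: a / b if b != 0 else 1.0,
}


def evaluate(tree: Tree, metrics: Dict[str, float]) -> float:
    """Recursively evaluate one formula tree against observed metrics."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, metrics), evaluate(right, metrics))
    if isinstance(tree, str):
        return metrics[tree]
    return float(tree)


def replica_counts(
    individual: List[Tree], metrics: Dict[str, float], max_replicas: int = 10
) -> List[int]:
    """One replica count per bottleneck microservice, clamped to [1, max]."""
    return [
        min(max_replicas, max(1, math.ceil(evaluate(tree, metrics))))
        for tree in individual
    ]


# A two-element individual: one hypothetical formula per bottleneck microservice.
individual: List[Tree] = [
    ("/", "rps", 50.0),                  # microservice1
    ("+", 1.0, ("*", "cpu_util", 4.0)),  # microservice2
]
```

Calling `replica_counts(individual, {"rps": 120.0, "cpu_util": 0.5})` evaluates each tree and rounds up, giving one replica count per bottleneck microservice.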
Figure 3
Figure 3: Experimental results for the Boutique Shop (CPU-intensive) system: (a) 90th-percentile request latency (SLO1) and (b) number of pods across ten repeated applications of a dynamic workload. The results compare AutoSLO, AutoSLO-Ran, and HPA, showing how each approach responds to workload changes and scales pods while impacting SLO compliance.
Figure 4
Figure 4: Experimental results for the Chatbot (GPU-intensive) system: (a) request success rate (SLO2) and (b) number of pods across ten repeated applications of a dynamic workload. The results compare AutoSLO, AutoSLO-Ran, and HPA, illustrating differences in SLO performance and scaling behaviour under repeated high-load conditions
Original abstract

Microservice architecture is widely adopted in modern systems, where auto-scaling is critical for satisfying service-level objectives (SLOs). However, determining optimal scaling for microservices is difficult, and reactive resource allocation often leads to costly over- or under-provisioning. We propose AutoSLO, a learning-based, self-adaptive scaling framework that dynamically adjusts microservice replicas to meet SLOs while minimizing resource usage. AutoSLO uses a continuous monitoring-adaptation feedback loop and leverages genetic programming to learn and evolve scaling logic, enabling the deployed microservice system to proactively prevent SLO violations rather than repeatedly searching for one-off scaling actions. We evaluate AutoSLO on two case-study systems -- an online shopping platform and a chatbot based on large language models -- and show that this framework substantially reduces resource usage while maintaining a low frequency of SLO violations, all of which are resolved within a short time window.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes AutoSLO, a self-adaptive auto-scaling framework for microservice systems. It employs a continuous monitoring-adaptation feedback loop combined with genetic programming to evolve scaling policies that proactively prevent SLO violations while minimizing resource consumption. The approach is evaluated on two case-study systems—an online shopping platform and an LLM-based chatbot—where it is claimed to achieve substantial resource savings alongside low frequencies of SLO violations that are resolved quickly.

Significance. If the central claims are supported by detailed, reproducible evidence, the work could contribute to self-adaptive systems research by showing how evolutionary computation can be integrated into live microservice scaling loops. This addresses a practical gap between reactive threshold-based scaling and more proactive, learned policies in cloud environments.

major comments (3)
  1. [Abstract and §3] Abstract and §3 (Approach): The genetic programming component is described at a high level but provides no information on policy representation (e.g., tree or rule encoding), the fitness function (how SLO violation count, latency, and resource cost are combined), evolutionary operators, population size, or termination criteria. These details are load-bearing for the claim that GP evolves generalizable, proactive scaling logic rather than workload-specific tuning.
  2. [§5] §5 (Evaluation): The results on the two case studies are reported only qualitatively ('substantially reduces resource usage' and 'low frequency of SLO violations resolved within a short time window'). No quantitative metrics, baseline comparisons (e.g., Kubernetes HPA or other ML-based scalers), statistical significance tests, workload traces, or evaluation durations are supplied. This leaves the central empirical claim without sufficient evidential support.
  3. [§4] §4 (Monitoring-Adaptation Loop): The mechanism for safely inserting a newly evolved policy into a running cluster is not described. Without details on validation, rollback, or overhead measurement, it is unclear how the continuous loop avoids instability or excessive adaptation cost, which directly affects the practicality of the proactive-prevention claim.
minor comments (2)
  1. [Abstract and §1] The abstract and introduction would benefit from explicit definitions of the SLOs used in each case study and the precise resource metrics being minimized.
  2. [§5] Figure captions and axis labels in the evaluation section should include units and exact numerical values rather than qualitative descriptions.
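The first major comment asks how SLO violation count, latency, and resource cost are combined and when evolution stops. A weighted sum with a generation cap and a convergence check is one common answer. The sketch below is only that: the `Outcome` fields, the weights, and the `patience` and `eps` defaults are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Outcome:
    """Measured result of running one candidate policy (hypothetical fields)."""
    slo_violations: int
    mean_latency_ms: float
    replica_seconds: float


def fitness(
    o: Outcome,
    w_violation: float = 100.0,  # illustrative weights only
    w_latency: float = 0.01,
    w_cost: float = 0.001,
) -> float:
    """Weighted sum to be minimized: lower means a better scaling policy."""
    return (
        w_violation * o.slo_violations
        + w_latency * o.mean_latency_ms
        + w_cost * o.replica_seconds
    )


def should_stop(
    best_per_generation: Sequence[float],
    max_generations: int = 100,
    patience: int = 10,
    eps: float = 1e-6,
) -> bool:
    """Stop at the generation cap, or when the best fitness has improved by
    less than eps over the last `patience` generations."""
    if len(best_per_generation) >= max_generations:
        return True
    if len(best_per_generation) <= patience:
        return False
    return best_per_generation[-patience - 1] - best_per_generation[-1] < eps
```

The relative size of the weights decides whether the search prefers spare replicas or occasional violations, which is why the referee treats these details as load-bearing.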

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and will revise the manuscript accordingly to provide the requested details and strengthen the empirical support.

Point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (Approach): The genetic programming component is described at a high level but provides no information on policy representation (e.g., tree or rule encoding), the fitness function (how SLO violation count, latency, and resource cost are combined), evolutionary operators, population size, or termination criteria. These details are load-bearing for the claim that GP evolves generalizable, proactive scaling logic rather than workload-specific tuning.

    Authors: We accept the referee's observation that §3 currently provides only a high-level description. In the revised manuscript we will expand this section with the missing technical details: scaling policies are encoded as expression trees, the fitness function is a weighted combination of SLO violation count, latency, and resource cost, standard GP operators (subtree crossover and mutation) are used, population size is 50, and evolution terminates after 100 generations or upon fitness convergence. These additions will clarify the mechanism for evolving proactive policies. revision: yes

  2. Referee: [§5] §5 (Evaluation): The results on the two case studies are reported only qualitatively ('substantially reduces resource usage' and 'low frequency of SLO violations resolved within a short time window'). No quantitative metrics, baseline comparisons (e.g., Kubernetes HPA or other ML-based scalers), statistical significance tests, workload traces, or evaluation durations are supplied. This leaves the central empirical claim without sufficient evidential support.

    Authors: The referee correctly notes that the evaluation results are presented qualitatively. In the revised §5 we will add quantitative metrics (resource reduction percentages and violation frequencies), direct comparisons against Kubernetes HPA and other baselines, statistical significance tests, descriptions of the workload traces, and the duration of each evaluation run. This will provide the necessary evidential support for the central claims. revision: yes

  3. Referee: [§4] §4 (Monitoring-Adaptation Loop): The mechanism for safely inserting a newly evolved policy into a running cluster is not described. Without details on validation, rollback, or overhead measurement, it is unclear how the continuous loop avoids instability or excessive adaptation cost, which directly affects the practicality of the proactive-prevention claim.

    Authors: We agree that the safety and overhead aspects of policy insertion require explicit description. In the revised §4 we will detail the validation step (shadow-mode testing on recent traces), the rollback trigger (SLO violation threshold), and measured adaptation overhead. These additions will demonstrate how the loop maintains stability and low cost. revision: yes
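The validation and rollback steps named in the third response could take roughly the following shape. This is a sketch under stated assumptions: the `violates` predicate stands in for an SLO predictor such as a surrogate model, and the acceptance rule and the 0.2 rollback threshold are hypothetical choices, not values from the paper.

```python
from typing import Callable, Dict, Sequence

Metrics = Dict[str, float]
Policy = Callable[[Metrics], int]
Violates = Callable[[Metrics, int], bool]  # assumed SLO predictor


def predicted_violations(
    policy: Policy, traces: Sequence[Metrics], violates: Violates
) -> int:
    """Replay recent traces and count predicted SLO violations for a policy."""
    return sum(1 for m in traces if violates(m, max(1, policy(m))))


def accept_candidate(
    candidate: Policy, current: Policy,
    traces: Sequence[Metrics], violates: Violates,
) -> bool:
    """Shadow-mode check: deploy only if the candidate is predicted to be no
    worse than the policy currently in use."""
    return predicted_violations(candidate, traces, violates) <= predicted_violations(
        current, traces, violates
    )


def should_roll_back(
    recent_violation_flags: Sequence[bool], threshold: float = 0.2
) -> bool:
    """Rollback trigger: violation rate in the recent window exceeds threshold."""
    if not recent_violation_flags:
        return False
    return sum(recent_violation_flags) / len(recent_violation_flags) > threshold
```

Because the candidate is only scored against replayed traces, the check costs no live traffic, but it is only as trustworthy as the predictor behind `violates`.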

Circularity Check

0 steps flagged

No circularity: empirical framework without derivation chain

Full rationale

The paper presents AutoSLO as an empirical, learning-based framework that applies genetic programming within a monitoring-adaptation loop to evolve scaling policies for microservices. No mathematical derivation, equations, or first-principles results are described that could reduce to the inputs by construction. Claims rest on case-study evaluations showing reduced resource usage and resolved SLO violations, with no fitted parameters renamed as predictions, no self-citation load-bearing uniqueness theorems, and no ansatz smuggled via prior work. The approach is self-contained as a practical method whose validity is assessed externally via the reported experiments rather than by internal redefinition.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract describes an empirical engineering contribution with no mathematical derivations, fitted constants, or new postulated entities.

pith-pipeline@v0.9.0 · 5452 in / 1024 out tokens · 32226 ms · 2026-05-09T14:05:17.773674+00:00 · methodology

Reference graph

Works this paper leans on

40 extracted references · 3 canonical work pages

  1. [1]

    https://anonymous.4open.science/r/AutoSLO/README.md (2025)

    AutoSLO. https://anonymous.4open.science/r/AutoSLO/README.md (2025)

  2. [2]

    Aksakalli, I.K., Çelik, T., Can, A.B., Tekinerdogan, B.: Deployment and communication patterns in microservice architectures: A systematic literature review. J. Syst. Softw. 180, 111014 (2021)

  3. [3]

    https://istio.io (2024)

    Authors, I.: Istio service mesh. https://istio.io (2024)

  4. [4]

    Authors, P.: https://prometheus.io (2025)

  5. [5]

    Balla, D., Simon, C., Maliosz, M.: Adaptive scaling of kubernetes pods. NOMS pp. 1–5 (2020)

  6. [6]

    In: ICDCS 2019

    Bauer, A., Lesch, V., Versluis, L., Ilyushkin, A., Herbst, N., Kounev, S.: Chamulteon: Coordinated auto-scaling of micro-services. In: ICDCS 2019. pp. 2015–2025

  7. [7]

    Machine Learning 45(1), 5–32 (2001)

    Breiman, L.: Random forests. Machine Learning 45(1), 5–32 (2001)

  8. [8]

    In: ICSME

    Chaudhary, D., Vadlamani, S.L., Thomas, D., Nejati, S., Sabetzadeh, M.: Developing a llama-based chatbot for CI/CD question answering: A case study at ericsson. In: ICSME

  9. [9]

    pp. 707–718. https://doi.org/10.1109/ICSME58944.2024.00075

  10. [10]

    Proceedings of the IEEE 112(1), 12–46 (2024)

    Deng, S., Zhao, H., Huang, B., Zhang, C., Chen, F., Deng, Y., Yin, J., Dustdar, S., Zomaya, A.Y.: Cloud-native computing: A survey from the perspective of services. Proceedings of the IEEE 112(1), 12–46 (2024)

  11. [11]

    In: ICAC 2019

    Ding, J., Cao, R., Saravanan, I., Morris, N., Stewart, C.: Characterizing service level objectives for cloud services: Realities and myths. In: ICAC 2019. pp. 200–206

  12. [12]

    Journal of Machine Learning Research 13, 2171–2175 (jul 2012)

    Fortin, F.A., De Rainville, F.M., Gardner, M.A., Parizeau, M., Gagné, C.: DEAP: Evolutionary algorithms made easy. Journal of Machine Learning Research 13, 2171–2175 (jul 2012)

  13. [13]

    https://github.com/GoogleCloudPlatform/microservices-demo (2024)

    Google: Online boutique. https://github.com/GoogleCloudPlatform/microservices-demo (2024)

  14. [14]

    In: HPDC 2022

    Hossen, M.R., Islam, M.A., Ahmed, K.: Practical efficient microservice autoscaling with QoS assurance. In: HPDC 2022. pp. 240–252

  15. [15]

    Kratzke, N., Quint, P.: Understanding cloud-native applications after 10 years of cloud computing - A systematic mapping study. J. Syst. Softw. 126, 1–16 (2017). https://doi.org/10.1016/J.JSS.2017.01.001

  16. [16]

    Kubernetes: https://kubernetes.io (2023)

  17. [17]

    IEEE Trans

    Li, J., Moeini, B., Nejati, S., Sabetzadeh, M., McCallen, M.: A lean simulation framework for stress testing IoT cloud systems. IEEE Trans. Software Eng. 50(7), 1827–1851 (2024)

  18. [18]

    ACM Trans

    Li, J., Nejati, S., Sabetzadeh, M.: Using genetic programming to build self-adaptivity into software-defined networks. ACM Trans. Auton. Adapt. Syst. 19(1), 2:1–2:35 (2024)

  19. [19]

    Liu, B., Nejati, S., Lucia, Briand, L.: Effective fault localization of automotive simulink models: achieving the trade-off between test oracle effort and fault localization accuracy. Empir. Softw. Eng. 24(1), 444–490 (2019)

  20. [20]

    IEEE Trans

    Liu, J., Zhang, S., Wang, Q., Wei, J.: Coordinating fast concurrency adapting with autoscaling for SLO-oriented web applications. IEEE Trans. Parallel Distributed Syst. 33(12), 3349–3362 (2022)

  21. [21]

    Lulu, second edn

    Luke, S.: Essentials of Metaheuristics. Lulu, second edn. (2013)

  22. [22]

    Luke, S., Panait, L.: A comparison of bloat control methods for genetic programming. Evol. Comput. 14(3), 309–344 (2006)

  23. [23]

    In: GLOBECOM 2020 (2020)

    Marie-Magdelaine, N., Ahmed, T.: Proactive autoscaling for cloud-native applications using machine learning. In: GLOBECOM 2020 (2020)

  24. [24]

    https://ai.meta.com/llama/ (2024)

    Meta AI: The llama 3 herd of models. https://ai.meta.com/llama/ (2024)

  25. [25]

    Electronics 12(1) (2023)

    Mo, H., Zhu, L., Shi, L., Tan, S., Wang, S.: Hetsev: Exploiting heterogeneity-aware autoscaling and resource-efficient scheduling for cost-effective machine-learning model serving. Electronics 12(1) (2023)

  26. [26]

    O’Reilly Media, Inc

    Nadareishvili, I., Mitra, R., McLarty, M., Amundsen, M.: Microservice architecture: aligning principles, practices, and culture. O’Reilly Media, Inc. (2016)

  27. [27]

    IST Journal 163, 107286 (2023)

    Nejati, S., Sorokin, L., Safin, D., Formica, F., Mahboob, M., Menghi, C.: Reflections on surrogate-assisted search-based testing: A taxonomy and two replication studies based on industrial ADAS and simulink models. IST Journal 163, 107286 (2023)

  28. [28]

    Sensors 20(16) (2020)

    Nguyen, T.T., Yeom, Y.J., Kim, T., Park, D.H., Kim, S.: Horizontal pod autoscaling in Kubernetes for elastic container orchestration. Sensors 20(16) (2020)

  29. [29]

    In: SEAMS 2024

    Nunes, J.P.K.S., Nejati, S., Sabetzadeh, M., Nakagawa, E.Y.: Self-adaptive, requirements-driven autoscaling of microservices. In: SEAMS 2024. pp. 168–174

  30. [30]

    International Journal of Science and Research Archive (2024)

    Oyeniran, O.C., Modupe, O.T., Otitoola, A.A., Abiona, O.O., Adewusi, A.O., Oladapo, O.J.: A comprehensive review of leveraging cloud-native technologies for scalability and resilience in software development. International Journal of Science and Research Archive (2024)

  31. [31]

    Poli, R., Langdon, W.B., McPhee, N.F., Koza, J.R.: A field guide to genetic programming. Lulu. com (2008)

  32. [32]

    In: ICIST 2018

    Pozdniakova, O., Mazeika, D., Cholomskis, A.: Adaptive resource provisioning and auto-scaling for cloud native software. In: ICIST 2018. vol. 920, pp. 113–129

  33. [33]

    In: ICAICTA (2022)

    Pramesti, A.A., Kistijantoro, A.I.: Autoscaling based on response time prediction for microservice application in Kubernetes. In: ICAICTA (2022)

  34. [34]

    In: OSDI 2020

    Qiu, H., Banerjee, S.S., Jha, S., Kalbarczyk, Z.T., Iyer, R.K.: FIRM: an intelligent fine-grained resource management framework for SLO-oriented microservices. In: OSDI 2020. pp. 805–825

  35. [35]

    In: SRDS 2023

    Schmidt, H., Rejiba, Z., Eidenbenz, R., Förster, K.: Transparent fault tolerance for stateful applications in kubernetes with checkpoint/restore. In: SRDS 2023. pp. 129–139

  36. [36]

    Scalable Comput

    Singh, P., Gupta, P., Jyoti, K., Nayyar, A.: Research on auto-scaling of web applications in cloud: Survey, trends and future directions. Scalable Comput. Pract. Exp. 20, 399–432 (2019)

  37. [37]

    Journal of Educational and Behavioral Statistics 25, 101–132 (2000)

    Vargha, A., Delaney, H.D.: A critique and improvement of the CL common language effect size statistics of McGraw and Wong. Journal of Educational and Behavioral Statistics 25, 101–132 (2000). https://doi.org/10.3102/10769986025002101

  38. [38]

    In: EPIA 2013

    Veenhuis, C.: Structure-based constants in genetic programming. In: EPIA 2013. Lecture Notes in Computer Science, vol. 8154, pp. 126–137

  39. [39]

    In: Breakthroughs in statistics: Methodology and distribution, pp

    Wilcoxon, F.: Individual comparisons by ranking methods. In: Breakthroughs in statistics: Methodology and distribution, pp. 196–202 (1992)

  40. [40]

    IEEE Trans

    Xie, S., Wang, J., Li, B., Zhang, Z., Li, D., Hung, P.C.K.: PBScaler: A bottleneck-aware autoscaling framework for microservice-based applications. IEEE Trans. Serv. Comput. 17(2), 604–616 (2024)