A.S. Pazinin
ORCID: https://orcid.org/0009-0002-9506-953
Èlektron. model. 2026, 48(2):69-86
ABSTRACT
To optimize the use of computing resources, two approaches to service autoscaling in Kubernetes are compared: the standard reactive Horizontal Pod Autoscaler (HPA) and a proactive autoscaler based on an LSTM machine-learning model. A controller is proposed that collects CPU metrics from Prometheus, trains and updates the model, predicts short-term load dynamics, and adjusts the number of replicas via the Kubernetes API. Prediction and decision metrics are sent to Pushgateway and visualized in Grafana. Experiments in an Azure Kubernetes Service cluster under controlled container load showed a 30 % reduction in total vCPU usage compared to HPA at the same service level, lower scaling latency (scale-up in 30-60 s versus 75-90 s; scale-down in 60-90 s versus 90-150 s), and elimination of replica-count “jitter.” The results confirm the effectiveness of proactive, machine-learning-based autoscaling for Kubernetes services with stable or seasonal traffic patterns.
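The control loop the abstract describes (query Prometheus → forecast short-term CPU demand → set the replica count via the Kubernetes API) can be sketched in minimal form. Everything below is an illustrative reconstruction, not the author's controller: the function names, the 0.5-core-per-pod target, and the moving-average stand-in for the article's LSTM forecaster are all assumptions made to keep the sketch self-contained and runnable.

```python
import math


def forecast_next(cpu_history, window=5):
    """Stand-in short-term forecast: mean of the last `window` samples.
    The article's controller uses an LSTM trained on Prometheus CPU
    metrics here; a moving average keeps this sketch dependency-free."""
    recent = cpu_history[-window:]
    return sum(recent) / len(recent)


def desired_replicas(predicted_cpu_cores, target_cpu_per_pod,
                     min_replicas, max_replicas):
    """Map forecast total CPU demand (cores) to a replica count using
    HPA-style ceiling division, clamped to the configured bounds."""
    raw = math.ceil(predicted_cpu_cores / target_cpu_per_pod)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Synthetic CPU-usage series (cores); in the real controller this
    # would come from a Prometheus range query, and the resulting count
    # would be applied by patching the Deployment's scale subresource.
    history = [0.8, 0.9, 1.1, 1.4, 1.6, 1.9]
    prediction = forecast_next(history)
    replicas = desired_replicas(prediction, target_cpu_per_pod=0.5,
                                min_replicas=1, max_replicas=10)
    print(f"predicted {prediction:.2f} cores -> {replicas} replicas")
```

Because the replica count is derived from a forecast rather than from current utilization, the controller can add capacity before the load arrives, which is where the reported scale-up latency gain over reactive HPA comes from.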
KEYWORDS
Kubernetes, autoscaling, HPA, LSTM, Prometheus, Pushgateway, Grafana.
REFERENCES
- Horizontal Pod autoscaling. (n.d.). Kubernetes. https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
- Lorido-Botran, T., Miguel-Alonso, J., & Lozano, J.A. (2014). A review of auto-scaling techniques for elastic applications in cloud environments. Journal of Grid Computing, 12(4), 559- https://doi.org/10.1007/s10723-014-9314-7
- Tesauro, G., Das, R., Chan, H., Kephart, J., Levine, D., Rawson, F., & Lefurgy, C. (2007). Managing power consumption and performance of computing systems using reinforcement learning. Advances in Neural Information Processing Systems, 1497- https://papers.nips.cc/paper/3251-managing-power-consumption-and-performance-of-computing-systems-using-reinforcement-learning
- Horizontal Pod Autoscaler walkthrough. (n.d.). Kubernetes. https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
- Greff, K., Srivastava, R., Koutník, J., Steunebrink, B., & Schmidhuber, J. (2017). LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, 28(10), 2222- https://doi.org/10.1109/TNNLS.2016.2582924
- Dang-Quang, N.-M., & Yoo, M. (2021). Deep learning-based autoscaling using bidirectional long short-term memory for Kubernetes. Applied Sciences, 11(9), Article 3835. https://www.mdpi.com/2076-3417/11/9/3835
- Imdoukh, M., Ahmad, I., & Alfailakawi, M. (2020). Machine learning-based auto-scaling for containerized applications. Neural Computing and Applications, 32, 9745- https://link.springer.com/article/10.1007/s00521-019-04507-z
- Rolik, O., & Volkov, V. (2024). Method of horizontal pod scaling in Kubernetes to omit overregulation. Information, Computing and Intelligent Systems, (5), 55- https://doi.org/10.20535/2786-8729.5.2024.315877
- Boyarchuk, S., & Tyshchenko, I. (2025). ARIMA and LSTM time series forecasting models in economics and finance. Computer Design Systems. Theory and Practice, 7(1), 172-180. https://doi.org/10.23939/cds2025.01.172
- What is Azure Kubernetes Service (AKS)? (n.d.). Microsoft Learn. https://learn.microsoft.com/azure/aks/what-is-aks
- Data model. (n.d.). Prometheus. https://prometheus.io/docs/concepts/data_model/
- Metrics for Kubernetes object states. (n.d.). Kubernetes. https://kubernetes.io/docs/concepts/cluster-administration/kube-state-metrics/
- Pushing metrics. (n.d.). Prometheus. https://prometheus.io/docs/instrumenting/pushing/
- Time series. (n.d.). Grafana Labs. https://grafana.com/docs/grafana/latest/panels-visualizations/visualizations/time-series/
- Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735- https://doi.org/10.1162/neco.1997.9.8.1735
- Fedoryshyn, B., & Krasko, O. (2024). Migration of services in a Kubernetes cluster based on load forecasting. Information and Communication Technologies and Electronic Engineering, 4(2), 82-92. https://doi.org/10.23939/ictee2024.02.082
- Majevsky, Ya., & Pravorska, N. (2022). Increasing the efficiency of microservices scaling automation in the Kubernetes containerized application management system. Bulletin of the Khmelnytskyi National University. Series: Technical Sciences, 313(5), 260-264. https://doi.org/10.31891/2307-5732-2022-313-5-260-264
- Islam, S., Keung, J., Lee, K., & Liu, A. (2012). Empirical prediction models for adaptive resource provisioning in the cloud. Future Generation Computer Systems, (1), 155- https://doi.org/10.1016/j.future.2011.05.027
- Resource metrics pipeline. (n.d.). Kubernetes. https://kubernetes.io/docs/tasks/debug/debug-cluster/resource-metrics-pipeline/
- Gutman, D., & Sirota, O. (2023). Proactive automatic scaling up for Kubernetes. Adaptive Automatic Control Systems, 1(42), 32-38. https://doi.org/10.20535/1560-8956.42.2023.278925