With Kubernetes Horizontal Pod Autoscaling (HPA), there is an inevitable delay before the desired number of Pods is running. During that delay, service quality may degrade; for example, response latency can grow sharply if traffic increases suddenly. This article assumes that you can roughly estimate request volume trends, e.g., from historical data, and introduces a way to mitigate the degradation by using a Kubernetes CronJob to adjust the HPA parameters ahead of the expected traffic increase. We will write a CronJob template and a Python script that generates CronJob manifests from the template using values from a CSV file.
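As a rough illustration of the template-plus-CSV idea, the sketch below renders one CronJob manifest per CSV row using only the Python standard library. It is a minimal sketch under several assumptions: the CSV column names, the `hpa-patcher` service account, the `bitnami/kubectl` image, and the choice of `minReplicas` as the HPA parameter to adjust are all hypothetical and not taken from the article.

```python
import csv
import io
from string import Template

# Hypothetical template: the service account, image, and the decision to
# patch minReplicas are illustrative assumptions, not the article's manifest.
CRONJOB_TEMPLATE = Template("""\
apiVersion: batch/v1
kind: CronJob
metadata:
  name: ${name}
spec:
  schedule: "${schedule}"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: hpa-patcher
          restartPolicy: OnFailure
          containers:
          - name: kubectl
            image: bitnami/kubectl
            command:
            - kubectl
            - patch
            - hpa
            - ${hpa}
            - --patch
            - '{"spec":{"minReplicas":${min_replicas}}}'
""")


def render_manifests(csv_text: str) -> list[str]:
    """Return one rendered CronJob manifest (as YAML text) per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    manifests = []
    for row in reader:
        manifests.append(
            CRONJOB_TEMPLATE.substitute(
                name=row["name"],
                schedule=row["schedule"],
                hpa=row["hpa"],
                # int() rejects non-numeric replica counts early.
                min_replicas=int(row["min_replicas"]),
            )
        )
    return manifests


# Hypothetical schedule: raise the floor before the morning peak,
# lower it again at night.
SAMPLE_CSV = """name,schedule,hpa,min_replicas
scale-up-morning,0 8 * * *,my-app,10
scale-down-night,0 22 * * *,my-app,2
"""

if __name__ == "__main__":
    # Join the documents with the YAML document separator.
    print("---\n".join(render_manifests(SAMPLE_CSV)))
```

The output can be piped to `kubectl apply -f -`. The service account used by the Job would need RBAC permission to patch HorizontalPodAutoscaler objects in the target namespace.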
In a previous article, I demonstrated the Kubernetes Horizontal Pod Autoscaler (HPA). There I set up autoscaling on CPU utilization; this time I will set the HPA target to the request processing time of the Pod in question. Since managing metric values on the application side is time-consuming, I will obtain the request processing time metrics from the Istio service mesh.
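To make the effect of a latency target concrete, the sketch below applies the scaling rule given in the Kubernetes HPA documentation, `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, to a request processing time metric. The function name and the millisecond values are hypothetical, and the sketch ignores stabilization windows, readiness handling, and min/max replica bounds.

```python
import math


def desired_replicas(current_replicas: int,
                     current_value: float,
                     target_value: float,
                     tolerance: float = 0.1) -> int:
    """Simplified version of the documented HPA scaling rule.

    current_value and target_value are in the same unit, for example the
    average request processing time in milliseconds (an assumption here).
    """
    ratio = current_value / target_value
    # The HPA skips scaling when the ratio is close enough to 1.0;
    # 0.1 is the Kubernetes default tolerance.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Hypothetical numbers: 4 Pods, 200 ms target, 300 ms observed.
    print(desired_replicas(4, 300.0, 200.0))
```

With a 200 ms target, an observed average of 300 ms on 4 Pods gives a ratio of 1.5, so the rule asks for 6 Pods; an observed 210 ms stays within the tolerance and leaves the replica count unchanged.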