
Reducing Performance Degradation due to HPA by using CronJob

· 8 min read

When using Kubernetes's Horizontal Pod Autoscaling (HPA), there is an inevitable delay until the desired number of Pods is deployed. Until then, service quality may degrade; for example, the response latency can grow very large if there is a sudden increase in the traffic volume. This article assumes that you can roughly estimate the request volume trends, e.g., from historical data, and introduces a way to mitigate the service degradation by using Kubernetes's CronJob to adjust the HPA parameters before the expected traffic increase. We will write a CronJob template and a Python script that generates CronJob manifests from the template using values from a CSV file.

Background

Most web applications have daily traffic trends, such as high requests per second at lunch time, moderate during office hours, and low at night. Kubernetes's Horizontal Pod Autoscaling (HPA) feature dynamically changes the number of Pods based on the metric targets you configure. However, because of how HPA works, there is an inevitable lag until the number of Pods becomes optimal, and in the meantime the request latency, for example, can become very high if the traffic increases suddenly. In a real scenario, though, you can often roughly estimate the request volume in advance from historical metrics. This article introduces a way to mitigate the service degradation by using Kubernetes's CronJob to adjust the HPA parameters before the expected traffic increase.
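
For reference, an HPA like the ones used in this article can be created imperatively with kubectl autoscale (a sketch; the Deployment name api and the thresholds simply mirror the demo app below):

# Keep average CPU utilization around 50%, scaling between 2 and 10 Pods
kubectl autoscale deployment api --cpu-percent=50 --min=2 --max=10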

Steps

1. Deploying our demo app

Let's use a simple ToDo app for demonstration. The app consists of a React frontend served by NGINX, a backend API server written in Node.js, and MongoDB as the database.

Let's clone the repo:

# Clone the repo
git clone https://github.com/ryojp/todo.git

# Move into the repository
cd todo

Then, deploy the app by following the README:

  1. Create MongoDB-related secret:

    cp .env.dev .env
    vim .env # set secure passwords
    kubectl create ns todo
    kubectl -n todo create secret generic todosecret --from-env-file .env
  2. Install NGINX Ingress Controller

    helm upgrade --install ingress-nginx ingress-nginx --repo https://kubernetes.github.io/ingress-nginx --namespace ingress-nginx --create-namespace
  3. Apply Kubernetes configuration files:

    kubectl apply -f k8s/
  4. Visit https://localhost and you will be redirected to the login page on success.

In the steps above, we deployed the following into the todo namespace:

  • ClusterIP Services, Deployments, and HPAs for frontend and api
  • A ClusterIP Service and a StatefulSet for mongo
  • An Ingress that (reverse-)proxies requests for /api to api and for / to frontend

Most importantly, we have two HPAs:

$ kubectl -n todo get hpa
NAME       REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
frontend   Deployment/frontend   0%/50%    1         10        1          2m13s
api        Deployment/api        0%/50%    2         10        2          2m13s
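
To see how an HPA evaluates its target, kubectl describe shows the current metrics, conditions, and recent scaling events (here for api):

# Show current metrics, conditions, and scaling events for the api HPA
kubectl -n todo describe hpa api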

2. Creating a ServiceAccount with appropriate Roles

We will run a kubectl patch command against the HPAs from a CronJob. So, let's create a ServiceAccount with the roles needed to access the Kubernetes API server:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: hpa-scheduler
  namespace: todo
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: hpa-scheduler
  namespace: todo
rules:
  - apiGroups:
      - autoscaling
    resources:
      - horizontalpodautoscalers
    verbs:
      - get
      - list
      - patch
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: hpa-scheduler
  namespace: todo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: hpa-scheduler
subjects:
  - apiGroup: ""
    kind: ServiceAccount
    name: hpa-scheduler
    namespace: todo

I believe the manifest above is self-explanatory.

One tip: you can find the apiGroups and resources values in the output of the kubectl api-resources command, like this:

$ kubectl api-resources | grep hpa
horizontalpodautoscalers   hpa   autoscaling/v2   true   HorizontalPodAutoscaler

Now, let's apply the manifest above:

$ kubectl apply -f k8s/hpa-scheduler/sa.yml
serviceaccount/hpa-scheduler created
role.rbac.authorization.k8s.io/hpa-scheduler created
rolebinding.rbac.authorization.k8s.io/hpa-scheduler created
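
Before wiring this ServiceAccount into a CronJob, we can verify the granted permissions with kubectl auth can-i (the expected answers are shown as comments):

# Should print "yes": the Role allows patching HPAs in the todo namespace
kubectl -n todo auth can-i patch horizontalpodautoscalers \
  --as=system:serviceaccount:todo:hpa-scheduler

# Should print "no": the Role grants nothing beyond HPA access
kubectl -n todo auth can-i delete deployments \
  --as=system:serviceaccount:todo:hpa-scheduler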

3. Defining a daily schedule in CSV

Let's assume we expect the request volume to increase suddenly at lunch time (12:00-13:00). To keep the response latency low without delay, we raise the minReplicas field of the HPA five minutes before 12:00. Below is our daily schedule for the CronJobs.

NAMESPACE,HPA,HOUR,MINUTE,MIN_REPLICAS,MAX_REPLICAS
todo,frontend,11,55,3,10
todo,frontend,13,10,1,10
todo,api,05,55,3,10
todo,api,11,55,6,10
todo,api,13,10,3,10
todo,api,22,10,1,10
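
Each row will become one CronJob. If you want to preview the cron expression a row produces before generating any manifests, a small shell loop that mirrors the conversion in our upcoming Python script works (a sketch, assuming a UTC+0900 local time zone):

# Print each row's UTC cron expression (assumes local time is UTC+0900)
tail -n +2 k8s/hpa-scheduler/cronjob.csv |
while IFS=, read -r ns hpa hour minute min_r max_r; do
  utc_hour=$(( (10#$hour - 9 + 24) % 24 ))
  printf 'cj-%s-%s%s: "%s %02d * * *" (minReplicas=%s, maxReplicas=%s)\n' \
    "$hpa" "$hour" "$minute" "$minute" "$utc_hour" "$min_r" "$max_r"
done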

4. Creating a CronJob template

We use the values in the CSV to generate CronJob manifests. Below is the template for that.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: cj-__HPA__-__HOURMIN__
  namespace: __NAMESPACE__
  labels:
    target: __HPA__
    app.kubernetes.io/name: cj-__HPA__-__HOURMIN__
spec:
  schedule: __CRON__ # Timezone UTC(+0000)
  startingDeadlineSeconds: 120
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            sidecar.istio.io/inject: "false"
        spec:
          serviceAccountName: hpa-scheduler
          containers:
            - name: hpa-scheduler
              image: bitnami/kubectl:1.26
              command:
                - /bin/sh
                - -c
                - |
                  kubectl -n __NAMESPACE__ patch hpa/__HPA__ -p '{"spec":{"minReplicas":__MIN__, "maxReplicas":__MAX__}}'
          restartPolicy: OnFailure

In this template, variables surrounded by double underscores, such as __HPA__, will be replaced with the values in the CSV file, using a Python script we will write in the next step.

Since Kubernetes v1.27, you can specify the time zone in .spec.timeZone (doc). This template does not use it because v1.27 was not yet GA on cloud providers such as Azure at the time of this writing. Instead, we'll convert the time zone in the Python script.
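
Before scheduling anything, you can also run the same patch once by hand to confirm it does what you expect (the values here are just examples):

# The same patch the CronJob will run, executed once manually (example values)
kubectl -n todo patch hpa/api -p '{"spec":{"minReplicas":3, "maxReplicas":10}}'

# Revert minReplicas to its original value
kubectl -n todo patch hpa/api -p '{"spec":{"minReplicas":2}}'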

5. Creating a Python script to generate CronJob manifests from CSV

Here is our script for reading CSV and generating CronJob manifests:

#!/usr/bin/env python3

from collections import namedtuple
import csv
import os


BASE_DIR = "k8s/hpa-scheduler/"
OUTPUT_DIR = os.path.join(BASE_DIR, "generated/")
CRONJOB_TEMPLATE_FILE = os.path.join(BASE_DIR, "cronjob-template.yml")
CRONJOB_CSV_FILE = os.path.join(BASE_DIR, "cronjob.csv")
TIMEZONE_DIFF = int(os.environ.get("TIMEZONE_DIFF", +9))  # UTC+0900

Config = namedtuple("Config", "NAMESPACE,HPA,HOUR,MINUTE,MIN_REPLICAS,MAX_REPLICAS")


def substitute_template_yml(
    config: Config, template: str = CRONJOB_TEMPLATE_FILE
) -> str:
    # read the template file content
    with open(template, "r") as f:
        content = f.read()

    # convert "3" -> "03" etc.
    minute, hour = config.MINUTE.rjust(2, "0"), config.HOUR.rjust(2, "0")

    # convert timezone to UTC
    utc_hour = (int(hour) - TIMEZONE_DIFF) % 24
    utc_hour = str(utc_hour).rjust(2, "0")

    return (
        content.replace("__CRON__", f"{minute} {utc_hour} * * *")
        .replace("__NAMESPACE__", config.NAMESPACE)
        .replace("__HPA__", config.HPA)
        .replace("__HOURMIN__", hour + minute)
        .replace("__MIN__", config.MIN_REPLICAS)
        .replace("__MAX__", config.MAX_REPLICAS)
    )


def read_csv(csv_filename: str) -> list[Config]:
    with open(csv_filename) as f:
        reader = csv.reader(f)

        # make sure the CSV header matches the `Config` type
        assert Config._fields == tuple(next(reader))  # RHS: reading the CSV header

        return list(map(Config._make, reader))


def write_yml(filename: str, content: str) -> None:
    with open(filename, "w") as f:
        f.write(content)


def generate_filename(config: Config) -> str:
    return (
        f'cj-{config.HPA}{config.HOUR.rjust(2, "0")}{config.MINUTE.rjust(2, "0")}.yml'
    )


def generate_yml_from_csv(csv_filename: str, outdir: str) -> None:
    os.makedirs(outdir, exist_ok=True)
    for config in read_csv(csv_filename=csv_filename):
        _filename = generate_filename(config)
        _content = substitute_template_yml(config=config)
        write_yml(os.path.join(outdir, _filename), _content)


if __name__ == "__main__":
    generate_yml_from_csv(CRONJOB_CSV_FILE, OUTPUT_DIR)

Let's make it executable:

chmod +x k8s/hpa-scheduler/generate.py
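
As a quick smoke test, we can generate the manifests and check the resulting cron schedules before touching the cluster (a sketch, assuming a UTC+0900 local time zone):

# Generate manifests for UTC+0900 and list the resulting schedules
TIMEZONE_DIFF=9 ./k8s/hpa-scheduler/generate.py
grep -H 'schedule:' k8s/hpa-scheduler/generated/*.yml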

6. Verifying it works

First, let's run the Python script:

TIMEZONE_DIFF=0 ./k8s/hpa-scheduler/generate.py

Change TIMEZONE_DIFF to match your location. On most systems, CronJob schedules are interpreted in UTC, so if your time zone is UTC+0900 you can pass TIMEZONE_DIFF=9. In MicroK8s, however, schedules are interpreted in your local time zone, according to a blog post.
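
If you are unsure of the right value, your local UTC offset can be checked like this:

# Print the local UTC offset; e.g., +0900 means TIMEZONE_DIFF=9
date +%z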

Then, apply the generated CronJob manifests:

kubectl apply -f k8s/hpa-scheduler/generated/

Let's check the status of CronJobs and HPA:

# List CronJobs that target frontend or api
$ kubectl -n todo get cj -l "target in (frontend, api)"
NAME               SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cj-api-0555        55 05 * * *   False     0        <none>          13s
cj-api-1155        55 11 * * *   False     0        <none>          13s
cj-api-1310        10 13 * * *   False     0        <none>          13s
cj-api-2210        10 22 * * *   False     1        4s              13s
cj-frontend-1155   55 11 * * *   False     0        <none>          13s
cj-frontend-1310   10 13 * * *   False     0        <none>          13s

# Get the log of the job that ran 4 seconds ago
$ kubectl -n todo logs cj-api-2210-28126895-572c5
horizontalpodautoscaler.autoscaling/api patched

# Verify that the MINPODS for the hpa/api is reduced from 2 to 1
$ kubectl -n todo get hpa
NAME       REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
frontend   Deployment/frontend   0%/50%    1         10        1          120m
api        Deployment/api        0%/50%    1         10        1          120m

From the above, we can confirm that the CronJob cj-api-2210, scheduled at 22:10 for hpa/api, completed successfully and changed the MINPODS of hpa/api from 2 to 1.
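
If you don't want to wait for the next scheduled run, you can also trigger a CronJob immediately by creating a Job from it (the Job name manual-test is arbitrary):

# Run one of the CronJobs right now instead of waiting for its schedule
kubectl -n todo create job manual-test --from=cronjob/cj-api-1155
kubectl -n todo logs -f job/manual-test
kubectl -n todo delete job manual-test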

7. Clean up

kubectl delete ns todo ingress-nginx

Final Thoughts

This article introduced a simple yet robust way to balance cost and performance: you can still leverage HPA to fine-tune the number of Pods dynamically while maintaining service quality (e.g., response latency). From a learning perspective, writing this article also gave me a chance to dig into ServiceAccount, Role, and CronJob.

Resources