
How to Extend Horizontal Pod AutoScaler With Custom Metrics in Kubernetes?

Kubernetes system basics

As you probably know, a high level of dynamism is one of the most important aspects of a Kubernetes-based system. Almost nothing is constant. You define Deployments or StatefulSets, and Kubernetes distributes the Pods across the cluster. In most cases, those Pods rarely stay in one place for long: rolling updates re-create them and possibly move them to other nodes, any type of failure triggers rescheduling of the affected resources, and a variety of other events set them in motion. A Kubernetes cluster is analogous to a beehive: vibrant and constantly in motion.

Dynamic nature of Kubernetes


kubectl apply -f test1.yml --record

kubectl apply -f scaling/aegis-demo-.yml --record

kubectl -n aegis-demo rollout status deployment

kubectl -n aegis-demo get pods


Where to set replicas?

If the number of replicas is fixed and you do not intend to scale (or de-scale) your application over time, include replicas in your Deployment or StatefulSet definition. Use the HorizontalPodAutoscaler resource instead if you want to change the number of replicas based on memory, CPU, or other metrics.

Example:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: aegis-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 1
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 50

HorizontalPodAutoscaler is used in the definition to target the api Deployment. It has a minimum of one replica and a maximum of six. These are fundamental constraints: without them, we'd risk scaling up to infinity or scaling down to zero replicas. The minReplicas and maxReplicas fields serve as a safety net.

HorizontalPodAutoscaler (HPA) phases

The adoption of HorizontalPodAutoscaler (HPA) usually goes through three phases.

Phase One

The initial stage is discovery. We are usually astounded the first time we discover what it does: "Take a look at this. It automatically scales our applications. I no longer have to be concerned about the number of replicas."

Phase Two

The second stage is application. When we first begin using HPA, we quickly realize that scaling applications based on memory and CPU is often insufficient. Some apps increase their memory and CPU usage as the load increases, while many others do not, or, to be more precise, not proportionally.

For some applications, HPA works well. For many others, it either does not work at all or is insufficient. We'll need to expand HPA thresholds beyond those based on memory and CPU at some point. This stage is marked by disappointment. “It seemed like a good idea at the time, but the majority of our applications are incompatible with it. We must revert to metrics-based alerts and manual changes to the number of replicas.”

Phase Three

Re-discovery is the third stage. We can see from the HPA v2 documentation (which is still in beta at the time of writing) that it allows us to extend it to almost any type of metrics and expression. Using adapters, we can connect HPAs to Prometheus or nearly any other tool. Once we've mastered that, there's almost no limit to the conditions we can set as automatic scaling triggers for our applications. The only limitation is our ability to transform data into custom metrics for Kubernetes.

The next step is to expand HorizontalPodAutoscaler definitions to include conditions based on Prometheus data. But first things first.

Create Cluster

Pulling the code: The vfarcic/k8s-specs repository will continue to serve as our source of Kubernetes definitions. We'll make sure it is up to date by pulling the latest version.

Using HorizontalPodAutoscaler Without Metrics Adapter


Create HPA based on custom metrics

The Prometheus Adapter comes with a set of default rules that offer many metrics we don't need, yet not all the metrics we do need. By doing too much and, at the same time, not enough, it wastes CPU and memory.


The default entry in the rules section has been changed to false. This removes the previously mentioned default rules, allowing us to start from scratch.
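As a sketch, the relevant fragment of the adapter's Helm values might look as follows. The Prometheus address is an assumption; point it at the Prometheus Service in your own cluster.

```yaml
# Fragment of values.yaml for the prometheus-adapter Helm chart.
# The Prometheus URL below is an assumption; adjust it to your cluster.
prometheus:
  url: http://prometheus-server.metrics.svc
  port: 80
rules:
  # Disable the built-in rules so we can start from scratch
  # and define only the metrics we actually need.
  default: false
```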

Custom Rules

The first rule is based on the seriesQuery value nginx_ingress_controller_requests. The overrides entry in the resources section helps the adapter determine which Kubernetes resources are associated with the metric. The value of the namespace label is mapped to the namespace resource. There is a comparable entry for ingress. In other words, we're linking Kubernetes resources such as namespace and ingress to Prometheus labels.

The metric will be part of a bigger query that HPA will regard as a single metric, as you will see shortly. Since it is a brand-new creation, we need a name for it, so we have defined a single entry with the value http_req_per_second in the name section. That will serve as the foundation for our HPA definitions.

You already know that nginx_ingress_controller_requests is of little use on its own. When we used it in Prometheus, we had to wrap it in a rate function, sum everything, and group the results by resource. We're doing something similar with the metricsQuery entry.

Consider it the Prometheus equivalent of the expressions we're writing, except that we use "special" syntax such as <<.Series>>. That is the adapter's templating mechanism. Instead of hard-coding the metric's name, labels, and group-by statements, we have the <<.Series>>, <<.LabelMatchers>>, and <<.GroupBy>> clauses, which will be populated with the correct values based on what we put in API calls.
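Putting those pieces together, a rule like the one described might look roughly like this. It is a sketch: the label names must match what your NGINX Ingress Controller actually exposes, and the 5m rate window is an assumption.

```yaml
rules:
  custom:
  # Discover the raw series exported by the NGINX Ingress Controller.
  - seriesQuery: 'nginx_ingress_controller_requests'
    resources:
      overrides:
        # Link Prometheus labels to Kubernetes resources.
        namespace: {resource: "namespace"}
        ingress: {resource: "ingress"}
    name:
      # Expose the result under a new metric name.
      as: "http_req_per_second"
    # Wrap the series in rate(), sum it, and group by resource;
    # the <<...>> clauses are filled in by the adapter's templating.
    metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>)'
```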

Five minutes or more have probably passed since we sent a hundred requests. If they haven't, congratulations on being a quick reader, but you'll have to wait a bit longer before we send another hundred. We're ready to create our first HorizontalPodAutoscaler (HPA) based on custom metrics, and I'd like to show you how it behaves both before and after it is activated.

[Screenshot: HorizontalPodAutoscaler definition based on custom metrics]

Because it is similar to what we used previously, the first half of the definition should be familiar. It can scale the aegis-demo Deployment between three and ten replicas. The new content is in the metrics section.

Previously, we used spec.metrics.type = Resource to set CPU and memory targets. This time, however, our type is Object. It refers to a metric that describes a single Kubernetes object, in this case a custom metric coming from Prometheus.

As you can see, the fields of the Object type differ from those of the Resource type. The metricName is the metric we defined in the Prometheus Adapter (http_req_per_second_per_replica). Remember that it is not a plain metric: the adapter uses an expression to fetch data from Prometheus and converts the result into a custom metric. In this scenario, we get the number of requests entering an Ingress resource, divided by the number of Deployment replicas.
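The definition applied below (aegis-demo-hpa-ing.yml) might look roughly like this sketch. The Ingress name and the 50m target are assumptions based on the describe output discussed in this article.

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: aegis-demo
  namespace: aegis-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aegis-demo
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      # Custom metric served by the Prometheus Adapter.
      metricName: http_req_per_second_per_replica
      target:
        apiVersion: extensions/v1beta1
        kind: Ingress
        name: aegis-demo
      # 50m = 0.05 requests per second per replica.
      targetValue: 50m
```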

kubectl -n aegis-demo apply -f aegis-demo-hpa-ing.yml

Following that, we'll describe the newly formed HPA and see if we can find anything noteworthy.

kubectl -n aegis-demo describe hpa aegis-demo

Output:

Min replicas: 3

Max replicas: 10

In the Metrics section, we can see that there is only one entry. The HPA uses the custom metric http_req_per_second_per_replica based on Namespace/aegis-demo. The target is 0.05 requests per second, while the current value is zero. If the current value is unknown in your case, wait a few moments and re-run the command.

Further down, we can see that both the current and the desired number of Deployment Pods are set to three.

The HPA has no reason to intervene because the target has not been reached (there are 0 requests), so the number of replicas stays at the minimum.

Let's add some spice by driving some traffic.

for i in {1..70}; do curl "http://localhost:8080/demo/hello"; done

We sent 70 requests to the aegis-demo Ingress. Let's describe the HPA again and see whether anything changed: kubectl -n aegis-demo describe hpa aegis-demo

[Screenshot: output of kubectl -n aegis-demo describe hpa aegis-demo]

As you can see, the current value of the metric has increased. In my case, it's 138m (0.138 requests per second). If your output still shows zero, wait until Prometheus scrapes the metrics, the adapter collects them, and the HPA refreshes its status. In other words, wait a few moments and re-run the preceding command.

Ingress Configuration:

Ingress offers HTTP and HTTPS routes to services within the cluster from outside the cluster. Rules specified on the Ingress resource control traffic routing.

External traffic
      |
 [ Ingress ]
 --|-----|--
 [ Services ]

An Ingress can be set up to provide Services with externally accessible URLs, load balance traffic, terminate SSL/TLS, and provide name-based virtual hosting. The Ingress is fulfilled by an Ingress controller, which is commonly a load balancer. An NSX-T Ingress controller is embedded into every NSX-T PKS cluster, so for external access, applications just need to create Ingress objects.


This default Ingress controller will be part of every cluster.

Hostname Based Routing Concepts:

HTTP traffic can be routed to numerous host names at the same IP address using name-based virtual hosts.

aegis.demo.com --|                      |-> aegis.demo.com  aegis-svc:80
                 |  Layer 7 Ingress LB  |
demo.aegis.com --|                      |-> demo.aegis.com  aegis-test-svc:80
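A sketch of a name-based virtual hosting Ingress matching the routing above; the resource name aegis-name-based is illustrative, while the hostnames and Service names follow this article's examples.

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: aegis-name-based
spec:
  rules:
  # Requests for aegis.demo.com go to aegis-svc.
  - host: aegis.demo.com
    http:
      paths:
      - backend:
          serviceName: aegis-svc
          servicePort: 80
  # Requests for demo.aegis.com go to aegis-test-svc.
  - host: demo.aegis.com
    http:
      paths:
      - backend:
          serviceName: aegis-test-svc
          servicePort: 80
```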

URI Based Routing

Based on the HTTP URI being requested, a fanout configuration routes traffic from a single IP address to multiple Services.

aegis.demo.com -> Layer 7 Ingress LB -> /test   aegis-test-svc:80
                                     -> /test1  aegis-svc:80

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: aegis-ingress
spec:
  rules:
  - host: aegis.demo.com
    http:
      paths:
      - path: /test
        backend:
          serviceName: aegis-test-svc
          servicePort: 80
      - path: /test1
        backend:
          serviceName: aegis-svc
          servicePort: 80

The paths /test and /test1 will be rewritten to / before the URL is sent to the backend service.
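Note that path rewriting is not automatic with every controller. With the NGINX Ingress Controller, for example, it is typically enabled through an annotation; a sketch of the relevant metadata fragment:

```yaml
metadata:
  name: aegis-ingress
  annotations:
    # Rewrite the matched path to / before forwarding to the backend.
    nginx.ingress.kubernetes.io/rewrite-target: /
```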

SSL Certificate Creation Steps:

Generate self-signed cert
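A minimal sketch using openssl, assuming the hostname aegis.demo.com and the Secret name aegis-secret from this article's examples:

```shell
# Generate a self-signed certificate and private key for aegis.demo.com.
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout tls.key -out tls.crt \
  -subj "/CN=aegis.demo.com"

# Store the pair in the Secret referenced by the TLS Ingress definition
# (requires access to a cluster, hence commented out here):
# kubectl -n aegis-demo create secret tls aegis-secret --key tls.key --cert tls.crt
```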

TLS Ingress YAML file

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: aegis-secret
spec:
  tls:
  - hosts:
    - aegis.demo.com
    secretName: aegis-secret
  rules:
  - host: aegis.demo.com
    http:
      paths:
      - path: /
        backend:
          serviceName: nginx-service
          servicePort: 80

Output:

[Screenshot: output showing the TLS-enabled Ingress]
