From faba95db42539fb6aa206ba8a41501e120c57c44 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Tue, 4 Jun 2024 17:53:38 +0530 Subject: [PATCH 01/34] Autoscaler helpers --- Autoscaler101/helpers.md | 7 +++++++ 1 file changed, 7 insertions(+) create mode 100644 Autoscaler101/helpers.md diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md new file mode 100644 index 00000000..1e324851 --- /dev/null +++ b/Autoscaler101/helpers.md @@ -0,0 +1,7 @@ +# Autoscaling helpers + +We have now discussed HPA's, VPA's, and you might have even read the section on [KEDA](../Keda101/what-is-keda.md) to learn about advanced scaling. Now that you know all there is to about scaling, let's take a step back and look at a few important things to consider when it comes to scaling. In this section, we will discuss: + +- Readiness/liveness/startup probes +- Graceful shutdowns +- Annotations that help with scaling. \ No newline at end of file From bb113df187b07d78c1f0b8f5d543c8b1a4e3143e Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Thu, 6 Jun 2024 17:35:08 +0530 Subject: [PATCH 02/34] Autoscaler helpers --- Autoscaler101/helpers.md | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 1e324851..d58c5805 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -4,4 +4,16 @@ We have now discussed HPA's, VPA's, and you might have even read the section on - Readiness/liveness/startup probes - Graceful shutdowns -- Annotations that help with scaling. \ No newline at end of file +- Annotations that help with scaling. + +You may have already come across these concepts before, and just about every Kubernetes-based tool uses them to ensure stability. We will discuss each of the above points and follow up with a lab where we test out the above concepts using a simple Nginx server. + +# Probes + +What are readiness/liveness/startup probes and why are they useful for autoscaling? Let's break down each type of probe. + +- Readiness probe: As the name suggests, this probe checks to ensure that your container is ready. + +In order to do this, you could implement several methods. The simplest and most frequently used is the http get method. You simply point the readiness probe at your containers' endpoint, then have the probe ping it. If the response to the ping is 200 OK, your pod is ready to start receiving traffic. This is incredibly useful since it's rare that your application is ready to go immediately after starting up. Usually, the application needs to establish connections with databases, contact other microservices to get some starting information, or even run entire startup scripts to prepare the application. So it may take a couple of seconds to a couple of minutes for your application to be ready to take traffic. If any requests come in within this period, they will be dropped. With the readiness probe, you can be assured that this won't happen. + +Apart from a simple HTTP get requests, you could also run TCP commands to see if ports are up, or even run a whole bash script that executes all manner of commands to determine whether your pod is ready. \ No newline at end of file From bef00eff790110c3acdae5d584613b6811819b83 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Sat, 8 Jun 2024 12:08:09 +0530 Subject: [PATCH 03/34] helpers cont. 
--- Autoscaler101/helpers.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index d58c5805..0db9b790 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -16,4 +16,8 @@ What are readiness/liveness/startup probes and why are they useful for autoscali In order to do this, you could implement several methods. The simplest and most frequently used is the http get method. You simply point the readiness probe at your containers' endpoint, then have the probe ping it. If the response to the ping is 200 OK, your pod is ready to start receiving traffic. This is incredibly useful since it's rare that your application is ready to go immediately after starting up. Usually, the application needs to establish connections with databases, contact other microservices to get some starting information, or even run entire startup scripts to prepare the application. So it may take a couple of seconds to a couple of minutes for your application to be ready to take traffic. If any requests come in within this period, they will be dropped. With the readiness probe, you can be assured that this won't happen. -Apart from a simple HTTP get requests, you could also run TCP commands to see if ports are up, or even run a whole bash script that executes all manner of commands to determine whether your pod is ready. \ No newline at end of file +Apart from a simple HTTP get requests, you could also run TCP commands to see if ports are up, or even run a whole bash script that executes all commands to determine whether your pod is ready. However, this probe only continually checks to see if your app is ready to take requests. It blocks off any traffic if it starts to notice that the probe is failing. If you only have a readiness probe in place, even if your app has gone into an error state, Kubernetes will only prevent traffic from entering that pod until the probe starts to pass. It will not restart the failed application for you. This is where liveness probes come in. + +- Liveness probe: Check if your application is alive. + +A liveness and readiness probe do almost the same thing, except a liveness probe restarts the pod if it starts failing, unlike the readiness probe which only stops traffic to the pod until the probe starts succeeding. \ No newline at end of file From bcc7a4144c8474dc844adb5fec432da0ac8138b4 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Sun, 9 Jun 2024 11:15:29 +0530 Subject: [PATCH 04/34] Probes cont. --- Autoscaler101/helpers.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 0db9b790..c92032aa 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -20,4 +20,8 @@ Apart from a simple HTTP get requests, you could also run TCP commands to see if - Liveness probe: Check if your application is alive. -A liveness and readiness probe do almost the same thing, except a liveness probe restarts the pod if it starts failing, unlike the readiness probe which only stops traffic to the pod until the probe starts succeeding. \ No newline at end of file +A liveness and readiness probe do almost the same thing, except a liveness probe restarts the pod if it starts failing, unlike the readiness probe which only stops traffic to the pod until the probe starts succeeding. This means that the liveness probe should come after the readiness probe. 
You could say something like: if my container's port 8080 isn't being reached, stop sending traffic (readiness probe). If it is still unreachable after 1 minute, fail the liveness probe and restart the pod since the container has likely crashed, gone OOM, or is meeting some other pod or node constraints. + +- Startup probe: A probe similar to the other two, but only runs on startup. + +If your pod takes a while to initialize, it's best to use startup probes. Startup probes ensure that your pod started correctly. You can even use the same endpoint as the liveness probe but with a less strict wait time. When your pod is already running, you don't expect the endpoint to go down for more than a few seconds if at all. However, when starting up, you can expect it to be down until your application finishes initializing. This is why there is a separate startup probe instead of re-using the existing liveness probe. \ No newline at end of file From 8c9e5065b4abdc169f2c9679d4c1fd999971fc5c Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Mon, 10 Jun 2024 17:58:13 +0530 Subject: [PATCH 05/34] Autoscaler helpers --- Autoscaler101/helpers.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index c92032aa..9b46b047 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -24,4 +24,6 @@ A liveness and readiness probe do almost the same thing, except a liveness probe - Startup probe: A probe similar to the other two, but only runs on startup. -If your pod takes a while to initialize, it's best to use startup probes. Startup probes ensure that your pod started correctly. You can even use the same endpoint as the liveness probe but with a less strict wait time. When your pod is already running, you don't expect the endpoint to go down for more than a few seconds if at all. However, when starting up, you can expect it to be down until your application finishes initializing. This is why there is a separate startup probe instead of re-using the existing liveness probe. \ No newline at end of file +If your pod takes a while to initialize, it's best to use startup probes. Startup probes ensure that your pod started correctly. You can even use the same endpoint as the liveness probe but with a less strict wait time. When your pod is already running, you don't expect the endpoint to go down for more than a few seconds if at all. However, when starting up, you can expect it to be down until your application finishes initializing. This is why there is a separate startup probe instead of re-using the existing liveness probe. + +So how do these probes help with autoscaling? In the case where replicas of pods increase and decrease meaning that instances of your application are provisioned and de-provisioned, you need to make sure there is no downtime. This is where all the above probes come into play. When the load into your application increases and replicas of your pods show up, you don't want any traffic served until they are ready. If they have issues getting prepared and don't start after a while, you want them to restart and try to auto-recover. Finally, if a pod fails after running for a while, you want traffic to be blocked off and that pod restarted. This is why these probes are necessary for autoscaling. 
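
To make the three probes concrete, here is a rough sketch of how they can be declared together on a single container. This is only an illustration, not the lab manifest: the image tag, port, paths, and timings are all assumptions (a plain Nginx server answering on `/` is used here since the lab works with a simple Nginx server):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-probes               # hypothetical name, for illustration only
spec:
  containers:
  - name: nginx
    image: nginx:1.25              # assumed image tag
    ports:
    - containerPort: 80
    startupProbe:                  # only runs while the container is starting up
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
      failureThreshold: 30         # allows up to 30 x 5s = 150s for initialization
    readinessProbe:                # gates traffic: the pod only receives requests while this passes
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
      failureThreshold: 3
    livenessProbe:                 # restarts the container if it keeps failing
      httpGet:
        path: /
        port: 80
      periodSeconds: 10
      failureThreshold: 6          # roughly a minute of failures before a restart
```

The startup probe suppresses the other two until it first succeeds, which is what lets you keep the liveness timings strict without killing a container that is simply slow to start.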
\ No newline at end of file From 3ee2131510c73ecebd805db860243ea29ad7fdcf Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Tue, 11 Jun 2024 14:10:31 +0530 Subject: [PATCH 06/34] Autoscaler helpers --- Autoscaler101/helpers.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 9b46b047..cf695f84 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -26,4 +26,10 @@ A liveness and readiness probe do almost the same thing, except a liveness probe If your pod takes a while to initialize, it's best to use startup probes. Startup probes ensure that your pod started correctly. You can even use the same endpoint as the liveness probe but with a less strict wait time. When your pod is already running, you don't expect the endpoint to go down for more than a few seconds if at all. However, when starting up, you can expect it to be down until your application finishes initializing. This is why there is a separate startup probe instead of re-using the existing liveness probe. -So how do these probes help with autoscaling? In the case where replicas of pods increase and decrease meaning that instances of your application are provisioned and de-provisioned, you need to make sure there is no downtime. This is where all the above probes come into play. When the load into your application increases and replicas of your pods show up, you don't want any traffic served until they are ready. If they have issues getting prepared and don't start after a while, you want them to restart and try to auto-recover. Finally, if a pod fails after running for a while, you want traffic to be blocked off and that pod restarted. This is why these probes are necessary for autoscaling. \ No newline at end of file +So how do these probes help with autoscaling? In the case where replicas of pods increase and decrease meaning that instances of your application are provisioned and de-provisioned, you need to make sure there is no downtime. This is where all the above probes come into play. When the load into your application increases and replicas of your pods show up, you don't want any traffic served until they are ready. If they have issues getting prepared and don't start after a while, you want them to restart and try to auto-recover. Finally, if a pod fails after running for a while, you want traffic to be blocked off and that pod restarted. This is why these probes are necessary for autoscaling. + +## Graceful shutdowns + +Now let's take a look at graceful shutdowns. If you were running a website that had high traffic and your pods scaled up during high traffic, they must scale back down after a while to make sure that your infrastructure costs are kept as efficient as possible. However, if your Kubernetes configuration was to immediately kill the pod off while the traffic was being served, that might result in a few requests being dropped. This is where graceful shutdowns are needed. + +Depending on the type of web application you are running, you may not need to configure graceful shutdowns from the Kubernetes configuration. Instead, the application framework itself might be able to intercept the shutdown signal Kubernetes sends and automatically prevent the application from receiving any new traffic. For example, in SpringBoot, you can enable graceful shutdowns simply by adding the config `server.shutdown=graceful` into your application config. 
However, if your application framework doesn't support something like this, or you prefer to keep your Kubernetes configurations and application configurations separate, you might consider creating a `shutdown` endpoint. \ No newline at end of file From fc136091fe3da4c3e72265c98ac261e53b4083c8 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Thu, 13 Jun 2024 18:32:26 +0530 Subject: [PATCH 07/34] Nodepool added --- Autoscaler101/helpers.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index cf695f84..fdb5a6f5 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -32,4 +32,6 @@ So how do these probes help with autoscaling? In the case where replicas of pods Now let's take a look at graceful shutdowns. If you were running a website that had high traffic and your pods scaled up during high traffic, they must scale back down after a while to make sure that your infrastructure costs are kept as efficient as possible. However, if your Kubernetes configuration was to immediately kill the pod off while the traffic was being served, that might result in a few requests being dropped. This is where graceful shutdowns are needed. -Depending on the type of web application you are running, you may not need to configure graceful shutdowns from the Kubernetes configuration. Instead, the application framework itself might be able to intercept the shutdown signal Kubernetes sends and automatically prevent the application from receiving any new traffic. For example, in SpringBoot, you can enable graceful shutdowns simply by adding the config `server.shutdown=graceful` into your application config. However, if your application framework doesn't support something like this, or you prefer to keep your Kubernetes configurations and application configurations separate, you might consider creating a `shutdown` endpoint. \ No newline at end of file +Depending on the type of web application you are running, you may not need to configure graceful shutdowns from the Kubernetes configuration. Instead, the application framework itself might be able to intercept the shutdown signal Kubernetes sends and automatically prevent the application from receiving any new traffic. For example, in SpringBoot, you can enable graceful shutdowns simply by adding the config `server.shutdown=graceful` into your application config. However, if your application framework doesn't support something like this, or you prefer to keep your Kubernetes and application configurations separate, you might consider creating a `shutdown` endpoint. We will do this during the lab. + +While microservices generally take in traffic through their endpoints, your application might differ. Your application might do batch processing by reading messages off RabbitMQ, or it might occasionally read a database and transform the data within it. In cases like this, having the pod or job terminated for scaling considerations might leave your database table in an unstable state, or it might mean that the message your pod was processing never ends up finishing. In any of these cases, graceful shutdowns can keep the pod from terminating long enough for your pod to either finish what it started or ensure a different pod can pick up where it left off. 
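
On the Kubernetes side, graceful shutdown behaviour usually comes down to two pod-spec fields: `terminationGracePeriodSeconds`, which caps how long the pod is given to exit after termination is triggered before it is force-killed, and an optional `preStop` hook that runs before SIGTERM is delivered to the container. The sketch below is an assumption-heavy illustration (the container name, image, port, and `/shutdown` path are placeholders, and it assumes `curl` is available in the image), not the exact setup used in the lab:

```yaml
spec:
  terminationGracePeriodSeconds: 60      # give in-flight work up to 60s before SIGKILL
  containers:
  - name: myapp                          # placeholder container name
    image: myapp:latest                  # placeholder image
    lifecycle:
      preStop:
        exec:
          # Sleep briefly so endpoint controllers have time to stop routing new
          # traffic to this pod, then ask the app to drain via its shutdown endpoint.
          command: ["sh", "-c", "sleep 5; curl -s -X POST http://localhost:8080/shutdown || true"]
```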
\ No newline at end of file From 1db32b068420be788260b447e7747eca90deb5d6 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Fri, 14 Jun 2024 17:13:07 +0530 Subject: [PATCH 08/34] Nodepool added --- Autoscaler101/helpers.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index fdb5a6f5..225670c7 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -34,4 +34,13 @@ Now let's take a look at graceful shutdowns. If you were running a website that Depending on the type of web application you are running, you may not need to configure graceful shutdowns from the Kubernetes configuration. Instead, the application framework itself might be able to intercept the shutdown signal Kubernetes sends and automatically prevent the application from receiving any new traffic. For example, in SpringBoot, you can enable graceful shutdowns simply by adding the config `server.shutdown=graceful` into your application config. However, if your application framework doesn't support something like this, or you prefer to keep your Kubernetes and application configurations separate, you might consider creating a `shutdown` endpoint. We will do this during the lab. -While microservices generally take in traffic through their endpoints, your application might differ. Your application might do batch processing by reading messages off RabbitMQ, or it might occasionally read a database and transform the data within it. In cases like this, having the pod or job terminated for scaling considerations might leave your database table in an unstable state, or it might mean that the message your pod was processing never ends up finishing. In any of these cases, graceful shutdowns can keep the pod from terminating long enough for your pod to either finish what it started or ensure a different pod can pick up where it left off. \ No newline at end of file +While microservices generally take in traffic through their endpoints, your application might differ. Your application might do batch processing by reading messages off RabbitMQ, or it might occasionally read a database and transform the data within it. In cases like this, having the pod or job terminated for scaling considerations might leave your database table in an unstable state, or it might mean that the message your pod was processing never ends up finishing. In any of these cases, graceful shutdowns can keep the pod from terminating long enough for your pod to either finish what it started or ensure a different pod can pick up where it left off. + +If the jobs you are running are mission-critical, and each of your jobs must run to completion, then even graceful shutdowns might not be enough. In this case, you can turn to annotations to help you out. + +## Annotations + +Annotations are a very powerful tool that you can use in your Kubernetes environments to fine-tune various aspects of how Kubernetes works. If, as mentioned above, you need to make sure that your critical job runs to completion regardless of the node cost, then you might want to make sure that the node that is running your job does not de-provision while the job is still running on it. You can do this by adding the below annotation: + +annotations: + "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" \ No newline at end of file From f86f0fff8b4a9118aa8054e2f6985e2bf552201e Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Sat, 15 Jun 2024 11:37:44 +0530 Subject: [PATCH 09/34] Helpers cont. 
--- Autoscaler101/helpers.md | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 225670c7..4ad809e6 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -42,5 +42,21 @@ If the jobs you are running are mission-critical, and each of your jobs must run Annotations are a very powerful tool that you can use in your Kubernetes environments to fine-tune various aspects of how Kubernetes works. If, as mentioned above, you need to make sure that your critical job runs to completion regardless of the node cost, then you might want to make sure that the node that is running your job does not de-provision while the job is still running on it. You can do this by adding the below annotation: +``` annotations: - "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" \ No newline at end of file + "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" +``` + +This will ensure that your node stays up even if there is only 1 job running on it. This will certainly increase the cost of your infrastructure since normally, Kubernetes would relocate jobs and de-provision nodes to increase resource efficiency. It will only shut down the node once no jobs are running that have this annotation left. However, if you don't want the nodes to shut at all, you can add a different annotation that ensures that your nodes never scale down: + +``` +cluster-autoscaler.kubernetes.io/scale-down-disabled +``` + +This annotation should be applied directly to a node like so: + +``` +kubectl annotate node my-node cluster-autoscaler.kubernetes.io/scale-down-disabled=true +``` + +Obviously, this is not a recommended option unless you have no other choice regarding the severity of your application. Ideally, your jobs should be able to handle shutdowns gracefully, and any jobs that start in place of the old ones should be able to complete what the previous job was doing. From c7044a2b8992d927b59c6e85fc43455f10be8caf Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Mon, 17 Jun 2024 17:52:34 +0530 Subject: [PATCH 10/34] Autoscaler helpers --- Autoscaler101/helpers.md | 139 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 139 insertions(+) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 4ad809e6..c2a031f5 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -60,3 +60,142 @@ kubectl annotate node my-node cluster-autoscaler.kubernetes.io/scale-down-disabl ``` Obviously, this is not a recommended option unless you have no other choice regarding the severity of your application. Ideally, your jobs should be able to handle shutdowns gracefully, and any jobs that start in place of the old ones should be able to complete what the previous job was doing. + +Kubernetes annotations are key-value pairs that can be added to Kubernetes resources for various purposes, including influencing autoscaling behavior. Here are some commonly used annotations related to autoscaling: + +### Horizontal Pod Autoscaler (HPA) + +1. **Resource Limits and Requests**: + - Ensure that your Pods have resource requests and limits set, which the HPA can use to scale based on CPU and memory usage. + ```yaml + spec: + containers: + - name: myapp + image: myapp:latest + resources: + requests: + cpu: "100m" + memory: "200Mi" + limits: + cpu: "500m" + memory: "500Mi" + ``` + +2. 
**Custom Metrics**: + - When using custom metrics for autoscaling, you might need to annotate your deployment to specify which metrics to use. + ```yaml + metadata: + annotations: + autoscaling.alpha.kubernetes.io/metrics: '[{"type": "Resource", "resource": {"name": "cpu", "targetAverageUtilization": 80}}]' + ``` + +### Cluster Autoscaler + +1. **Pod Priority**: + - Influence the Cluster Autoscaler by specifying pod priority, ensuring critical pods get scheduled first. + ```yaml + spec: + priorityClassName: high-priority + ``` + +2. **Pod Disruption Budget (PDB)**: + - Define a PDB to control the number of pods that can be disrupted during scaling activities. + ```yaml + apiVersion: policy/v1 + kind: PodDisruptionBudget + metadata: + name: myapp-pdb + spec: + minAvailable: 80% + selector: + matchLabels: + app: myapp + ``` + +3. **Autoscaler Behavior**: + - Use annotations to modify the behavior of the Cluster Autoscaler for specific node groups. + ```yaml + metadata: + annotations: + cluster-autoscaler.kubernetes.io/safe-to-evict: "false" + ``` + +4. **Scale-down Disabled**: + - Prevent the Cluster Autoscaler from scaling down specific nodes or node groups. + ```yaml + metadata: + annotations: + cluster-autoscaler.kubernetes.io/scale-down-disabled: "true" + ``` + +### Node Autoscaling + +1. **Taints and Tolerations**: + - Use taints and tolerations to influence scheduling and scaling behaviors, ensuring only appropriate pods are scheduled on specific nodes. + ```yaml + spec: + taints: + - key: dedicated + value: myapp + effect: NoSchedule + ``` + +2. **Node Affinity**: + - Define node affinity rules to influence where pods are scheduled, which indirectly affects autoscaling decisions. + ```yaml + spec: + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: kubernetes.io/e2e-az-name + operator: In + values: + - e2e-az1 + - e2e-az2 + ``` + +3. **Karpenter Specific Annotations**: + - For users of Karpenter, specific annotations can control aspects of autoscaling behavior. + ```yaml + metadata: + annotations: + karpenter.sh/capacity-type: "spot" + karpenter.sh/instance-profile: "my-instance-profile" + ``` + +### Example of a Deployment with Autoscaling Annotations + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: myapp + annotations: + cluster-autoscaler.kubernetes.io/safe-to-evict: "false" +spec: + replicas: 3 + selector: + matchLabels: + app: myapp + template: + metadata: + labels: + app: myapp + annotations: + autoscaling.alpha.kubernetes.io/metrics: '[{"type": "Resource", "resource": {"name": "cpu", "targetAverageUtilization": 80}}]' + spec: + containers: + - name: myapp + image: myapp:latest + resources: + requests: + cpu: "100m" + memory: "200Mi" + limits: + cpu: "500m" + memory: "500Mi" +``` + +These annotations and configurations can significantly impact the autoscaling behavior of your Kubernetes cluster, allowing for more fine-grained control over resource allocation and scaling policies. 
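
One detail the pod priority snippet above glosses over: the `high-priority` class referenced by `priorityClassName` has to exist as a `PriorityClass` object in the cluster. A minimal sketch of what that could look like (the name, value, and description here are assumptions for illustration):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority              # must match the priorityClassName used in the pod spec
value: 1000000                     # pods with higher values are scheduled (and preempt) first
globalDefault: false               # only pods that explicitly reference this class get the priority
description: "Externally facing workloads that should be scheduled ahead of batch jobs."
```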
\ No newline at end of file From 2b0bb852fc3d219dd2b1ec13cc519f35b5e29b19 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Tue, 18 Jun 2024 17:28:10 +0530 Subject: [PATCH 11/34] Autoscaler helpers --- Autoscaler101/helpers.md | 65 ++++------------------------------------ 1 file changed, 6 insertions(+), 59 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index c2a031f5..197368be 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -61,33 +61,13 @@ kubectl annotate node my-node cluster-autoscaler.kubernetes.io/scale-down-disabl Obviously, this is not a recommended option unless you have no other choice regarding the severity of your application. Ideally, your jobs should be able to handle shutdowns gracefully, and any jobs that start in place of the old ones should be able to complete what the previous job was doing. -Kubernetes annotations are key-value pairs that can be added to Kubernetes resources for various purposes, including influencing autoscaling behavior. Here are some commonly used annotations related to autoscaling: +Another annotation that help with autoscaling is the `autoscaling.alpha.kubernetes.io/metrics` annotation which allows you to specify custom metrics for autoscaling, like so: -### Horizontal Pod Autoscaler (HPA) - -1. **Resource Limits and Requests**: - - Ensure that your Pods have resource requests and limits set, which the HPA can use to scale based on CPU and memory usage. - ```yaml - spec: - containers: - - name: myapp - image: myapp:latest - resources: - requests: - cpu: "100m" - memory: "200Mi" - limits: - cpu: "500m" - memory: "500Mi" - ``` - -2. **Custom Metrics**: - - When using custom metrics for autoscaling, you might need to annotate your deployment to specify which metrics to use. - ```yaml - metadata: - annotations: - autoscaling.alpha.kubernetes.io/metrics: '[{"type": "Resource", "resource": {"name": "cpu", "targetAverageUtilization": 80}}]' - ``` +```yaml +metadata: + annotations: + autoscaling.alpha.kubernetes.io/metrics: '[{"type": "Resource", "resource": {"name": "cpu", "targetAverageUtilization": 80}}]' +``` ### Cluster Autoscaler @@ -165,37 +145,4 @@ Kubernetes annotations are key-value pairs that can be added to Kubernetes resou karpenter.sh/instance-profile: "my-instance-profile" ``` -### Example of a Deployment with Autoscaling Annotations - -```yaml -apiVersion: apps/v1 -kind: Deployment -metadata: - name: myapp - annotations: - cluster-autoscaler.kubernetes.io/safe-to-evict: "false" -spec: - replicas: 3 - selector: - matchLabels: - app: myapp - template: - metadata: - labels: - app: myapp - annotations: - autoscaling.alpha.kubernetes.io/metrics: '[{"type": "Resource", "resource": {"name": "cpu", "targetAverageUtilization": 80}}]' - spec: - containers: - - name: myapp - image: myapp:latest - resources: - requests: - cpu: "100m" - memory: "200Mi" - limits: - cpu: "500m" - memory: "500Mi" -``` - These annotations and configurations can significantly impact the autoscaling behavior of your Kubernetes cluster, allowing for more fine-grained control over resource allocation and scaling policies. 
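
A word of caution on the `autoscaling.alpha.kubernetes.io/metrics` annotation shown earlier: it dates from the alpha days of the HPA, and on most current clusters the same intent is expressed directly in an `autoscaling/v2` HorizontalPodAutoscaler object rather than an annotation. A rough sketch of the equivalent CPU target, assuming a Deployment named `myapp`:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa                  # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp                    # assumed Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80     # same 80% CPU target as in the annotation above
```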
\ No newline at end of file From 7207c0517bbc34339c5de9a61a854af475247bea Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Wed, 19 Jun 2024 17:39:50 +0530 Subject: [PATCH 12/34] Autoscaler helpers --- Autoscaler101/helpers.md | 31 ++++--------------------------- 1 file changed, 4 insertions(+), 27 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 197368be..5ecff7bd 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -5,6 +5,7 @@ We have now discussed HPA's, VPA's, and you might have even read the section on - Readiness/liveness/startup probes - Graceful shutdowns - Annotations that help with scaling. +- Pod priority/disruption You may have already come across these concepts before, and just about every Kubernetes-based tool uses them to ensure stability. We will discuss each of the above points and follow up with a lab where we test out the above concepts using a simple Nginx server. @@ -61,7 +62,7 @@ kubectl annotate node my-node cluster-autoscaler.kubernetes.io/scale-down-disabl Obviously, this is not a recommended option unless you have no other choice regarding the severity of your application. Ideally, your jobs should be able to handle shutdowns gracefully, and any jobs that start in place of the old ones should be able to complete what the previous job was doing. -Another annotation that help with autoscaling is the `autoscaling.alpha.kubernetes.io/metrics` annotation which allows you to specify custom metrics for autoscaling, like so: +Another annotation that helps with autoscaling is the `autoscaling.alpha.kubernetes.io/metrics` annotation which allows you to specify custom metrics for autoscaling, like so: ```yaml metadata: @@ -69,6 +70,8 @@ metadata: autoscaling.alpha.kubernetes.io/metrics: '[{"type": "Resource", "resource": {"name": "cpu", "targetAverageUtilization": 80}}]' ``` +Now that we have looked at annotations, let's look at how pod priority and disruptions budgets can help you with scaling. + ### Cluster Autoscaler 1. **Pod Priority**: @@ -92,34 +95,8 @@ metadata: app: myapp ``` -3. **Autoscaler Behavior**: - - Use annotations to modify the behavior of the Cluster Autoscaler for specific node groups. - ```yaml - metadata: - annotations: - cluster-autoscaler.kubernetes.io/safe-to-evict: "false" - ``` - -4. **Scale-down Disabled**: - - Prevent the Cluster Autoscaler from scaling down specific nodes or node groups. - ```yaml - metadata: - annotations: - cluster-autoscaler.kubernetes.io/scale-down-disabled: "true" - ``` - ### Node Autoscaling -1. **Taints and Tolerations**: - - Use taints and tolerations to influence scheduling and scaling behaviors, ensuring only appropriate pods are scheduled on specific nodes. - ```yaml - spec: - taints: - - key: dedicated - value: myapp - effect: NoSchedule - ``` - 2. **Node Affinity**: - Define node affinity rules to influence where pods are scheduled, which indirectly affects autoscaling decisions. 
```yaml From c41e9d1a2383598c74c47c4884970af6a0363d03 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Thu, 20 Jun 2024 17:08:59 +0530 Subject: [PATCH 13/34] Autoscaler helpers --- Autoscaler101/helpers.md | 46 +++++++++++++++++++++++----------------- 1 file changed, 26 insertions(+), 20 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 5ecff7bd..0d2c529f 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -72,28 +72,34 @@ metadata: Now that we have looked at annotations, let's look at how pod priority and disruptions budgets can help you with scaling. -### Cluster Autoscaler +**Pod Priority**: -1. **Pod Priority**: - - Influence the Cluster Autoscaler by specifying pod priority, ensuring critical pods get scheduled first. - ```yaml - spec: - priorityClassName: high-priority - ``` +You can influence the Cluster Autoscaler by specifying pod priority, and ensuring critical pods get scheduled first. -2. **Pod Disruption Budget (PDB)**: - - Define a PDB to control the number of pods that can be disrupted during scaling activities. - ```yaml - apiVersion: policy/v1 - kind: PodDisruptionBudget - metadata: - name: myapp-pdb - spec: - minAvailable: 80% - selector: - matchLabels: - app: myapp - ``` + ```yaml +  spec: +    priorityClassName: high-priority + ``` + +If you have an application that handles all incoming traffic and then routes it to a second application, you would want the pods of the external-facing application that handles traffic to have more priority when scheduling. If you have jobs that run batch workloads, they might take lesser priority compared to pods that handle your active users. + +Earlier, we discussed using annotations to prevent disruptions due to scaling. However, those methods were somewhat extreme, making the node stay up even if 1 pod was running or making sure the node never went down at all. What if we wanted to allow scaling but also wanted to maintain some control over how much this scaling was allowed to disrupt our workloads? This is where pod disruption budgets come into play. + +**Pod Disruption Budget (PDB)**: + +PDBs define the number of pods that can be disrupted during scaling activities. + +```yaml +apiVersion: policy/v1 +kind: PodDisruptionBudget +metadata: + name: myapp-pdb +spec: + minAvailable: 80% + selector: + matchLabels: + app: myapp +``` ### Node Autoscaling From 1999e0341cd23682d2078632629617d6c88c2bb3 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Fri, 21 Jun 2024 13:42:05 +0530 Subject: [PATCH 14/34] PDBs --- Autoscaler101/helpers.md | 58 +++++++++++++++++++++++++++++++++++++--- 1 file changed, 54 insertions(+), 4 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 0d2c529f..bc893915 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -87,20 +87,70 @@ Earlier, we discussed using annotations to prevent disruptions due to scaling. H **Pod Disruption Budget (PDB)**: -PDBs define the number of pods that can be disrupted during scaling activities. +A Pod Disruption Budget (PDB) is a Kubernetes resource that ensures a minimum number of pods are always available during voluntary disruptions, such as maintenance or cluster upgrades. It prevents too many pods of a critical application from being taken down simultaneously, thus maintaining the application's availability and reliability. + +### Key Components of a PDB + +1. **Min Available**: Specifies the minimum number of pods that must be available after an eviction. +2. 
**Max Unavailable**: Specifies the maximum number of pods that can be unavailable during a disruption. + +### Example Scenario + +Let's say you have a Kubernetes Deployment with 5 replicas of a critical web service. You want to ensure that at least 3 replicas are always available during maintenance activities. + +#### PDB Configuration + +You can create a PDB with the following YAML configuration: ```yaml apiVersion: policy/v1 kind: PodDisruptionBudget metadata: - name: myapp-pdb + name: web-service-pdb spec: - minAvailable: 80% + minAvailable: 3 selector: matchLabels: - app: myapp + app: web-service ``` +### Steps Explained + +1. **Define the API Version and Kind**: + - `apiVersion: policy/v1`: Specifies the API version. + - `kind: PodDisruptionBudget`: Indicates that this resource is a PDB. + +2. **Metadata**: + - `name: web-service-pdb`: The name of the PDB. + +3. **Spec**: + - `minAvailable: 3`: Specifies that at least 3 pods must be available at all times. + - `selector`: Defines the set of pods the PDB applies to. In this case, it matches pods with the label `app: web-service`. + +### How it Works + +1. **Normal Operation**: + - Under normal conditions, all 5 replicas of the web service are running. + +2. **During Disruption**: + - When a voluntary disruption occurs (e.g., node maintenance or a manual pod eviction), the PDB ensures that at least 3 out of the 5 pods remain running. + - If an attempt is made to evict more than 2 pods at the same time, the eviction will be blocked until the number of available pods is at least 3. + +### Example in Action + +Imagine a scenario where a node running 2 of the 5 replicas of the web service is scheduled for maintenance: + +- **Before Maintenance**: All 5 pods are running. +- **Eviction Begins**: The node is cordoned, and the 2 pods on it are scheduled for eviction. +- **PDB Check**: Kubernetes checks the PDB, which requires at least 3 pods to be available. +- **Allowed Eviction**: Since evicting 2 pods will leave 3 pods running, the eviction is allowed. +- **Maintenance Completed**: The node is maintained, and the evicted pods are rescheduled on available nodes. +- **After Maintenance**: All 5 pods are running again, meeting the PDB requirement. + +### Summary + +Pod Disruption Budgets are crucial for maintaining high availability of applications during planned maintenance or other voluntary disruptions. By setting appropriate values for `minAvailable` or `maxUnavailable`, you can ensure that critical services remain operational and meet your desired availability targets. + ### Node Autoscaling 2. **Node Affinity**: From ea9cefa9c7f8f5532e92202ca1b590129d141973 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Sat, 22 Jun 2024 13:05:09 +0530 Subject: [PATCH 15/34] autoscaler helpers --- Autoscaler101/helpers.md | 34 +++++++--------------------------- 1 file changed, 7 insertions(+), 27 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index bc893915..920f2b20 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -89,18 +89,7 @@ Earlier, we discussed using annotations to prevent disruptions due to scaling. H A Pod Disruption Budget (PDB) is a Kubernetes resource that ensures a minimum number of pods are always available during voluntary disruptions, such as maintenance or cluster upgrades. It prevents too many pods of a critical application from being taken down simultaneously, thus maintaining the application's availability and reliability. -### Key Components of a PDB - -1. 
**Min Available**: Specifies the minimum number of pods that must be available after an eviction. -2. **Max Unavailable**: Specifies the maximum number of pods that can be unavailable during a disruption. - -### Example Scenario - -Let's say you have a Kubernetes Deployment with 5 replicas of a critical web service. You want to ensure that at least 3 replicas are always available during maintenance activities. - -#### PDB Configuration - -You can create a PDB with the following YAML configuration: +Let's say you have a Kubernetes Deployment with 5 replicas of a critical web service. You want to ensure that at least 3 replicas are always available during maintenance activities. You can create a PDB with the following YAML configuration: ```yaml apiVersion: policy/v1 @@ -114,18 +103,13 @@ spec: app: web-service ``` -### Steps Explained +This is not very different to how other Kubernetes resources work where the external resource applies its configuration by selecting the deployment with a label. The components of the above PodDisruptionBudget are as follows: -1. **Define the API Version and Kind**: - - `apiVersion: policy/v1`: Specifies the API version. - - `kind: PodDisruptionBudget`: Indicates that this resource is a PDB. - -2. **Metadata**: - - `name: web-service-pdb`: The name of the PDB. - -3. **Spec**: - - `minAvailable: 3`: Specifies that at least 3 pods must be available at all times. - - `selector`: Defines the set of pods the PDB applies to. In this case, it matches pods with the label `app: web-service`. +- `apiVersion: policy/v1`: Specifies the API version. +- `kind: PodDisruptionBudget`: Indicates that this resource is a PDB. +- `name: web-service-pdb`: The name of the PDB. +- `minAvailable: 3`: Specifies that at least 3 pods must be available at all times. +- `selector`: Defines the set of pods the PDB applies to. In this case, it matches pods with the label `app: web-service`. ### How it Works @@ -147,10 +131,6 @@ Imagine a scenario where a node running 2 of the 5 replicas of the web service i - **Maintenance Completed**: The node is maintained, and the evicted pods are rescheduled on available nodes. - **After Maintenance**: All 5 pods are running again, meeting the PDB requirement. -### Summary - -Pod Disruption Budgets are crucial for maintaining high availability of applications during planned maintenance or other voluntary disruptions. By setting appropriate values for `minAvailable` or `maxUnavailable`, you can ensure that critical services remain operational and meet your desired availability targets. - ### Node Autoscaling 2. **Node Affinity**: From dbecc5a3548f907f1be5947e6da64ec5225e0360 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Mon, 24 Jun 2024 17:53:52 +0530 Subject: [PATCH 16/34] Autoscaler helpers --- Autoscaler101/helpers.md | 41 ++++++++++++++++------------------------ 1 file changed, 16 insertions(+), 25 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 920f2b20..c938a731 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -120,34 +120,25 @@ This is not very different to how other Kubernetes resources work where the exte - When a voluntary disruption occurs (e.g., node maintenance or a manual pod eviction), the PDB ensures that at least 3 out of the 5 pods remain running. - If an attempt is made to evict more than 2 pods at the same time, the eviction will be blocked until the number of available pods is at least 3. 
-### Example in Action +Now that we're clear on disruption budgets, let's look at node affinities. -Imagine a scenario where a node running 2 of the 5 replicas of the web service is scheduled for maintenance: +**Node Affinity**: -- **Before Maintenance**: All 5 pods are running. -- **Eviction Begins**: The node is cordoned, and the 2 pods on it are scheduled for eviction. -- **PDB Check**: Kubernetes checks the PDB, which requires at least 3 pods to be available. -- **Allowed Eviction**: Since evicting 2 pods will leave 3 pods running, the eviction is allowed. -- **Maintenance Completed**: The node is maintained, and the evicted pods are rescheduled on available nodes. -- **After Maintenance**: All 5 pods are running again, meeting the PDB requirement. +Node affinity rules influence where pods are scheduled, indirectly affecting autoscaling decisions. -### Node Autoscaling - -2. **Node Affinity**: - - Define node affinity rules to influence where pods are scheduled, which indirectly affects autoscaling decisions. - ```yaml - spec: - affinity: - nodeAffinity: - requiredDuringSchedulingIgnoredDuringExecution: - nodeSelectorTerms: - - matchExpressions: - - key: kubernetes.io/e2e-az-name - operator: In - values: - - e2e-az1 - - e2e-az2 - ``` +```yaml +spec: + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: kubernetes.io/e2e-az-name + operator: In + values: + - e2e-az1 + - e2e-az2 +``` 3. **Karpenter Specific Annotations**: - For users of Karpenter, specific annotations can control aspects of autoscaling behavior. From 65b20b12cfe55d7f36c420630f29b352c724bc8f Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Tue, 25 Jun 2024 17:41:54 +0530 Subject: [PATCH 17/34] Autoscaler helpers --- Autoscaler101/helpers.md | 13 +------------ 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index c938a731..597c1cdd 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -138,15 +138,4 @@ spec: values: - e2e-az1 - e2e-az2 -``` - -3. **Karpenter Specific Annotations**: - - For users of Karpenter, specific annotations can control aspects of autoscaling behavior. - ```yaml - metadata: - annotations: - karpenter.sh/capacity-type: "spot" - karpenter.sh/instance-profile: "my-instance-profile" - ``` - -These annotations and configurations can significantly impact the autoscaling behavior of your Kubernetes cluster, allowing for more fine-grained control over resource allocation and scaling policies. \ No newline at end of file +``` \ No newline at end of file From 331bba50f6dc561ce53700fba8ddfb6ee704210c Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Wed, 26 Jun 2024 18:00:54 +0530 Subject: [PATCH 18/34] Autoscaler helpers --- Autoscaler101/helpers.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 597c1cdd..af575ae7 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -138,4 +138,6 @@ spec: values: - e2e-az1 - e2e-az2 -``` \ No newline at end of file +``` + +Ensuring that nodes & pods are started in different zones will ensure high availability when a zone goes down. This also brings us to an important point if you run a large application with many microservices. Each replica of each microservice requires its own IP, and in a normal subnet, you only have 250 of them. 
Considering that each node you bring up has several daemonsets running on them that reserve their own IPs, coupled with each microservice replica needing its own IP, you might quickly find yourself in a position where you have run out of IPs and the pod is unable to start because the CNI doesn't have any IPs left to assign. In this case, having several subnets spread evenly across several availability zones is the answer. But even then, it is possible that the cluster autoscaler (or Karpenter if you use that instead), will end up provisioning nodes in a subnet that is about to run out of IPs. So having zonal topology constraints at a pod level will ensure that the pods are spread out and demand that nodes be spread out across the subnets, thereby reducing the change of IP address exhaustion. \ No newline at end of file From 0af670ebb726b8a5b4aaa3fe363649bb87991d53 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Thu, 27 Jun 2024 17:45:00 +0530 Subject: [PATCH 19/34] termination log --- Autoscaler101/helpers.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index af575ae7..b661fb6c 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -140,4 +140,6 @@ spec: - e2e-az2 ``` -Ensuring that nodes & pods are started in different zones will ensure high availability when a zone goes down. This also brings us to an important point if you run a large application with many microservices. Each replica of each microservice requires its own IP, and in a normal subnet, you only have 250 of them. Considering that each node you bring up has several daemonsets running on them that reserve their own IPs, coupled with each microservice replica needing its own IP, you might quickly find yourself in a position where you have run out of IPs and the pod is unable to start because the CNI doesn't have any IPs left to assign. In this case, having several subnets spread evenly across several availability zones is the answer. But even then, it is possible that the cluster autoscaler (or Karpenter if you use that instead), will end up provisioning nodes in a subnet that is about to run out of IPs. So having zonal topology constraints at a pod level will ensure that the pods are spread out and demand that nodes be spread out across the subnets, thereby reducing the change of IP address exhaustion. \ No newline at end of file +Ensuring that nodes & pods are started in different zones will ensure high availability when a zone goes down. This also brings us to an important point if you run a large application with many microservices. Each replica of each microservice requires its own IP, and in a normal subnet, you only have 250 of them. Considering that each node you bring up has several daemonsets running on them that reserve their own IPs, coupled with each microservice replica needing its own IP, you might quickly find yourself in a position where you have run out of IPs and the pod is unable to start because the CNI doesn't have any IPs left to assign. In this case, having several subnets spread evenly across several availability zones is the answer. But even then, it is possible that the cluster autoscaler (or Karpenter if you use that instead), will end up provisioning nodes in a subnet that is about to run out of IPs. So having zonal topology constraints at a pod level will ensure that the pods are spread out and demand that nodes be spread across the subnets, thereby reducing the chance of IP address exhaustion. 
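
A zonal constraint like that is typically expressed as a topology spread constraint in the pod spec. A minimal sketch, assuming nodes carry the standard `topology.kubernetes.io/zone` label and the pods are labelled `app: myapp`:

```yaml
spec:
  topologySpreadConstraints:
  - maxSkew: 1                                   # zones may differ by at most one matching pod
    topologyKey: topology.kubernetes.io/zone     # spread across availability zones
    whenUnsatisfiable: DoNotSchedule             # refuse to schedule rather than pile into one zone
    labelSelector:
      matchLabels:
        app: myapp                               # assumed pod label
```

Pods that cannot satisfy the constraint stay pending, which in turn pushes the node autoscaler to bring up capacity in the under-used zones and their subnets rather than concentrating everything where IPs are about to run out.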
+ +This is the very start of looking into possible problems you could run into while scaling. Depending on how much you scale and what you scale with, you might run into all manner of unpredictable issues. If you were to take an application designed to run on a static machine, and then scale it as-is, the problems would become apparent to you. So if you are planning to scale your production workloads, make sure you have proper monitoring and logging in place. For more on this, take a look at how you can [run filebeat as a sidecar](../Logging101/filebeat-sidecar.md) so that even if you were to scale your applications to hundreds of replicas, you would still have perfect logging over each of them. This is pretty crucial because at one point, sifting through log files is no longer an option. There would be so many that you would probably have trouble finding one among the thousands. \ No newline at end of file From cbaf9f694b3a77c1716d9e5a8fa72bb7669b8344 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Sat, 29 Jun 2024 16:10:00 +0530 Subject: [PATCH 20/34] Autoscaler helpers --- Autoscaler101/helpers.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index b661fb6c..17b1beb1 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -142,4 +142,4 @@ spec: Ensuring that nodes & pods are started in different zones will ensure high availability when a zone goes down. This also brings us to an important point if you run a large application with many microservices. Each replica of each microservice requires its own IP, and in a normal subnet, you only have 250 of them. Considering that each node you bring up has several daemonsets running on them that reserve their own IPs, coupled with each microservice replica needing its own IP, you might quickly find yourself in a position where you have run out of IPs and the pod is unable to start because the CNI doesn't have any IPs left to assign. In this case, having several subnets spread evenly across several availability zones is the answer. But even then, it is possible that the cluster autoscaler (or Karpenter if you use that instead), will end up provisioning nodes in a subnet that is about to run out of IPs. So having zonal topology constraints at a pod level will ensure that the pods are spread out and demand that nodes be spread across the subnets, thereby reducing the chance of IP address exhaustion. -This is the very start of looking into possible problems you could run into while scaling. Depending on how much you scale and what you scale with, you might run into all manner of unpredictable issues. If you were to take an application designed to run on a static machine, and then scale it as-is, the problems would become apparent to you. So if you are planning to scale your production workloads, make sure you have proper monitoring and logging in place. For more on this, take a look at how you can [run filebeat as a sidecar](../Logging101/filebeat-sidecar.md) so that even if you were to scale your applications to hundreds of replicas, you would still have perfect logging over each of them. This is pretty crucial because at one point, sifting through log files is no longer an option. There would be so many that you would probably have trouble finding one among the thousands. \ No newline at end of file +This is the very start of looking into problems you could run into while scaling. 
Depending on how much you scale and what you scale with, you might run into unpredictable issues. If you were to take an application designed to run on a static machine, and then scale it as-is, the problems would become apparent. So if you are planning to scale your production workloads, make sure you have proper monitoring and logging in place. For more on this, take a look at how you can [run filebeat as a sidecar](../Logging101/filebeat-sidecar.md) so that even if you were to scale your applications to hundreds of replicas, you would still have perfect logging over each of them. This is crucial because at one point, sifting through log files is no longer an option. There would be so many that you would probably have trouble finding one among the thousands. Monitoring is also pretty important. Even with graceful shutdowns enabled, you might sometimes see requests getting dropped. When this happens, you need to know why the request was declined. Perhaps the node went out of memory, or the pod unexpectedly crashed. Maybe there was some third party that intervened with the graceful shutdown. So having something that can scrape the logs and Kubernetes events is pretty useful so can go back and see what failed. Of course, proper logging and monitoring is crucial in any production environment. However, when it comes to Kubernetes where nodes, pods, and other resources come up and go down regularly, you wouldn't have a trace of what happened to the resource when you wanted to debug something. \ No newline at end of file From a5b94efaadb9effb0f964c0c364f5d366f2bee00 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Sun, 30 Jun 2024 15:17:02 +0530 Subject: [PATCH 21/34] Autoscaling considerations --- Autoscaler101/helpers.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 17b1beb1..8e51dbde 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -142,4 +142,8 @@ spec: Ensuring that nodes & pods are started in different zones will ensure high availability when a zone goes down. This also brings us to an important point if you run a large application with many microservices. Each replica of each microservice requires its own IP, and in a normal subnet, you only have 250 of them. Considering that each node you bring up has several daemonsets running on them that reserve their own IPs, coupled with each microservice replica needing its own IP, you might quickly find yourself in a position where you have run out of IPs and the pod is unable to start because the CNI doesn't have any IPs left to assign. In this case, having several subnets spread evenly across several availability zones is the answer. But even then, it is possible that the cluster autoscaler (or Karpenter if you use that instead), will end up provisioning nodes in a subnet that is about to run out of IPs. So having zonal topology constraints at a pod level will ensure that the pods are spread out and demand that nodes be spread across the subnets, thereby reducing the chance of IP address exhaustion. -This is the very start of looking into problems you could run into while scaling. Depending on how much you scale and what you scale with, you might run into unpredictable issues. If you were to take an application designed to run on a static machine, and then scale it as-is, the problems would become apparent. So if you are planning to scale your production workloads, make sure you have proper monitoring and logging in place. 
For more on this, take a look at how you can [run filebeat as a sidecar](../Logging101/filebeat-sidecar.md) so that even if you were to scale your applications to hundreds of replicas, you would still have perfect logging over each of them. This is crucial because at one point, sifting through log files is no longer an option. There would be so many that you would probably have trouble finding one among the thousands. Monitoring is also pretty important. Even with graceful shutdowns enabled, you might sometimes see requests getting dropped. When this happens, you need to know why the request was declined. Perhaps the node went out of memory, or the pod unexpectedly crashed. Maybe there was some third party that intervened with the graceful shutdown. So having something that can scrape the logs and Kubernetes events is pretty useful so can go back and see what failed. Of course, proper logging and monitoring is crucial in any production environment. However, when it comes to Kubernetes where nodes, pods, and other resources come up and go down regularly, you wouldn't have a trace of what happened to the resource when you wanted to debug something. \ No newline at end of file +### Additional autoscaling considerations + +This is the very start of looking into problems you could run into while scaling. Depending on how much you scale and what you scale with, you might run into unpredictable issues. If you were to take an application designed to run on a static machine, and then scale it as-is, the problems would become apparent. So if you are planning to scale your production workloads, make sure you have proper monitoring and logging in place. For more on this, take a look at how you can [run filebeat as a sidecar](../Logging101/filebeat-sidecar.md) so that even if you were to scale your applications to hundreds of replicas, you would still have perfect logging over each of them. This is crucial because at one point, sifting through log files is no longer an option. There would be so many that you would probably have trouble finding one among the thousands. Monitoring is also pretty important. Even with graceful shutdowns enabled, you might sometimes see requests getting dropped. When this happens, you need to know why the request was declined. Perhaps the node went out of memory, or the pod unexpectedly crashed. Maybe there was some third party that intervened with the graceful shutdown. So having something that can scrape the logs and Kubernetes events is pretty useful so you can go back and see what failed. Of course, proper logging and monitoring is crucial in any production environment. However, when it comes to Kubernetes where nodes, pods, and other resources come up and go down regularly, you wouldn't have a trace of what happened to the resource when you wanted to debug something. + +You also have to consider that the tool used to perform scaling might run into issues. For example, if you were to use Karpenter for node scaling and all Karpenter pods went down, node scaling would no longer happen. This means that pods might come up and remain in a pending state since there aren't enough nodes to run the pods. To counter this, you need to run multiple replicas of Karpenter and properly set the node affinity rules so that there is never a situation where Karpenter goes down completely. Additionally, tools like KEDA which are used for pod autoscaling ensure low downtime by running reserve pods that can come up if the main pod goes down.
Other autoscaling tools like the cluster autoscaler and HPA/VPA are already built with resilience in mind, so you don't have to worry too much about it. However, if you were to something like scale your pods based on Prometheus metrics, then it is your responsibility to make sure that Prometheus is running with high availability perhaps with the help of Thanos. \ No newline at end of file From 310b1914fcfb099c6d4deff1eec03b3d495747e2 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Mon, 1 Jul 2024 17:41:38 +0530 Subject: [PATCH 22/34] Autoscaler helpers --- Autoscaler101/helpers.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 8e51dbde..c5546a06 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -146,4 +146,8 @@ Ensuring that nodes & pods are started in different zones will ensure high avail This is the very start of looking into problems you could run into while scaling. Depending on how much you scale and what you scale with, you might run into unpredictable issues. If you were to take an application designed to run on a static machine, and then scale it as-is, the problems would become apparent. So if you are planning to scale your production workloads, make sure you have proper monitoring and logging in place. For more on this, take a look at how you can [run filebeat as a sidecar](../Logging101/filebeat-sidecar.md) so that even if you were to scale your applications to hundreds of replicas, you would still have perfect logging over each of them. This is crucial because at one point, sifting through log files is no longer an option. There would be so many that you would probably have trouble finding one among the thousands. Monitoring is also pretty important. Even with graceful shutdowns enabled, you might sometimes see requests getting dropped. When this happens, you need to know why the request was declined. Perhaps the node went out of memory, or the pod unexpectedly crashed. Maybe there was some third party that intervened with the graceful shutdown. So having something that can scrape the logs and Kubernetes events is pretty useful so can go back and see what failed. Of course, proper logging and monitoring is crucial in any production environment. However, when it comes to Kubernetes where nodes, pods, and other resources come up and go down regularly, you wouldn't have a trace of what happened to the resource when you wanted to debug something. -You also have to consider that the tool used to perform scaling might run into issues. For example, if you were to use Karpenter for node scaling and all Karpenter pods went down, node scaling would no longer happen. This means that pods might come up and remain in a pending state since there aren't enough nodes to run the pods. To counter this, you need to run multiple replicas of Karpenter and properly set the node affinity rules so that there is never a situation where Karpenter goes down completely. Additionally, tools like KEDA which are used for pod autoscaling ensure low downtime by running reserve pods that can come up if the main pod goes down. Other autoscaling tools like the cluster autoscaler and HPA/VPA are already built with resilience in mind, so you don't have to worry too much about it. However, if you were to something like scale your pods based on Prometheus metrics, then it is your responsibility to make sure that Prometheus is running with high availability perhaps with the help of Thanos. 
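To make the Prometheus-based scaling scenario a little more concrete, the sketch below shows roughly what that setup looks like with a KEDA `ScaledObject` that scales a deployment on a Prometheus query. This is a minimal, hypothetical example: the target deployment name, the Prometheus address, the query, and the threshold are illustrative assumptions that you would replace with whatever your own monitoring stack exposes.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-prometheus-scaler   # hypothetical name
spec:
  scaleTargetRef:
    name: nginx-deployment        # the deployment KEDA should scale (assumed)
  minReplicaCount: 2
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        # assumed in-cluster Prometheus endpoint - replace with your own
        serverAddress: http://prometheus.monitoring.svc:9090
        # placeholder query: scale out when the request rate crosses the threshold
        query: sum(rate(http_requests_total{app="nginx"}[2m]))
        threshold: "100"
```

If the Prometheus instance behind a trigger like this goes down, scaling decisions stop being made, which is exactly why the point above about running Prometheus (and the scaling tooling itself) with high availability matters.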
\ No newline at end of file +You also have to consider that the tool used to perform scaling might run into issues. For example, if you were to use Karpenter for node scaling and all Karpenter pods went down, node scaling would no longer happen. This means that pods might come up and remain in a pending state since there aren't enough nodes to run the pods. To counter this, you need to run multiple replicas of Karpenter and properly set the node affinity rules so that there is never a situation where Karpenter goes down completely. Additionally, tools like KEDA which are used for pod autoscaling ensure low downtime by running reserve pods that can come up if the main pod goes down. Other autoscaling tools like the cluster autoscaler and HPA/VPA are already built with resilience in mind, so you don't have to worry too much about it. However, if you were to do something like scale your pods based on Prometheus metrics, then it is your responsibility to make sure that Prometheus is running with high availability, perhaps with the help of Thanos. + +## Lab + +Now, let's get started on the lab and take a practical look at all the things we discussed above. For this, it's best to use a cloud provider for your Kubernetes cluster as opposed to Minikube, since we need to have multiple nodes so we can take a look at node scaling. Even a multi-node cluster that you run on your local machine is fine. For this, we will be using the Nginx image as the application and come up with our own Nginx deployment yaml that incorporates most of the attributes discussed above. \ No newline at end of file From 7b1347aa77b3fbbdf00c5a75aaecd4310be42e02 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Thu, 4 Jul 2024 16:59:27 +0530 Subject: [PATCH 23/34] Autoscaler helpers --- Autoscaler101/helpers.md | 192 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 191 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index c5546a06..367ec72c 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -150,4 +150,194 @@ You also have to consider that the tool used to perform scaling might run into i ## Lab -Now, let's get started on the lab and take a practical look at all the things we discussed above. For this, it's best to use a cloud provider for your Kubernetes cluster as opposed to Minikube, since we need to have multiple nodes so we can take a look at node scaling. Even a multi-node cluster that you run on your local machine is fine. For this, we will be using the Nginx image as the application and come up with our own Nginx deployment yaml that incorporates most of the attributes discussed above. \ No newline at end of file +Now, let's get started on the lab and take a practical look at all the things we discussed above. For this, it's best to use a cloud provider for your Kubernetes cluster as opposed to Minikube, since we need to have multiple nodes so we can take a look at node scaling. Even a multi-node cluster that you run on your local machine is fine. For this, we will be using the Nginx image as the application and come up with our own Nginx deployment yaml that incorporates most of the attributes discussed above. + +Certainly! Here’s a detailed example of how to configure readiness, liveness, and startup probes for an NGINX deployment in Kubernetes.
+ +### Step 1: Create a Kubernetes Deployment Manifest + +Create a deployment manifest file named `nginx-deployment.yaml` with the following content: + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: nginx-deployment + labels: + app: nginx +spec: + replicas: 3 + selector: + matchLabels: + app: nginx + template: + metadata: + labels: + app: nginx + spec: + containers: + - name: nginx + image: nginx:latest + ports: + - containerPort: 80 + livenessProbe: + httpGet: + path: /healthz + port: 80 + initialDelaySeconds: 30 + periodSeconds: 10 + readinessProbe: + httpGet: + path: /readiness + port: 80 + initialDelaySeconds: 5 + periodSeconds: 5 + startupProbe: + httpGet: + path: /startup + port: 80 + initialDelaySeconds: 0 + periodSeconds: 10 + lifecycle: + postStart: + exec: + command: ["/bin/sh", "-c", "echo 'nginx started'"] + preStop: + exec: + command: ["/bin/sh", "-c", "nginx -s quit"] +``` + +### Step 2: Create a ConfigMap for Custom NGINX Configuration + +Create a ConfigMap file named `nginx-configmap.yaml` with the following content to define custom health check endpoints: + +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: nginx-config +data: + default.conf: | + server { + listen 80; + + location /healthz { + access_log off; + return 200 'OK'; + add_header Content-Type text/plain; + } + + location /readiness { + access_log off; + return 200 'OK'; + add_header Content-Type text/plain; + } + + location /startup { + access_log off; + return 200 'OK'; + add_header Content-Type text/plain; + } + + location / { + root /usr/share/nginx/html; + index index.html index.htm; + } + } +``` + +### Step 3: Apply the ConfigMap + +```sh +kubectl apply -f nginx-configmap.yaml +``` + +### Step 4: Create a Kubernetes ConfigMap Volume Mount in Deployment + +Update the `nginx-deployment.yaml` to mount the ConfigMap as a volume: + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: nginx-deployment + labels: + app: nginx +spec: + replicas: 3 + selector: + matchLabels: + app: nginx + template: + metadata: + labels: + app: nginx + spec: + containers: + - name: nginx + image: nginx:latest + ports: + - containerPort: 80 + volumeMounts: + - name: nginx-config-volume + mountPath: /etc/nginx/conf.d + subPath: default.conf + livenessProbe: + httpGet: + path: /healthz + port: 80 + initialDelaySeconds: 30 + periodSeconds: 10 + readinessProbe: + httpGet: + path: /readiness + port: 80 + initialDelaySeconds: 5 + periodSeconds: 5 + startupProbe: + httpGet: + path: /startup + port: 80 + initialDelaySeconds: 0 + periodSeconds: 10 + lifecycle: + postStart: + exec: + command: ["/bin/sh", "-c", "echo 'nginx started'"] + preStop: + exec: + command: ["/bin/sh", "-c", "nginx -s quit"] + volumes: + - name: nginx-config-volume + configMap: + name: nginx-config +``` + +### Step 5: Apply the Updated Deployment Manifest + +```sh +kubectl apply -f nginx-deployment.yaml +``` + +### Step 6: Verify the Deployment + +To check the status of your deployment and the probes, you can use the following commands: + +```sh +# Check the status of the deployment +kubectl get deployments + +# Check the status of the pods +kubectl get pods + +# Describe a pod to see probe details +kubectl describe pod +``` + +This setup ensures that: + +- **Liveness Probe:** Checks if the NGINX container is alive. If it fails, Kubernetes will restart the container. +- **Readiness Probe:** Checks if the NGINX container is ready to serve traffic. If it fails, the pod will be removed from the service endpoints. 
+- **Startup Probe:** Ensures that the NGINX container has started up properly before any liveness or readiness probes are executed. + +With this configuration, you should have a robust deployment of NGINX with proper health checks using readiness, liveness, and startup probes. \ No newline at end of file From a9d8ada621f1767743da282c1454acc759f9fb55 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Fri, 5 Jul 2024 14:32:43 +0530 Subject: [PATCH 24/34] Autoscaler helpers --- Autoscaler101/helpers.md | 107 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 106 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 367ec72c..73ccdd34 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -340,4 +340,109 @@ This setup ensures that: - **Readiness Probe:** Checks if the NGINX container is ready to serve traffic. If it fails, the pod will be removed from the service endpoints. - **Startup Probe:** Ensures that the NGINX container has started up properly before any liveness or readiness probes are executed. -With this configuration, you should have a robust deployment of NGINX with proper health checks using readiness, liveness, and startup probes. \ No newline at end of file +With this configuration, you should have a robust deployment of NGINX with proper health checks using readiness, liveness, and startup probes. + + +Implementing a graceful shutdown for your NGINX deployment involves ensuring that your application can handle termination signals properly, finish any ongoing requests, and clean up resources before the container is terminated. Here’s how you can achieve this in Kubernetes: + +### Step 1: Define a PreStop Hook in Your Deployment Manifest + +Update the deployment manifest (`nginx-deployment.yaml`) to include a `preStop` hook that will handle the graceful shutdown: + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: nginx-deployment + labels: + app: nginx +spec: + replicas: 3 + selector: + matchLabels: + app: nginx + template: + metadata: + labels: + app: nginx + spec: + terminationGracePeriodSeconds: 60 + containers: + - name: nginx + image: nginx:latest + ports: + - containerPort: 80 + volumeMounts: + - name: nginx-config-volume + mountPath: /etc/nginx/conf.d + subPath: default.conf + livenessProbe: + httpGet: + path: /healthz + port: 80 + initialDelaySeconds: 30 + periodSeconds: 10 + readinessProbe: + httpGet: + path: /readiness + port: 80 + initialDelaySeconds: 5 + periodSeconds: 5 + startupProbe: + httpGet: + path: /startup + port: 80 + initialDelaySeconds: 0 + periodSeconds: 10 + lifecycle: + postStart: + exec: + command: ["/bin/sh", "-c", "echo 'nginx started'"] + preStop: + exec: + command: ["/bin/sh", "-c", "nginx -s quit && sleep 30"] + volumes: + - name: nginx-config-volume + configMap: + name: nginx-config +``` + +### Step 2: Apply the Updated Deployment Manifest + +Apply the updated deployment manifest to your Kubernetes cluster: + +```sh +kubectl apply -f nginx-deployment.yaml +``` + +### Explanation of Configuration + +1. **terminationGracePeriodSeconds:** This sets the period (in seconds) that Kubernetes will wait after sending a SIGTERM signal to the container before forcefully terminating it with a SIGKILL signal. The default value is 30 seconds, but it can be adjusted based on your application's requirements. In this example, it is set to 60 seconds. + +2. **preStop Hook:** This lifecycle hook executes a command just before the container is terminated. 
In this case, the command is `nginx -s quit && sleep 30`. + - `nginx -s quit`: This command gracefully stops the NGINX process, allowing it to finish serving ongoing requests. + - `sleep 30`: This ensures that the container waits for 30 seconds before fully shutting down. This additional sleep period provides extra time for ongoing requests to complete and for NGINX to shut down cleanly. + +### Step 3: Verify the Graceful Shutdown + +To verify that the graceful shutdown is working correctly, you can simulate a termination of one of the NGINX pods and observe the logs and behavior: + +1. **Delete an NGINX Pod:** + + ```sh + kubectl delete pod + ``` + +2. **Observe Logs and Behavior:** + + Use the following command to check the logs of the NGINX pod: + + ```sh + kubectl logs --previous + ``` + + Look for logs indicating that NGINX received the shutdown signal and that it stopped gracefully. + +### Summary + +By implementing the `preStop` hook and setting an appropriate `terminationGracePeriodSeconds`, you ensure that NGINX can handle ongoing requests and cleanly shut down before the container is terminated. This approach provides a smooth user experience and avoids abrupt disconnections or data loss during shutdowns. \ No newline at end of file From 7aeb191872461deb018f9e07bc9f67b664673cb7 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Mon, 8 Jul 2024 17:23:56 +0530 Subject: [PATCH 25/34] Autoscaler helpers --- Autoscaler101/helpers.md | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 73ccdd34..5e0d2f4d 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -152,8 +152,6 @@ You also have to consider that the tool used to perform scaling might run into i Now, let's get started on the lab and take a practical look at all the things we discussed above. For this, it's best to use a cloud provider for your Kubernetes cluster as opposed to Minikube, since we need to have multiple nodes so we can take a look at node scaling. Even a multi-node cluster that you run on your local machine is fine. For this, we will be using the Nginx image as the application and come up with our own Nginx deployment yaml that incorporates most of the attributes discussed above. -Certainly! Here’s a detailed example of how to configure readiness, liveness, and startup probes for an NGINX deployment in Kubernetes. - ### Step 1: Create a Kubernetes Deployment Manifest Create a deployment manifest file named `nginx-deployment.yaml` with the following content: @@ -207,6 +205,8 @@ spec: command: ["/bin/sh", "-c", "nginx -s quit"] ``` +Here we see all the different types of probes we previously discussed. We first have a `livenessProbe` that checks whether our pod is alive every 10 seconds, a `readinessProbe` that checks if the pod is ready to serve traffic every 5 seconds, and a startup probe that checks to see if the pod has started up properly. This also has a `postStart` and a `preStop` hook. The `preStart` hook just echos out a line while the `preStop` hook runs `nginx -s quit` which finishes service open connections before shutting down (graceful shutdown). + ### Step 2: Create a ConfigMap for Custom NGINX Configuration Create a ConfigMap file named `nginx-configmap.yaml` with the following content to define custom health check endpoints: @@ -334,15 +334,8 @@ kubectl get pods kubectl describe pod ``` -This setup ensures that: - -- **Liveness Probe:** Checks if the NGINX container is alive. 
If it fails, Kubernetes will restart the container. -- **Readiness Probe:** Checks if the NGINX container is ready to serve traffic. If it fails, the pod will be removed from the service endpoints. -- **Startup Probe:** Ensures that the NGINX container has started up properly before any liveness or readiness probes are executed. - With this configuration, you should have a robust deployment of NGINX with proper health checks using readiness, liveness, and startup probes. - Implementing a graceful shutdown for your NGINX deployment involves ensuring that your application can handle termination signals properly, finish any ongoing requests, and clean up resources before the container is terminated. Here’s how you can achieve this in Kubernetes: ### Step 1: Define a PreStop Hook in Your Deployment Manifest From c42e392c3839d0f8c6000072d710b1f5f4a4a5fc Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Tue, 9 Jul 2024 17:27:44 +0530 Subject: [PATCH 26/34] Autoscaler helpers --- Autoscaler101/helpers.md | 20 +++----------------- 1 file changed, 3 insertions(+), 17 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 5e0d2f4d..58644211 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -196,16 +196,9 @@ spec: port: 80 initialDelaySeconds: 0 periodSeconds: 10 - lifecycle: - postStart: - exec: - command: ["/bin/sh", "-c", "echo 'nginx started'"] - preStop: - exec: - command: ["/bin/sh", "-c", "nginx -s quit"] ``` -Here we see all the different types of probes we previously discussed. We first have a `livenessProbe` that checks whether our pod is alive every 10 seconds, a `readinessProbe` that checks if the pod is ready to serve traffic every 5 seconds, and a startup probe that checks to see if the pod has started up properly. This also has a `postStart` and a `preStop` hook. The `preStart` hook just echos out a line while the `preStop` hook runs `nginx -s quit` which finishes service open connections before shutting down (graceful shutdown). +Here we see all the different types of probes we previously discussed. We first have a `livenessProbe` that checks whether our pod is alive every 10 seconds, a `readinessProbe` that checks if the pod is ready to serve traffic every 5 seconds, and a startup probe that checks to see if the pod has started up properly.Now that we have defined the various probes, we need to define the actual endpoints. Otherwise, the probes will ping these missing endpoints and our application will never start. ### Step 2: Create a ConfigMap for Custom NGINX Configuration @@ -300,13 +293,6 @@ spec: port: 80 initialDelaySeconds: 0 periodSeconds: 10 - lifecycle: - postStart: - exec: - command: ["/bin/sh", "-c", "echo 'nginx started'"] - preStop: - exec: - command: ["/bin/sh", "-c", "nginx -s quit"] volumes: - name: nginx-config-volume configMap: @@ -336,7 +322,7 @@ kubectl describe pod With this configuration, you should have a robust deployment of NGINX with proper health checks using readiness, liveness, and startup probes. -Implementing a graceful shutdown for your NGINX deployment involves ensuring that your application can handle termination signals properly, finish any ongoing requests, and clean up resources before the container is terminated. Here’s how you can achieve this in Kubernetes: +Now, let's move on to graceful shutdowns. 
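Before looking at the preStop-based approach below, it helps to keep the termination sequence in mind: when a pod is deleted, Kubernetes first runs the container's preStop hook (if one is defined), then sends SIGTERM to the container, waits up to `terminationGracePeriodSeconds`, and only then sends SIGKILL. Purely as a rough, hypothetical sketch (it is not part of the lab manifests), an application can also handle this itself by trapping SIGTERM in its entrypoint script:

```sh
#!/bin/sh
# Hypothetical entrypoint sketch: ask nginx to drain connections when
# Kubernetes sends SIGTERM, instead of relying solely on a preStop hook.
trap 'echo "SIGTERM received, draining"; nginx -s quit' TERM

nginx -g 'daemon off;' &   # run nginx as a background job so the trap can fire
pid=$!
wait "$pid"                # interrupted when SIGTERM arrives and the trap runs
wait "$pid"                # wait again for nginx to finish draining and exit
```

Either way, the goal is the same: open connections get a chance to finish before the process exits. With that in mind, let's wire the graceful shutdown into the deployment itself.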
### Step 1: Define a PreStop Hook in Your Deployment Manifest @@ -408,7 +394,7 @@ Apply the updated deployment manifest to your Kubernetes cluster: kubectl apply -f nginx-deployment.yaml ``` -### Explanation of Configuration +This introduces two new attributes: 1. **terminationGracePeriodSeconds:** This sets the period (in seconds) that Kubernetes will wait after sending a SIGTERM signal to the container before forcefully terminating it with a SIGKILL signal. The default value is 30 seconds, but it can be adjusted based on your application's requirements. In this example, it is set to 60 seconds. From e4e4c78445e957e7f509a9beb8d54e6aff3b3276 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Wed, 10 Jul 2024 17:37:37 +0530 Subject: [PATCH 27/34] Autoscaler helpers --- Autoscaler101/helpers.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 58644211..d8c5fb20 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -420,8 +420,6 @@ To verify that the graceful shutdown is working correctly, you can simulate a te kubectl logs --previous ``` - Look for logs indicating that NGINX received the shutdown signal and that it stopped gracefully. + When observing the logs from the nginx pods, you should be able to see that a graceful shutdown is being performed. -### Summary - -By implementing the `preStop` hook and setting an appropriate `terminationGracePeriodSeconds`, you ensure that NGINX can handle ongoing requests and cleanly shut down before the container is terminated. This approach provides a smooth user experience and avoids abrupt disconnections or data loss during shutdowns. \ No newline at end of file +Now that we've covered graceful shutdowns, let's take a look at a few annotations in action. \ No newline at end of file From 9b9dae04ecdb5dcc0dfa26f6d3cdc6e7af634cb7 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Thu, 11 Jul 2024 17:05:12 +0530 Subject: [PATCH 28/34] Autoscaler helpers --- Autoscaler101/helpers.md | 94 +++++++++++++++++++++++++++++++++++++++- 1 file changed, 93 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index d8c5fb20..eebf2856 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -422,4 +422,96 @@ To verify that the graceful shutdown is working correctly, you can simulate a te When observing the logs from the nginx pods, you should be able to see that a graceful shutdown is being performed. -Now that we've covered graceful shutdowns, let's take a look at a few annotations in action. \ No newline at end of file +With that, we cover graceful shutdowns. We will be skipping annotations for this lab since the annotations themselves are uncomplicated and applying them to your deployments or nodes is very straightforward. So let's jump right ahead to pod priority. + +To introduce Pod Priority to your Deployment, you need to define a `PriorityClass` and then reference it in your Deployment's Pod spec. Pod Priority is used to influence the scheduling and eviction policies for Pods. Here’s how you can add a `PriorityClass` and incorporate it into your existing Deployment: + +### Step 1: Define a PriorityClass + +First, you need to create a `PriorityClass` resource. This resource defines a priority value and an optional description. 
+ +```yaml +apiVersion: scheduling.k8s.io/v1 +kind: PriorityClass +metadata: + name: high-priority +value: 100000 +globalDefault: false +description: "This priority class is used for high priority pods." +``` + +### Step 2: Reference the PriorityClass in the Deployment + +Next, update your Deployment to use the newly defined `PriorityClass`. + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: nginx-deployment + labels: + app: nginx +spec: + replicas: 3 + selector: + matchLabels: + app: nginx + template: + metadata: + labels: + app: nginx + spec: + priorityClassName: high-priority + terminationGracePeriodSeconds: 60 + containers: + - name: nginx + image: nginx:latest + ports: + - containerPort: 80 + volumeMounts: + - name: nginx-config-volume + mountPath: /etc/nginx/conf.d + subPath: default.conf + livenessProbe: + httpGet: + path: /healthz + port: 80 + initialDelaySeconds: 30 + periodSeconds: 10 + readinessProbe: + httpGet: + path: /readiness + port: 80 + initialDelaySeconds: 5 + periodSeconds: 5 + startupProbe: + httpGet: + path: /startup + port: 80 + initialDelaySeconds: 0 + periodSeconds: 10 + lifecycle: + postStart: + exec: + command: ["/bin/sh", "-c", "echo 'nginx started'"] + preStop: + exec: + command: ["/bin/sh", "-c", "nginx -s quit && sleep 30"] + volumes: + - name: nginx-config-volume + configMap: + name: nginx-config +``` + +### Explanation + +1. **PriorityClass Resource**: + - `name`: The name of the priority class (`high-priority` in this example). + - `value`: The priority value assigned to this class. Higher values indicate higher priority. + - `globalDefault`: Indicates if this should be the default priority class for Pods that do not specify any priority class. + - `description`: A human-readable description of the priority class. + +2. **Deployment Update**: + - `priorityClassName`: Added to the Pod spec to assign the priority class to the Pods created by this Deployment. + +By adding the `PriorityClass` and referencing it in your Deployment, you ensure that the Pods in this Deployment are given a higher priority during scheduling and eviction processes compared to other Pods with lower priority or no specified priority class. \ No newline at end of file From bb92c76c1d60e6849fbf786ddcd3406e22ed721c Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Fri, 12 Jul 2024 17:41:54 +0530 Subject: [PATCH 29/34] Autoscaler helpers --- Autoscaler101/helpers.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index eebf2856..6b986778 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -424,7 +424,7 @@ To verify that the graceful shutdown is working correctly, you can simulate a te With that, we cover graceful shutdowns. We will be skipping annotations for this lab since the annotations themselves are uncomplicated and applying them to your deployments or nodes is very straightforward. So let's jump right ahead to pod priority. -To introduce Pod Priority to your Deployment, you need to define a `PriorityClass` and then reference it in your Deployment's Pod spec. Pod Priority is used to influence the scheduling and eviction policies for Pods. Here’s how you can add a `PriorityClass` and incorporate it into your existing Deployment: +To introduce Pod Priority to our Nginx Deployment, we need to define a `PriorityClass` and then reference it in our Deployment's Pod spec. 
### Step 1: Define a PriorityClass @@ -442,7 +442,7 @@ description: "This priority class is used for high priority pods." ### Step 2: Reference the PriorityClass in the Deployment -Next, update your Deployment to use the newly defined `PriorityClass`. +Next, update the Deployment to use the newly defined `PriorityClass`. ```yaml apiVersion: apps/v1 kind: Deployment @@ -514,4 +514,4 @@ spec: 2. **Deployment Update**: - `priorityClassName`: Added to the Pod spec to assign the priority class to the Pods created by this Deployment. -By adding the `PriorityClass` and referencing it in your Deployment, you ensure that the Pods in this Deployment are given a higher priority during scheduling and eviction processes compared to other Pods with lower priority or no specified priority class. \ No newline at end of file +By adding the `PriorityClass` and referencing it in your Deployment, you ensure that the Pods in this Deployment are given a higher priority during scheduling and eviction processes compared to other Pods with lower priority or no specified priority class. Lower priority in this case would be a priority less that 10000 (which is what we have defined as high). \ No newline at end of file From 910db2f9c9c64d17fbcdd473a96bdb6445d1dd04 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Mon, 15 Jul 2024 17:26:48 +0530 Subject: [PATCH 30/34] Autoscaler helpers --- Autoscaler101/helpers.md | 96 +++++++++++++++++++++++++++++++++++++++- 1 file changed, 95 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 6b986778..dd452f78 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -514,4 +514,98 @@ spec: 2. **Deployment Update**: - `priorityClassName`: Added to the Pod spec to assign the priority class to the Pods created by this Deployment. -By adding the `PriorityClass` and referencing it in your Deployment, you ensure that the Pods in this Deployment are given a higher priority during scheduling and eviction processes compared to other Pods with lower priority or no specified priority class. Lower priority in this case would be a priority less that 10000 (which is what we have defined as high). +By adding the `PriorityClass` and referencing it in your Deployment, you ensure that the Pods in this Deployment are given a higher priority during scheduling and eviction processes compared to other Pods with lower priority or no specified priority class. Lower priority in this case would be a priority value less than 100000 (which is what we have defined as high). + +Finally, let's get to PodDisruptionBudgets. To add a Pod Disruption Budget (PDB) to your deployment, you need to create a PDB resource. A PDB ensures that a certain number of pods in a deployment are available even during voluntary disruptions (such as draining a node for maintenance).
Below is your updated configuration with a PDB added: + +### Deployment YAML + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: nginx-deployment + labels: + app: nginx +spec: + replicas: 3 + selector: + matchLabels: + app: nginx + template: + metadata: + labels: + app: nginx + spec: + priorityClassName: high-priority + terminationGracePeriodSeconds: 60 + containers: + - name: nginx + image: nginx:latest + ports: + - containerPort: 80 + volumeMounts: + - name: nginx-config-volume + mountPath: /etc/nginx/conf.d + subPath: default.conf + livenessProbe: + httpGet: + path: /healthz + port: 80 + initialDelaySeconds: 30 + periodSeconds: 10 + readinessProbe: + httpGet: + path: /readiness + port: 80 + initialDelaySeconds: 5 + periodSeconds: 5 + startupProbe: + httpGet: + path: /startup + port: 80 + initialDelaySeconds: 0 + periodSeconds: 10 + lifecycle: + postStart: + exec: + command: ["/bin/sh", "-c", "echo 'nginx started'"] + preStop: + exec: + command: ["/bin/sh", "-c", "nginx -s quit && sleep 30"] + volumes: + - name: nginx-config-volume + configMap: + name: nginx-config +``` + +### Pod Disruption Budget YAML + +```yaml +apiVersion: policy/v1 +kind: PodDisruptionBudget +metadata: + name: nginx-pdb + labels: + app: nginx +spec: + minAvailable: 2 + selector: + matchLabels: + app: nginx +``` + +### Explanation: +- **minAvailable: 2**: This specifies that at least 2 pods must be available at all times. +- **selector**: Ensures that the PDB applies to the pods matching the specified labels (`app: nginx`). + +### Applying the Configuration: +1. Save the deployment YAML to a file, e.g., `nginx-deployment.yaml`. +2. Save the PDB YAML to another file, e.g., `nginx-pdb.yaml`. +3. Apply both configurations using `kubectl`: + ```sh + kubectl apply -f nginx-deployment.yaml + kubectl apply -f nginx-pdb.yaml + ``` + +This ensures your deployment has a disruption budget to maintain availability during node maintenance or other voluntary disruptions. \ No newline at end of file From be091dfeb634564f78cac264383418e3c0b13bce Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Tue, 16 Jul 2024 18:01:53 +0530 Subject: [PATCH 31/34] Autoscaler helpers --- Autoscaler101/helpers.md | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index dd452f78..6e0e858b 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -516,7 +516,7 @@ spec: By adding the `PriorityClass` and referencing it in your Deployment, you ensure that the Pods in this Deployment are given a higher priority during scheduling and eviction processes compared to other Pods with lower priority or no specified priority class. Lower priority in this case would be a priority less that 10000 (which is what we have defined as high). -Finally, let's get to PodDisruptionBudgets. To add a Pod Disruption Budget (PDB) to your deployment, you need to create a PDB resource. A PDB ensures that a certain number of pods in a deployment are available even during voluntary disruptions (such as draining a node for maintenance). Below is your updated configuration with a PDB added: +Finally, let's get to PodDisruptionBudgets. To add a Pod Disruption Budget (PDB) to your deployment, you need to create a PDB resource. A PDB ensures that a certain number of pods in a deployment are available even during voluntary disruptions (such as draining a node for maintenance). Below is our updated configuration with a PDB added. 
The deployment file remains essentially the same. The disruption budget comes from a new manifest of kind `PodDisruptionBudget`: ### Deployment YAML @@ -599,13 +599,10 @@ spec: - **minAvailable: 2**: This specifies that at least 2 pods must be available at all times. - **selector**: Ensures that the PDB applies to the pods matching the specified labels (`app: nginx`). -### Applying the Configuration: -1. Save the deployment YAML to a file, e.g., `nginx-deployment.yaml`. -2. Save the PDB YAML to another file, e.g., `nginx-pdb.yaml`. -3. Apply both configurations using `kubectl`: - ```sh - kubectl apply -f nginx-deployment.yaml - kubectl apply -f nginx-pdb.yaml - ``` +Apply the configuration: -This ensures your deployment has a disruption budget to maintain availability during node maintenance or other voluntary disruptions. \ No newline at end of file +```sh +kubectl apply -f nginx-pdb.yaml +``` + +This should ensure that the nginx pod has a proper disruption budget at all times. \ No newline at end of file From 7a3611d441f85874de4ae4737392b7e6d0c424b9 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Thu, 18 Jul 2024 09:42:54 +0530 Subject: [PATCH 32/34] Autoscaler helpers --- Autoscaler101/helpers.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 6e0e858b..7d819f22 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -605,4 +605,4 @@ Apply the configuration: kubectl apply -f nginx-pdb.yaml ``` -This should ensure that the nginx pod has a proper disruption budget at all times. \ No newline at end of file +Now your number of pods won't go below the minimum available pod count meaning that if pods are evicted due to autoscaling, if a new version is deployed, or if your pods are supposed to restart for any reason, at least 2 pods will always be up. This, however, will not consider a case where your pod or node goes out of memory, becomes unreachable, or unschedulable. If your node doesn't have enough resources to give, even a PDB insisting that the pod needs to stay up doesn't work. The same applies if the node suddenly were to get removed. To minimize the change of this happening, you will properly have to set pod requests and limits so that the resource requirements of a pod never exceed what the node can provide. \ No newline at end of file From b81e7c09051349ecc2886fba193d636fb465b2b3 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Fri, 19 Jul 2024 18:04:36 +0530 Subject: [PATCH 33/34] Autoscaler helpers --- Autoscaler101/helpers.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 7d819f22..18e70eae 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -605,4 +605,8 @@ Apply the configuration: kubectl apply -f nginx-pdb.yaml ``` -Now your number of pods won't go below the minimum available pod count meaning that if pods are evicted due to autoscaling, if a new version is deployed, or if your pods are supposed to restart for any reason, at least 2 pods will always be up. This, however, will not consider a case where your pod or node goes out of memory, becomes unreachable, or unschedulable. If your node doesn't have enough resources to give, even a PDB insisting that the pod needs to stay up doesn't work. The same applies if the node suddenly were to get removed. 
To minimize the change of this happening, you will properly have to set pod requests and limits so that the resource requirements of a pod never exceed what the node can provide. \ No newline at end of file From b81e7c09051349ecc2886fba193d636fb465b2b3 Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Fri, 19 Jul 2024 18:04:36 +0530 Subject: [PATCH 33/34] Autoscaler helpers --- Autoscaler101/helpers.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/Autoscaler101/helpers.md b/Autoscaler101/helpers.md index 7d819f22..18e70eae 100644 --- a/Autoscaler101/helpers.md +++ b/Autoscaler101/helpers.md @@ -605,4 +605,8 @@ Apply the configuration: kubectl apply -f nginx-pdb.yaml ``` -Now your number of pods won't go below the minimum available pod count meaning that if pods are evicted due to autoscaling, if a new version is deployed, or if your pods are supposed to restart for any reason, at least 2 pods will always be up. This, however, will not consider a case where your pod or node goes out of memory, becomes unreachable, or unschedulable. If your node doesn't have enough resources to give, even a PDB insisting that the pod needs to stay up doesn't work. The same applies if the node suddenly were to get removed. To minimize the change of this happening, you will properly have to set pod requests and limits so that the resource requirements of a pod never exceed what the node can provide. +Now your number of pods won't go below the minimum available pod count, meaning that if pods are evicted due to autoscaling, if a new version is deployed, or if your pods are supposed to restart for any reason, at least 2 pods will always be up. This, however, will not consider a case where your pod or node goes out of memory, becomes unreachable, or unschedulable. If your node doesn't have enough resources to give, even a PDB insisting that the pod needs to stay up doesn't work. The same applies if the node suddenly were to get removed. To minimize the chance of this happening, you will have to properly set pod requests and limits so that the resource requirements of a pod never exceed what the node can provide. + +# Conclusion + +This brings us to the end of this section, where we discussed how to use various tools provided both natively and as add-ons to improve the stability of scaling, which is an essential aspect of running high availability production applications. There are a large number of other tools within the CNCF collective that help improve this stability, so don't hesitate to research the various options to get the best fit for your production workload. For more about scaling, don't forget to check out our [KEDA](../Keda101/what-is-keda.md) and [Karpenter](../Karpenter101/what-is-karpenter.md) sections. \ No newline at end of file From 9d8f03810382d4e7061118cee96460f6ea74af7d Mon Sep 17 00:00:00 2001 From: Phantom-Intruder Date: Mon, 22 Jul 2024 17:39:59 +0530 Subject: [PATCH 34/34] Autoscaler helpers --- Observability101/observability.txt | 0 README.md | 1 + 2 files changed, 1 insertion(+) create mode 100644 Observability101/observability.txt diff --git a/Observability101/observability.txt b/Observability101/observability.txt new file mode 100644 index 00000000..e69de29b diff --git a/README.md b/README.md index a0599401..dd1b0896 100644 --- a/README.md +++ b/README.md @@ -247,6 +247,7 @@ A Curated List of Kubernetes Labs and Tutorials - [What are autoscalers](./Autoscaler101/what-are-autoscalers.md) - [Autoscaler lab](./Autoscaler101/autoscaler-lab.md) + - [Autoscaler helpers](./Autoscaler101/helpers.md) ## Helm101