Kubernetes — Application High-Availability, Part 2 (More Basics)

Paul Dally
4 min read · Feb 7, 2022


Part 1 of this series can be found here.

Probes

In Part 1, we saw that Kubernetes has load-balancing via Services. However, unless you add specific configuration, a Pod is considered healthy and ready for traffic as long as the processes started in all of the containers of the Pod are running.

To override this default behavior, Kubernetes provides a number of different probes, which are discussed in greater detail in this article.

Probes can increase the availability of your application by detecting and restarting (livenessProbe) impaired containers, or removing them from rotation (readinessProbe).
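As a sketch of how the two probe types fit together (the endpoint paths, port, and timings here are hypothetical assumptions, not from any real application), a container spec might look like this:

```yaml
# Illustrative container spec; /healthz, /ready and port 8080 are assumptions
spec:
  containers:
  - name: hello-world
    image: hello-world:1.0
    livenessProbe:            # failure -> kubelet restarts the container
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 10
    readinessProbe:           # failure -> Pod is removed from Service endpoints
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5
```

Note that the two probes serve different purposes: a failed livenessProbe restarts the container, while a failed readinessProbe merely stops routing traffic to the Pod until it recovers.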

Simple Rollouts

High availability is also a concern when deploying application changes (application rollouts). With Deployments and StatefulSets, when you change the Pod spec (for example, to run an upgraded image or add an environment variable), Kubernetes will often (but not always) need to terminate a Pod and create a new one to replace it, or create the new Pod first and then terminate the old one. With Deployments, you can choose which behavior you prefer. A StatefulSet, on the other hand, uses ordered Pod creation and ordered Pod termination: Pods are created in ascending order of their ordinal and terminated in descending order.

Deployments

All Kubernetes Deployments have an update strategy, configured via .spec.strategy. As the name suggests, it controls the strategy that Kubernetes uses to roll out new Pods. If you do not explicitly specify one, Kubernetes will use the following default:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%

This default is often not appropriate, so PLEASE do not just blindly accept the default values.

For example, if replicas: 1, then 25% doesn’t make much sense. Also, when both options are configured, Kubernetes will surge first and terminate second. This can cause problems if a ResourceQuota has been defined in the Namespace: Kubernetes creates the new ReplicaSet, which in turn tries to create Pods; those creations fail against the quota and are retried, while Kubernetes is prevented from terminating old Pods to free up capacity in a timely fashion.

If you are running close to your ResourceQuota, you should consider setting maxSurge to 0%.
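For instance, a Deployment that must never exceed its steady-state Pod count during a rollout could specify something like the following (note that maxSurge and maxUnavailable cannot both be zero):

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0          # never create extra Pods beyond replicas
      maxUnavailable: 1    # terminate one old Pod before creating its replacement
```

The trade-off is that the rollout briefly runs with one fewer Pod, so make sure your remaining replicas can absorb the load.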

StatefulSet

StatefulSets allow you to configure .spec.updateStrategy.

If you specify “OnDelete”, then you are responsible for deleting Pods created by the StatefulSet. This puts full control of the rollout process into your hands, but requires you to manually delete the Pods or script the process, which isn’t optimal.

If you specify “RollingUpdate” (the default), then the StatefulSet will delete and recreate each Pod in the StatefulSet. It will proceed in the same order as Pod termination (from the largest ordinal to the smallest), updating each Pod one at a time. If replicas: 3, then Pod mystatefulset-2 would be terminated and recreated first, then mystatefulset-1 and then finally mystatefulset-0.

Partitioned rollouts are also possible with StatefulSets, but in my experience are not commonly required. Suffice it to say that this allows a partial rollout (i.e. any Pods that have an ordinal greater than or equal to the specified partition will be updated). For example, with replicas: 3, if you specify .spec.updateStrategy.rollingUpdate.partition: 2, then only mystatefulset-2 will be updated with Pod spec changes.
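A minimal sketch of that partitioned example (only the relevant fragment of the StatefulSet spec is shown):

```yaml
spec:
  replicas: 3
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2    # only Pods with ordinal >= 2 receive the new Pod spec
```

Lowering the partition value (eventually to 0) progressively rolls the change out to the remaining Pods.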

If you are using a partitioned rolling update, you may want to adjust the Pod template’s labels so that the updated Pods no longer match your Service’s selector, avoiding exposing these Pods to your users until you have been able to test. Not all StatefulSets and/or Pod spec changes will be suitable for partitioned rolling updates, so please use this approach with caution.

PodDisruptionBudget

A PodDisruptionBudget is an unfortunately underappreciated high-availability feature provided by Kubernetes.

Disruptions are any event that causes a Pod to disappear, whether that be voluntary (Pod deletion, deployments updating the Pod spec, draining a Node for repair, upgrade or (auto)scaling, etc.) or involuntary (hardware failure, Node crash, Pod eviction due to lack of Node resources, etc.).

A PodDisruptionBudget limits the number of Pods of a replicated application that are down simultaneously due to voluntary disruptions (most notably, this does not include direct Pod deletion, which bypasses the budget).

Here is an example of a PodDisruptionBudget:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: hello-world
  maxUnavailable: 1

There is some overlap with the rollout features of Deployments, but it is worth noting that PodDisruptionBudgets can apply across more than one Deployment or StatefulSet (and/or any other way that Pods are created), ensuring that only the specified number of Pods is unavailable at the same time across any Pods that match the provided selector. It should also be noted that the rollout features of Deployments only apply to… well, rollouts. PodDisruptionBudgets’ more expansive scope makes them a valuable tool in Kubernetes’ high-availability toolbelt.

NOTE: Please be careful when using PodDisruptionBudgets and pay attention to the number of replicas for your Pods. A PodDisruptionBudget can prevent node maintenance from occurring if your Pod configuration is such that the PodDisruptionBudget cannot be satisfied. For example, suppose that you have replicas: 1, and you specified minAvailable: 1. Kubernetes would not be able to terminate your Pod.
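To make that failure mode concrete, this combination (the names here are placeholders) would block every eviction of the matching Pod, and therefore any Node drain that needs to move it:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-blocking-pdb
spec:
  minAvailable: 1              # with replicas: 1, no Pod may ever be evicted
  selector:
    matchLabels:
      app.kubernetes.io/name: single-replica-app
```

With at least two replicas, the same budget would allow maintenance to proceed one Pod at a time.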

We’re Still Only Just Beginning

So far, we’ve still only scratched the surface of the features that Kubernetes provides for application high-availability. When I have some more time, I hope to create future parts — so stay tuned!


Written by Paul Dally

AVP, IT Foundation Platforms Architecture at Sun Life Financial. Views & opinions expressed are my own, not necessarily those of Sun Life
