Why Leaving Pods in CrashLoopBackOff Can Have a Bigger Impact Than You Might Think
Recently I’ve seen a number of Pods stuck in the CrashLoopBackOff state for extended periods of time, especially in pre-production environments. In one case, a developer deployed a number of invalid Deployments and immediately went on vacation. Operations teams detected this and ultimately resolved the issue, but many developers seem to believe that a consistently crashing container doesn’t matter, or that it can’t affect anything other than its own Pod. Unfortunately, that just isn’t the case!
How CrashLoopBackOff works
According to the Kubernetes documentation, if the restartPolicy is Always or OnFailure, every time a failed container in a Pod is restarted by the kubelet, it is restarted with an exponential back-off delay. This delay is capped at 5 minutes, and if the container runs successfully for 10 minutes, the delay is reset to its initial value.
The exponential back-off delay starts at 10s and doubles each time (10s, 20s, 40s, …) up to the 5 minute cap. This does reduce the number of restarts that a consistently crashing container would otherwise undergo, but an improperly configured or buggy container could still crash and restart nearly 300 times per day…
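To make the arithmetic concrete, here is a minimal Python sketch of the back-off schedule described above. It is a simplified model, not the kubelet's actual implementation: it assumes the container crashes instantly on every start, so only the back-off delays consume time and the 10 minute reset never triggers. The function names are my own.

```python
from itertools import islice


def backoff_delays(initial=10, cap=300):
    """Yield successive restart delays in seconds: 10, 20, 40, ... capped at `cap`."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, cap)


def restarts_in(window_seconds, initial=10, cap=300):
    """Count restarts that fit in the window, assuming an instant crash each time."""
    elapsed = 0
    restarts = 0
    for delay in backoff_delays(initial, cap):
        if elapsed + delay > window_seconds:
            return restarts
        elapsed += delay
        restarts += 1


first_delays = list(islice(backoff_delays(), 7))  # [10, 20, 40, 80, 160, 300, 300]
restarts_per_day = restarts_in(24 * 60 * 60)      # 291 under these assumptions
```

Under these assumptions, a single container restarts 291 times in its first 24 hours, and 288 times per day once the delay sits at the cap. A container that runs for a while before crashing will restart less often.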
High volumes of container restarts may eventually cause other new containers on the Node to not start
With some container runtimes and versions of the Linux kernel, each container restart can leak “cgroups”. Over time, especially if multiple containers are crashing and restarting, this can result in other containers on that worker Node being unable to start, failing with a cgroup memory allocation error. In this situation, you may see an error message like this when you describe your Pod:
shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:385: applying cgroup configuration for process caused: mkdir /sys/fs/cgroup/memory/kubepods/burstable/pod0f8fe7dd-87bf-4f13-9e17-b12c41c46418/0a55234bf4c2b1785bdaefb63227b606d71eb6025a44de34e16db87032f77950: cannot allocate memory: unknown
Rebooting the Node may resolve the issue temporarily, and upgrading your Node’s OS to get a newer kernel version may help. Setting the cgroup.memory=nokmem kernel parameter may also help, although this requires a reboot to take effect, isn’t really a complete solution, and can lead to other issues.
Repeatedly restarting containers may cause unnecessarily high Node resource usage
A container’s startup processing can be resource intensive. Often, it is the most resource intensive part of the container’s entire lifecycle. For example, consider the following container’s CPU usage:
In this example, the CPU usage at container start is orders of magnitude higher than afterwards. If your container is constantly crashing (and restarting), expensive startup processing runs every few minutes instead of just once. At scale, this unnecessary startup processing may leave insufficient resources available for valid workloads. It may also result in increased costs, due to cluster or Pod scaling. Waste not, want not!
If the container doesn’t reach a ready state, your rollout may not complete
Kubernetes changes are typically rolled out such that downtime is minimized. For example, a Deployment might create a new Pod (technically the Deployment creates a ReplicaSet which creates the Pod…), wait until it becomes ready, destroy an old Pod, and then repeat until no old Pods are running. If your first new Pod enters CrashLoopBackOff and never becomes ready, then the rollout will stall and all of your old Pods will still be running. You may be testing your application and wondering, “Why isn’t my change working?” What a frustrating waste of time!
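The stall is easy to see in a toy model of the rollout loop described above. This Python sketch is only an illustration, assuming one new Pod is created at a time and an old Pod is removed only after the new one becomes ready. The real Deployment controller is driven by settings such as maxSurge and maxUnavailable, which are not modeled here, and the function and its callback are hypothetical names.

```python
def simulate_rollout(replicas, new_pod_becomes_ready):
    """Toy rolling update: start one new Pod, wait for readiness, remove one old Pod."""
    old_running = replicas
    new_ready = 0
    while old_running > 0:
        if not new_pod_becomes_ready():
            # The new Pod is stuck (for example in CrashLoopBackOff), so it never
            # becomes ready and no old Pod is removed: the rollout stalls here.
            return {"old_running": old_running, "new_ready": new_ready, "stalled": True}
        new_ready += 1
        old_running -= 1
    return {"old_running": old_running, "new_ready": new_ready, "stalled": False}


healthy = simulate_rollout(3, lambda: True)
crashing = simulate_rollout(3, lambda: False)
```

With a healthy image, all three old Pods are replaced. With a crashing image, the result shows three old Pods still running and zero new Pods ready, which is why your change appears to have no effect.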
Containers that are constantly restarting aren’t doing real work
Finally, perhaps the most obvious problem with a container that is constantly restarting is that while it is restarting, it isn’t doing the work that you intended it to. This may result in downtime to your application, and may also cause unexpected and undesired HorizontalPodAutoscaler activation.
Make sure you monitor your container restart counts. When a container restart count starts going up, or you detect that it is in CrashLoopBackOff, don’t ignore it!
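As one possible starting point, the following Python sketch scans the JSON produced by kubectl get pods --all-namespaces -o json and reports containers that are waiting in CrashLoopBackOff or whose restart count has reached a threshold. The function name, the threshold of 5, and the sample data are my own assumptions for illustration; a real setup would more likely rely on your monitoring and alerting stack.

```python
def find_restarting_containers(pod_list, restart_threshold=5):
    """Return (namespace, pod, container, restarts, reason) for suspicious containers."""
    findings = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        # containerStatuses is absent until containers have been created
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for status in statuses:
            waiting = status.get("state", {}).get("waiting") or {}
            reason = waiting.get("reason", "")
            restarts = status.get("restartCount", 0)
            if reason == "CrashLoopBackOff" or restarts >= restart_threshold:
                findings.append(
                    (meta.get("namespace"), meta.get("name"),
                     status.get("name"), restarts, reason)
                )
    return findings


# Hypothetical sample in the shape of `kubectl get pods -o json` output
sample = {
    "items": [
        {"metadata": {"namespace": "dev", "name": "api-1"},
         "status": {"containerStatuses": [
             {"name": "api", "restartCount": 42,
              "state": {"waiting": {"reason": "CrashLoopBackOff"}}}]}},
        {"metadata": {"namespace": "dev", "name": "web-1"},
         "status": {"containerStatuses": [
             {"name": "web", "restartCount": 0, "state": {"running": {}}}]}},
    ]
}
findings = find_restarting_containers(sample)
```

To run it against a live cluster, pipe the kubectl output into a script that calls json.load(sys.stdin) and passes the result to the function.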