One common reason your Pod may be stuck in Pending

Paul Dally
Aug 31, 2022

What to do when nodes don’t match the Pod’s node affinity

If your Pod is stuck in Pending, the first thing to do is describe the Pod. You can do that with a command like the following:

kubectl -n <namespace> describe pod <podname>

When you do that, you may see a warning message like this:

Warning FailedScheduling 118s (x26 over 7m43s) default-scheduler 0/53 nodes are available: 53 node(s) didn’t match Pod’s node affinity

Many other situations can leave a Pod stuck in Pending status. This specific one, however, is the one that I see most often in the enterprise context, and it seems to give developers the most difficulty when trying to understand what is going on.

This particular warning means that your Pod cannot be scheduled because you are specifying a nodeSelector or nodeAffinity that doesn’t match the labels on the worker nodes in your Kubernetes cluster. The describe output of the Pod shows the nodeSelectors:

<snip>
Node-Selectors:  topology.kubernetes.io/zone=ca-central-1a
                 topology.kubernetes.io/zone=ca-central-1b
<snip>

Unfortunately, nodeAffinity is not displayed in the describe output. Either refer to your source manifests (if you use Helm, the “helm template” command may help), or use kubectl edit, which loads the full manifest into an editor. If you do, be careful to exit the editor without modifying and saving the YAML.

You may have only nodeSelectors, only nodeAffinity or perhaps you’ll have both, for example:

apiVersion: v1
kind: Pod
metadata:
  <snip>
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/region
            operator: In
            values:
            - ca-central-1
  <snip>
  nodeSelector:
    topology.kubernetes.io/zone: "ca-central-1a"
    topology.kubernetes.io/zone: "ca-central-1b"

If I had been creating this Pod spec, I probably wouldn’t have included the nodeAffinity (it is redundant, since zones ca-central-1a and ca-central-1b are by definition in region ca-central-1). Notwithstanding that, the cause of the issue should be fairly obvious…

All nodeSelectors must be matched, and a Node label can only have a single value, so no Node can be in both ca-central-1a and ca-central-1b at once. Perhaps the developer thought that nodeSelectors were ORed and not ANDed, or perhaps this was just a careless mistake. Either way, Kubernetes is doing exactly what you told it to do by not scheduling the Pod on any worker, because the scheduling instructions were impossible to satisfy.
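To make the ANDed behaviour concrete, here is a minimal Python sketch of the matching rule. It is a simplified model, not the scheduler's actual code. The node names and label sets are hypothetical, and the selector is written as a list of pairs purely to mirror the example as shown (in the Pod API, nodeSelector is a map of key to value).

```python
# Hypothetical nodes: each one maps a label key to exactly one value.
nodes = {
    "node-a": {"topology.kubernetes.io/zone": "ca-central-1a"},
    "node-b": {"topology.kubernetes.io/zone": "ca-central-1b"},
}

# The selector from the example, kept as pairs so both entries are visible.
node_selector = [
    ("topology.kubernetes.io/zone", "ca-central-1a"),
    ("topology.kubernetes.io/zone", "ca-central-1b"),
]


def matches_selector(labels, selector):
    """Return True only if every (key, value) pair is present in labels (AND)."""
    return all(labels.get(key) == value for key, value in selector)


# Collect the nodes that satisfy every selector entry.
schedulable = [
    name for name, labels in nodes.items()
    if matches_selector(labels, node_selector)
]
print(schedulable)  # prints []
```

Because each node has only one value for the zone key, no node can satisfy both entries, so the list of candidate nodes is empty.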

nodeAffinity is a lot more flexible than nodeSelector. The documentation goes into the specifics on how both nodeSelector and nodeAffinity work. Suffice it to say, however, you can find yourself in a similar situation with nodeAffinity as well, where the conditions you’ve set exclude all of the possible nodes in the cluster.
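As a rough illustration of how nodeAffinity can exclude every node in the same way, the sketch below models required node affinity under the rules described in the Kubernetes documentation: the entries in nodeSelectorTerms are ORed, while the matchExpressions inside a single term are ANDed. Only the In and NotIn operators are modeled, and the label values and function names are assumptions made for illustration.

```python
ZONE = "topology.kubernetes.io/zone"


def expression_matches(labels, expr):
    """Evaluate one matchExpression; only In and NotIn are modeled here."""
    value = labels.get(expr["key"])
    if expr["operator"] == "In":
        return value in expr["values"]
    if expr["operator"] == "NotIn":
        return value not in expr["values"]
    raise ValueError(f"operator not modeled: {expr['operator']}")


def term_matches(labels, term):
    """All expressions inside one term must match (AND)."""
    return all(expression_matches(labels, e) for e in term["matchExpressions"])


def affinity_matches(labels, terms):
    """A node qualifies if any one term matches (OR)."""
    return any(term_matches(labels, t) for t in terms)


# Hypothetical node in zone ca-central-1a.
labels_1a = {ZONE: "ca-central-1a"}

# Both zones demanded inside ONE term: impossible for any node.
impossible_terms = [{"matchExpressions": [
    {"key": ZONE, "operator": "In", "values": ["ca-central-1a"]},
    {"key": ZONE, "operator": "In", "values": ["ca-central-1b"]},
]}]

# One zone per term: either zone is acceptable.
either_zone_terms = [
    {"matchExpressions": [{"key": ZONE, "operator": "In", "values": ["ca-central-1a"]}]},
    {"matchExpressions": [{"key": ZONE, "operator": "In", "values": ["ca-central-1b"]}]},
]

print(affinity_matches(labels_1a, impossible_terms))   # prints False
print(affinity_matches(labels_1a, either_zone_terms))  # prints True
```

Listing both zones in the values of a single In expression would have the same effect as the second form.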

The only remaining aspect of this problem is determining what labels have actually been configured on the Kubernetes Nodes. You can see this easily with the following command:

kubectl get nodes --show-labels
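The LABELS column in that output is a comma-separated list of key=value pairs, which can be hard to scan by eye on a large cluster. The following sketch shows one way you might compare a copied label string against the labels your Pod requires. The sample string and the helper names are hypothetical, not real cluster output.

```python
def parse_labels(label_column):
    """Turn a 'k1=v1,k2=v2' string into a dict of label key -> value."""
    labels = {}
    for item in label_column.split(","):
        key, _, value = item.partition("=")
        if key:
            labels[key] = value
    return labels


def unmet_requirements(labels, wanted):
    """Return the (key, value) pairs from wanted that the node does not carry."""
    return [(key, value) for key, value in wanted if labels.get(key) != value]


# Hypothetical LABELS value copied from one row of the kubectl output.
sample = (
    "kubernetes.io/os=linux,"
    "topology.kubernetes.io/region=ca-central-1,"
    "topology.kubernetes.io/zone=ca-central-1a"
)

# Labels the Pod asks for, written as pairs.
wanted = [
    ("topology.kubernetes.io/region", "ca-central-1"),
    ("topology.kubernetes.io/zone", "ca-central-1b"),
]

print(unmet_requirements(parse_labels(sample), wanted))
# prints [('topology.kubernetes.io/zone', 'ca-central-1b')]
```

An empty result for a node means that node carries every label you asked for.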

If your Pod’s nodeSelector and nodeAffinity are correct and it is the Node labels that are wrong, and you have the required permissions, you can add labels to worker Nodes like this:

kubectl label nodes <nodename> mylabel=somevalue

In summary

If a nodeSelector or nodeAffinity that you specify simply doesn’t match the labels on any worker Nodes, or if you specify a combination of nodeSelector or nodeAffinity clauses that are mutually exclusive, your Pod won’t start. Either add labels to your worker Nodes, or remove whatever nodeSelector or nodeAffinity clauses are extraneous.

Once you understand that Kubernetes is doing exactly what you told it to do, node affinity tends to be a lot less mysterious!

Written by Paul Dally

AVP, IT Foundation Platforms Architecture at Sun Life Financial. Views & opinions expressed are my own, not necessarily those of Sun Life

No responses yet