Don’t make these common Kubernetes configuration mistakes!

Paul Dally
5 min read · Jun 3, 2022
(Image credit: https://publicdomainvectors.org/en/free-clipart/Vector-clip-art-of-glossy-settings-options-icon/21301.html)

1 Unreasonable resources.requests compared to actual usage (or not using requests at all). Ideally, you would be doing some load-testing at volumes that are representative for the environment in question, and configuring your resources.requests and resources.limits accordingly. Remember, requests are reservations and once those reservations reach the capacity of the worker node, no more Pods will schedule on that worker — even if those Pods aren’t actually using the reserved resources. This can significantly increase the costs of running your cluster. Conversely, if your requests are too low, your container might be scheduled on a worker node where there is no realistic possibility of getting the resources that it needs for regular operation.
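As a sketch of what this looks like in practice (the name, image, and values below are hypothetical and should come from your own load-testing, not from this example):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app                  # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: app
        image: registry.example.com/example-app:1.4.2   # hypothetical image
        resources:
          requests:                  # what the scheduler reserves on the node
            cpu: 250m                # based on observed usage under load, not a guess
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
```

Requests that roughly track real usage keep nodes packed efficiently; limits give the workload headroom without letting it starve its neighbours.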

2 HorizontalPodAutoscaler settings set to scale higher than ResourceQuota will allow. If the ResourceQuota will only allow 2 Pods to run, why are you specifying maxReplicas of 10 on the HorizontalPodAutoscaler?
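A minimal sketch of keeping the two aligned (names and numbers are hypothetical):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota          # hypothetical
spec:
  hard:
    pods: "4"               # the namespace can never exceed 4 Pods
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 2
  maxReplicas: 4            # capped at what the ResourceQuota allows, not 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```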

3 Deployment rollout settings with maxSurge > 0 when the ResourceQuota will not allow an increase in the number of Pods. See this article for more information.
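When the quota is already fully consumed, a surge-based rollout can deadlock because the replacement Pod can never be created. One way to configure the rollout in that situation (a sketch, not the only valid choice):

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0           # don't create extra Pods the ResourceQuota can't accommodate
      maxUnavailable: 1     # roll by taking one existing Pod down first instead
```

The trade-off: maxUnavailable > 0 briefly reduces capacity during the rollout, so size your replica count accordingly.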

4 Excessive replicas especially in pre-production environments. Do you really need 10 Pod replicas in dev? Or, for that matter, more than 1? Waste not, want not… or at least pay less…

5 Failing to ensure that your Pods are running where you expect them to run. See this article for information on topologySpreadConstraints issues. You should also check that your nodeSelector and/or affinity clauses are doing what you intend. If your Pods stay Pending with a message like “Warning FailedScheduling 118s (x26 over 7m43s) default-scheduler 0/61 nodes are available: 53 node(s) didn’t match Pod’s node affinity”, your nodeSelector and/or affinity clauses are likely incorrect.
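For illustration, a sketch of a Pod template with both mechanisms (the node label, zone key, and app label here are hypothetical; a nodeSelector only works if nodes actually carry the label):

```yaml
spec:
  template:
    spec:
      # Verify the label actually exists on some nodes before relying on it, e.g.:
      #   kubectl get nodes -L node-role.example.com/worker
      nodeSelector:
        node-role.example.com/worker: "true"   # hypothetical node label
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway      # DoNotSchedule can leave Pods Pending
        labelSelector:
          matchLabels:
            app: example-app                   # hypothetical Pod label
```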

6 Monolithic, overly permissive and/or badly named NetworkPolicy. See this article for more information.
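A sketch of the opposite of a monolithic policy: one narrowly scoped NetworkPolicy whose name says what it allows (all labels and the port are hypothetical):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api-8443   # name encodes source, destination and port
spec:
  podSelector:
    matchLabels:
      app: api                       # hypothetical destination Pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend              # hypothetical source Pods
    ports:
    - protocol: TCP
      port: 8443
```

Many small policies like this are easier to audit and to change safely than one policy that permits everything.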

7 Inefficient kustomization base/overlay design. If you are duplicating the same thing between multiple overlays, you should consider putting it in your base instead. See this article and this article for more information.

8 Inefficient image builds (no layered builds, multiple RUN statements that increase image size without adding value, etc.). See this article for more information.

9 Container memory limits lower than the configured maximum Java heap size (-Xmx), and/or failing to account for Java non-heap memory usage such as metaspace. This will result in OOMKilled container restarts. If possible, use -XX:MaxRAMPercentage instead of -Xmx (don’t specify both; if you do, -XX:MaxRAMPercentage will probably simply be ignored).
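A sketch of a container spec that sizes the heap relative to the container limit (image name and percentage are hypothetical; the right headroom depends on your workload’s non-heap usage):

```yaml
    spec:
      containers:
      - name: app
        image: registry.example.com/java-app:2.1.0   # hypothetical image
        env:
        - name: JAVA_TOOL_OPTIONS
          value: "-XX:MaxRAMPercentage=75.0"   # heap uses ~75% of the limit;
                                               # the rest is headroom for metaspace,
                                               # thread stacks, code cache, etc.
        resources:
          limits:
            memory: 1Gi                        # heap tops out around 768Mi here
```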

10 HorizontalPodAutoscalers with scaling thresholds lower than “resting” Pod utilization. See this article for more information.

11 Failing to monitor and remediate container crashes. If your container is repeatedly crashing, it will use more worker node resources than it needs to (container starts are often the most expensive part of the container’s lifecycle), and it may cause stability problems over time. See this article for more information.

12 Not cleaning up obsolete ConfigMaps and Secrets created by configMapGenerator and secretGenerator in kustomization.yaml when disableNameSuffixHash: true is not specified. With the default hashed names, every content change produces a new object, and the old ones accumulate until something removes them.
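For illustration, a minimal generator stanza (the name and file are hypothetical):

```yaml
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
configMapGenerator:
- name: app-config        # kustomize appends a content hash to the name,
                          # e.g. app-config-7g2m9c45kd (hypothetical hash)
  files:
  - application.properties
# Each change to application.properties yields a NEW ConfigMap with a new
# hash suffix; the previous generations are not deleted automatically.
```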

13 startupProbes (or livenessProbes on Pods that don’t have startupProbes) that are configured with settings such that the application cannot actually start before the probe will be triggered. For example, if your container takes 3–4 minutes to start, and your Probes are set at 3 minutes exactly, then Kubernetes might be restarting your container a number of times before it finally starts within the configured thresholds. Alternatively, failing to specify timeoutSeconds or setting it to a value that is too low may cause your container to unexpectedly restart or stop taking traffic.
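A sketch for a slow-starting container (endpoint, port, and thresholds are hypothetical; size failureThreshold × periodSeconds comfortably above your worst observed startup time):

```yaml
        startupProbe:
          httpGet:
            path: /healthz           # hypothetical health endpoint
            port: 8080
          periodSeconds: 10
          failureThreshold: 36       # up to 6 minutes to start (36 x 10s),
                                     # comfortably above a 3-4 minute startup
          timeoutSeconds: 5          # the default of 1s is easy to overrun
        livenessProbe:               # only takes effect after startup succeeds
          httpGet:
            path: /healthz
            port: 8080
          periodSeconds: 10
          failureThreshold: 3
          timeoutSeconds: 5
```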

14 Forgetting to delete Ingress objects when deleting the Services associated with the Ingress, or the Deployments/StatefulSets that provide the Endpoints for the Service. The specifics vary by Ingress controller, but orphaned Ingresses can degrade performance and increase disk and memory utilization due to unnecessary polling and repeated warnings about Services not having any active Endpoints.

15 Not understanding the licensing impacts of running your software in containers. Oracle software will often require you to license not just the processors accessible to the container, but the entire physical server that the Pod is running on and, in certain very common cases, every physical server in any cluster managed by your virtualization management tooling (even if only one server is actually running Oracle software). Similarly, IBM software may require you to run additional monitoring “agents” (e.g. IBM License Service), add annotations to your Pods, or license the whole worker Node as well. Using a HorizontalPodAutoscaler, or the fact that Kubernetes might schedule new Pods on different worker Nodes, can also result in unexpected licensing costs. You should not assume that your license obligation stops at the cpu limits in your Pod spec, or that manually tabulating your usage will be sufficient.

16 Using a mutable “semantic” image tag like “latest” or “latest-prod” in your Pod spec. This makes it extremely difficult to know what image you are actually running, and may even cause your Pods to start running images that you didn’t intend at times that you didn’t intend. You should almost always specify an immutable tag in your Pod spec. Try sticking with the version number... See this article for details.
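A sketch of the difference (registry, tag, and digest are hypothetical):

```yaml
      containers:
      - name: app
        # image: registry.example.com/example-app:latest      # avoid: mutable, can change under you
        image: registry.example.com/example-app:1.4.2         # immutable version tag
        # Stricter still, pin the exact content by digest:
        # image: registry.example.com/example-app@sha256:...  # hypothetical digest
```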

17 Trying to change immutable fields on an object without deleting the object first. For example, the label selector on an apps/v1 Deployment is immutable. If, after deploying a Deployment, you change the selector and deploy again, you will get a message that the Deployment is invalid because the “field is immutable”. Label selector updates are generally discouraged, so plan your selectors up front. If you must change the selector, you may have to delete the object and re-deploy (which may have availability implications) or deploy the object under a new name alongside the existing object. However, don’t delete objects before every deployment as a matter of general practice to avoid this issue, as that will likely impact availability.

18 Running your container as root. According to the NSA, preventing root execution by using non-root containers will tend to limit the impact of a container compromise. Make sure you test any changes to the container user thoroughly.
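A sketch of enforcing this in the Pod spec (the UID is hypothetical and must actually work for your image, e.g. file permissions and ports below 1024):

```yaml
    spec:
      securityContext:
        runAsNonRoot: true          # kubelet refuses to start a container running as UID 0
        runAsUser: 10001            # hypothetical non-root UID
      containers:
      - name: app
        securityContext:
          allowPrivilegeEscalation: false
```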

19 Mismatches between your Pod labels and your Service (or NetworkPolicy or …) labelSelectors. Check that your Service has Endpoints with kubectl describe. If there are no Endpoints but the Pods are Ready, there is probably a selector mismatch.
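A sketch of what has to line up (names and port numbers are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-app
spec:
  selector:
    app: example-app        # must match the POD TEMPLATE labels,
                            # not the Deployment's own metadata.labels
  ports:
  - port: 80
    targetPort: 8080
# Verify with:
#   kubectl describe service example-app
# and look for a non-empty "Endpoints:" line.
```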

Do you have suggestions of other common problems that should make this list? Respond to this story with your ideas!


Paul Dally

AVP, IT Foundation Platforms Architecture at Sun Life Financial. Views & opinions expressed are my own, not necessarily those of Sun Life