Kubernetes — Debugging NetworkPolicy (Part 2)

Feb 10, 2022

The first part of this series can be found here. It discusses features that are not (at least yet) available with NetworkPolicy, how to determine whether your network plugin supports NetworkPolicy, and other general debugging and optimization steps that will help you use NetworkPolicy successfully.

In this second part of the series, we’ll discuss which side of the conversation to debug from (if you have a choice), describe a few common scenarios where NetworkPolicy can block traffic, and provide examples of NetworkPolicy objects that allow that traffic.

Debugging from the egress point is easier than debugging from the ingress point

If you are restricted to vanilla NetworkPolicies, there is no logging, as we’ve noted previously. Since ingress traffic is blocked before it reaches your container(s), it is best to debug from the source of the traffic whenever that is practical. At the source, you will at least have configuration and/or logs to look at.

If you don’t have any control over the source, then your task may be daunting. Make sure that you have triple-checked the NetworkPolicy, and consider whether the protocol connecting to your Pod might have unexpected ports or traffic directionality.

Active mode FTP will use the connection opened by the client for a command channel, but the server will attempt to open a connection back to the client on a random port for the data channel. If your Pod is the FTP server, it would require both ingress and egress for active mode to work (I recommend passive mode as the easier solution, rather than trying to make active mode and your NetworkPolicy work together).
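If you go with passive mode, the FTP server Pod only needs ingress rules: one for the command channel and one for the passive data port range. The following is a minimal sketch, not a tested configuration. The policy name, the app.kubernetes.io/name: my-ftp-server label and the 30000-30100 passive range are all hypothetical and must match what your FTP server is actually configured to use. The endPort field also needs a reasonably recent Kubernetes version and a network plugin that supports it.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ftp-passive-ingress   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: my-ftp-server   # hypothetical label
  policyTypes:
  - Ingress
  ingress:
  - ports:
    # Command channel
    - protocol: TCP
      port: 21
    # Passive data channel range (assumed; must match the server's passive port settings)
    - protocol: TCP
      port: 30000
      endPort: 30100
```

Because the rule has no from section, it allows these ports from any source; add one if you want to limit which clients can connect.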

SQL Server can sometimes use dynamic port ranges, and other applications and protocols can have a variety of similar behaviors. Depending on your specific application, you may need to change your NetworkPolicies and/or modify the configuration of your client, your server, or both.

For egress traffic, check your container logs

Your container logs will often tell you what the issue is. For example, when DNS is blocked, you might see an error message such as “DNS resolution failed” or “Resolving timed out”, as curl reports in the example below:

>kubectl -n local-demo-debugnetworkpolicy-ns logs my-deployment-5cccff8466-zmthp -c do-wget 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
curl: (28) Resolving timed out after 1000 milliseconds

Allowing all containers to access DNS

You can add a NetworkPolicy to allow all Pods to access DNS as follows:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-allpods-to-dns
spec:
  policyTypes:
  - Egress
  podSelector: {} 
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - port: 53
      protocol: UDP
    - port: 53
      protocol: TCP

Note: If you are using Kubernetes <v1.21, you may need to apply the kubernetes.io/metadata.name: kube-system label to the kube-system Namespace. You can do that like this:

kubectl label namespaces kube-system kubernetes.io/metadata.name=kube-system
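The policy above allows port 53 to every Pod in kube-system. If you want to narrow that down to just the DNS Pods, one option is to combine the namespaceSelector with a podSelector in the same to entry. This is a sketch that assumes your cluster's DNS Pods carry the k8s-app: kube-dns label, which is common but not guaranteed; check first with kubectl -n kube-system get pods --show-labels. The policy name is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-allpods-to-dns-pods   # hypothetical name
spec:
  policyTypes:
  - Egress
  podSelector: {}
  egress:
  - to:
    # One entry with both selectors means "Pods with this label in this Namespace"
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns   # assumed label; verify in your cluster
    ports:
    - port: 53
      protocol: UDP
    - port: 53
      protocol: TCP
```

Note that writing the two selectors as separate list items (each with its own leading dash) would instead mean "any Pod in kube-system OR any Pod with this label in the policy's own Namespace", which is a different and broader rule.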

Allowing application-specific traffic

When application traffic is blocked, you might see a “Failed to connect” message. For example, again using curl:

>kubectl -n local-demo-debugnetworkpolicy-ns logs my-deployment-5cccff8466-zmthp -c do-wget
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (28) Failed to connect to example.com port 80 after 703 ms: Operation timed out

In this particular example, since example.com is external to Kubernetes, you would have to use an ipBlock to specify a CIDR block, similar to the following:

apiVersion: networking.k8s.io/v1 
kind: NetworkPolicy 
metadata:   
  name: allow-deployment-to-examplecom
spec:   
  podSelector:     
    matchLabels:       
      app.kubernetes.io/name: my-deployment
  policyTypes:   
  - Egress   
  egress:   
  - to:     
    - ipBlock:         
        cidr: 93.184.216.34/32
    ports:     
    - protocol: TCP       
      port: 80

This comes with at least one major challenge for services that are outside your control — the DNS name example.com may have more than 1 IP address that it can point to, and it may not be easily possible to determine what they are (or will be). For example, it might be hosted behind a CDN or have a presence in multiple regions and be globally load-balanced. The provider of the service may change their configuration at any time, potentially without notice to you.

Specifying IP addresses or ranges of IP addresses for services that you do not control will often result in an application that is prone to failure. If the provider of the service does not publish their IP ranges and commit to when those ranges might change and how much lead time you will be given, you might want to avoid egress ipBlock rules. Instead, use firewalls, security groups and/or proxy servers outside of Kubernetes that have more capabilities to implement the restrictions you require.

Allowing all “external” IPs that are managed by other network infrastructure while still blocking “internal” IPs by default

If you do go down the pathway of using firewalls or security groups or a similar construct restricting access to external services, you may wish to configure NetworkPolicy so that all external egress is allowed (since the firewall or whatever you are using will handle that) but still restrict traffic to private IPs with NetworkPolicy. Here’s an example of a NetworkPolicy that could be used for that:

apiVersion: networking.k8s.io/v1 
kind: NetworkPolicy 
metadata:   
  name: default-block-private-networks
spec:   
  podSelector:     
    matchLabels:       
      app.kubernetes.io/name: my-deployment
  policyTypes:   
  - Egress   
  egress:   
  - to:     
    - ipBlock:         
        cidr: 0.0.0.0/0
        except:         
        - 10.0.0.0/8
        - 172.16.0.0/12
        - 192.168.0.0/16
    ports:
    - protocol: TCP       
      port: 80

The 3 ranges listed are the private IPv4 address ranges standardized in RFC 1918. Please treat this as an example, however, as your network topology and/or application requirements may be different.
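Note that the example above only allows TCP port 80 to external addresses. If the intent is for the external firewall to make all decisions about external traffic, a variant without the ports section allows every port and protocol to non-private addresses. This is a sketch under the same assumptions as the example above; the policy name is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-external-egress   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: my-deployment
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 10.0.0.0/8
        - 172.16.0.0/12
        - 192.168.0.0/16
    # No "ports" section: all ports and protocols are allowed to the addresses above
```

If your cluster DNS service has an address inside one of the excluded ranges, the selected Pods will still need a separate policy that allows DNS.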

Additional NetworkPolicies would be required to open up access to specific ipBlocks within the 3 private IP ranges as necessary, but these NetworkPolicies may be significantly easier to manage than the firewalls or security groups external to Kubernetes.
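Because NetworkPolicies are additive, such an exception can live in its own object alongside the default one. As an illustration only, the following sketch allows the same Pods to reach a hypothetical internal database subnet; the policy name, the 10.20.30.0/24 CIDR and port 5432 are made-up values.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-deployment-to-internal-db   # hypothetical name
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: my-deployment
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 10.20.30.0/24   # hypothetical internal subnet
    ports:
    - protocol: TCP
      port: 5432   # hypothetical database port
```

Keeping each exception in a separate, clearly named policy makes it easier to see later why a given private range was opened up.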

Conclusion

I hope this has provided you with some assistance in how to approach your debugging task and that the examples are useful. Part 3 of this series is now available, looking at using tcpdump to identify egress traffic that is being blocked, for those scenarios when your application logs and/or configuration do not provide enough information!