Comment 5 for bug 1916927

George Kraft (cynerva) wrote :

Thank you. The new crashdump helps. The hung process is:

root 239311 0.0 0.0 8696 3424 ? S 19:37 0:00 /bin/bash /tmp/juju-exec025225977/script.sh
root 239312 0.0 0.1 746368 39576 ? Sl 19:37 0:00 /snap/kubectl/1927/kubectl --kubeconfig /root/.kube/config delete ns netpolicy

kube-controller-manager logs the failure to delete the namespace:

E0421 19:37:07.995259 148767 namespace_controller.go:162] deletion of namespace netpolicy failed: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request

The metrics-server pod is in CrashLoopBackOff. Its logs were empty at the time this crashdump was captured, but I was able to find the metrics-server error in the originally reported crashdump:

Error: Get https://10.152.183.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication: net/http: TLS handshake timeout

I believe this is caused by a conflict between the Calico CIDR (192.168.0.0/16) and the internal network space (192.168.33.135/??). I checked the most recent crashdump, and the conflict still appears to be present.

I recommend setting the Calico charm's cidr config to something that does not collide with the host networking.
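
For reference, the overlap described above can be checked with a few lines of Python. This is only a rough sketch: the helper name host_in_cidr is made up for illustration, and 10.100.0.0/16 is a hypothetical replacement range, not a value taken from this deployment. The host prefix length is unknown, so only the host address itself is tested.

```python
import ipaddress


def host_in_cidr(host: str, cidr: str) -> bool:
    """Return True if the host address falls inside the given CIDR block."""
    return ipaddress.ip_address(host) in ipaddress.ip_network(cidr)


calico_cidr = "192.168.0.0/16"    # Calico CIDR from the crashdump
host_ip = "192.168.33.135"        # host address from the crashdump
candidate_cidr = "10.100.0.0/16"  # hypothetical replacement, an assumption

# The host address sits inside the current Calico CIDR, which is the conflict.
print(host_in_cidr(host_ip, calico_cidr))     # True
# The hypothetical candidate range does not contain the host address.
print(host_in_cidr(host_ip, candidate_cidr))  # False
```

Any candidate range should also be checked against the other networks the hosts can reach, not just this one address.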