2023-04-22 14:31:49 |
Juan Pablo Noreña |
description |
The control plane of my Charmed Kubernetes deployment got blocked because the kubelet service stopped on both nodes.
App                       Version  Status   Scale  Charm                     Channel        Rev  Exposed  Message
kubernetes-control-plane  1.27.1   blocked      2  kubernetes-control-plane  latest/stable  231  no       Stopped services: kubelet
kubernetes-worker         1.27.1   waiting      3  kubernetes-worker         latest/stable   87  yes      Waiting for kubelet to start.
The charm is trying to restart it, but the service fails with the following error:
$ sudo snap logs kubelet.daemon
2023-04-22T14:04:18Z systemd[1]: Started Service for snap application kubelet.daemon.
2023-04-22T14:04:18Z kubelet.daemon[53906]: E0422 14:04:18.514308 53906 run.go:74] "command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"
2023-04-22T14:04:18Z systemd[1]: snap.kubelet.daemon.service: Main process exited, code=exited, status=1/FAILURE
2023-04-22T14:04:18Z systemd[1]: snap.kubelet.daemon.service: Failed with result 'exit-code'.
2023-04-22T14:04:28Z systemd[1]: snap.kubelet.daemon.service: Scheduled restart job, restart counter is at 109.
2023-04-22T14:04:28Z systemd[1]: Stopped Service for snap application kubelet.daemon.
2023-04-22T14:04:28Z systemd[1]: Started Service for snap application kubelet.daemon.
2023-04-22T14:04:28Z kubelet.daemon[53977]: E0422 14:04:28.757914 53977 run.go:74] "command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"
2023-04-22T14:04:28Z systemd[1]: snap.kubelet.daemon.service: Main process exited, code=exited, status=1/FAILURE
2023-04-22T14:04:28Z systemd[1]: snap.kubelet.daemon.service: Failed with result 'exit-code'.
$ sudo snap info kubelet
installed: 1.27.1 (2945) 22MB classic,in-cohort
The upstream documentation says:
--container-runtime string Default: remote
The container runtime to use. Possible values: docker, remote. (DEPRECATED: will be removed in 1.27 as the only valid value is 'remote')
--cloud-provider string
The provider for cloud services. Set to empty string for running with no cloud provider. If set, the cloud provider determines the name of the node (consult cloud provider documentation to determine if and how the hostname is used). (DEPRECATED: will be removed in 1.24 or later, in favor of removing cloud provider code from Kubelet.)
https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
The kubelet args look like this in the snap:
$ cat /var/snap/kubelet/current/args
--kubeconfig="/root/cdk/kubeconfig" --v="0" --node-ip="172.31.20.76" --container-runtime="remote" --container-runtime-endpoint="unix:///var/run/containerd/containerd.sock" --cloud-provider="aws" --config="/root/cdk/kubelet/config.yaml" --pod-infra-container-image="rocks.canonical.com:443/cdk/pause:3.6" --register-with-taints="node-role.kubernetes.io/control-plane:NoSchedule"
As a workaround, I SSHed into the kubernetes-control-plane units, manually removed '--container-runtime="remote"' and '--cloud-provider="aws"' from /var/snap/kubelet/current/args, since both seem to be deprecated, and restarted the snap before the charm overwrites the config:
$ sudo snap restart kubelet.daemon |
The control plane of my Charmed Kubernetes deployment got blocked because the kubelet service stopped on both nodes.
App                       Version  Status   Scale  Charm                     Channel        Rev  Exposed  Message
kubernetes-control-plane  1.27.1   blocked      2  kubernetes-control-plane  latest/stable  231  no       Stopped services: kubelet
kubernetes-worker         1.27.1   waiting      3  kubernetes-worker         latest/stable   87  yes      Waiting for kubelet to start.
The charm is trying to restart it, but the service fails with the following error:
$ sudo snap logs kubelet.daemon
2023-04-22T14:04:18Z systemd[1]: Started Service for snap application kubelet.daemon.
2023-04-22T14:04:18Z kubelet.daemon[53906]: E0422 14:04:18.514308 53906 run.go:74] "command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"
2023-04-22T14:04:18Z systemd[1]: snap.kubelet.daemon.service: Main process exited, code=exited, status=1/FAILURE
2023-04-22T14:04:18Z systemd[1]: snap.kubelet.daemon.service: Failed with result 'exit-code'.
2023-04-22T14:04:28Z systemd[1]: snap.kubelet.daemon.service: Scheduled restart job, restart counter is at 109.
2023-04-22T14:04:28Z systemd[1]: Stopped Service for snap application kubelet.daemon.
2023-04-22T14:04:28Z systemd[1]: Started Service for snap application kubelet.daemon.
2023-04-22T14:04:28Z kubelet.daemon[53977]: E0422 14:04:28.757914 53977 run.go:74] "command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"
2023-04-22T14:04:28Z systemd[1]: snap.kubelet.daemon.service: Main process exited, code=exited, status=1/FAILURE
2023-04-22T14:04:28Z systemd[1]: snap.kubelet.daemon.service: Failed with result 'exit-code'.
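The log above names the flag kubelet refuses to parse. A minimal sketch for pulling the rejected flag names out of such a log, assuming the message keeps the "unknown flag: --name" shape shown here (the function name `unknown_flags` is hypothetical, not part of snap or the charm):

```shell
#!/bin/sh
# Hypothetical helper: read kubelet log lines on stdin and print each
# distinct flag that kubelet reported as unknown.
unknown_flags() {
    grep -oE 'unknown flag: --[A-Za-z0-9-]+' \
        | sed 's/^unknown flag: //' \
        | sort -u
}

# Possible usage on an affected unit (needs sudo, so left commented out):
# sudo snap logs kubelet.daemon | unknown_flags
```

Run against the lines above, this would report only --container-runtime; the log does not mention any other flag.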
$ sudo snap info kubelet
installed: 1.27.1 (2945) 22MB classic,in-cohort
The upstream documentation says:
--container-runtime string Default: remote
The container runtime to use. Possible values: docker, remote. (DEPRECATED: will be removed in 1.27 as the only valid value is 'remote')
--cloud-provider string
The provider for cloud services. Set to empty string for running with no cloud provider. If set, the cloud provider determines the name of the node (consult cloud provider documentation to determine if and how the hostname is used). (DEPRECATED: will be removed in 1.24 or later, in favor of removing cloud provider code from Kubelet.)
https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
The kubelet args look like this in the snap:
$ cat /var/snap/kubelet/current/args
--kubeconfig="/root/cdk/kubeconfig" --v="0" --node-ip="172.31.20.76" --container-runtime="remote" --container-runtime-endpoint="unix:///var/run/containerd/containerd.sock" --cloud-provider="aws" --config="/root/cdk/kubelet/config.yaml" --pod-infra-container-image="rocks.canonical.com:443/cdk/pause:3.6" --register-with-taints="node-role.kubernetes.io/control-plane:NoSchedule"
As a test, I SSHed into the kubernetes-control-plane units, manually removed '--container-runtime="remote"' and '--cloud-provider="aws"' from /var/snap/kubelet/current/args, since both seem to be deprecated, and restarted the snap:
$ sudo snap restart kubelet.daemon
But the charm then overwrites the snap config and restarts the service, which fails with the same error again. |
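The manual edit described above can be sketched as a small filter. This is only an illustration of the temporary change, assuming the args file is a single line of --name="value" pairs as shown; the function name `strip_flag` and the backup path /tmp/kubelet-args.bak are hypothetical. It removes one named flag at a time and leaves flags that merely share a prefix, such as --container-runtime-endpoint, untouched.

```shell
#!/bin/sh
# Hypothetical helper: remove --<name>=<value> from an args line on stdin.
# $1 is the flag name without leading dashes, e.g. container-runtime.
# The "=" right after the name keeps --container-runtime-endpoint intact.
strip_flag() {
    sed -E "s/(^|[[:space:]])--$1=(\"[^\"]*\"|[^[:space:]]*)//g"
}

# Possible usage on a control-plane unit (needs sudo, so left commented out):
# sudo cp /var/snap/kubelet/current/args /tmp/kubelet-args.bak
# strip_flag container-runtime < /tmp/kubelet-args.bak \
#     | sudo tee /var/snap/kubelet/current/args
# sudo snap restart kubelet.daemon
```

Only --container-runtime is reported as unknown in the log, so removing --cloud-provider as well may not be necessary. Either way the change does not persist, because the charm rewrites the file.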
|