Restarting snap.kubelet.daemon service results in failed to parse kubelet flag: unknown flag: --container-runtime

Bug #2017353 reported by Juan Pablo Noreña
This bug affects 1 person
Affects: Kubernetes Control Plane Charm
Status: In Progress
Importance: Low
Assigned to: Kevin W Monroe

Bug Description

The control plane of my Charmed Kubernetes deployment became blocked because the kubelet service stopped on both control plane nodes.

App Version Status Scale Charm Channel Rev Exposed Message
kubernetes-control-plane 1.27.1 blocked 2 kubernetes-control-plane latest/stable 231 no Stopped services: kubelet
kubernetes-worker 1.27.1 waiting 3 kubernetes-worker latest/stable 87 yes Waiting for kubelet to start.

The charm is trying to restart it, but hits the following error:

$ sudo snap logs kubelet.daemon
2023-04-22T14:04:18Z systemd[1]: Started Service for snap application kubelet.daemon.
2023-04-22T14:04:18Z kubelet.daemon[53906]: E0422 14:04:18.514308 53906 run.go:74] "command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"
2023-04-22T14:04:18Z systemd[1]: snap.kubelet.daemon.service: Main process exited, code=exited, status=1/FAILURE
2023-04-22T14:04:18Z systemd[1]: snap.kubelet.daemon.service: Failed with result 'exit-code'.
2023-04-22T14:04:28Z systemd[1]: snap.kubelet.daemon.service: Scheduled restart job, restart counter is at 109.
2023-04-22T14:04:28Z systemd[1]: Stopped Service for snap application kubelet.daemon.
2023-04-22T14:04:28Z systemd[1]: Started Service for snap application kubelet.daemon.
2023-04-22T14:04:28Z kubelet.daemon[53977]: E0422 14:04:28.757914 53977 run.go:74] "command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime"
2023-04-22T14:04:28Z systemd[1]: snap.kubelet.daemon.service: Main process exited, code=exited, status=1/FAILURE
2023-04-22T14:04:28Z systemd[1]: snap.kubelet.daemon.service: Failed with result 'exit-code'.

$ sudo snap info kubelet
installed: 1.27.1 (2945) 22MB classic,in-cohort

The upstream documentation says:

--container-runtime string Default: remote
The container runtime to use. Possible values: docker, remote. (DEPRECATED: will be removed in 1.27 as the only valid value is 'remote')

--cloud-provider string
The provider for cloud services. Set to empty string for running with no cloud provider. If set, the cloud provider determines the name of the node (consult cloud provider documentation to determine if and how the hostname is used). (DEPRECATED: will be removed in 1.24 or later, in favor of removing cloud provider code from Kubelet.)

https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/

The kubelet args look like this in the snap:
$ cat /var/snap/kubelet/current/args
--kubeconfig="/root/cdk/kubeconfig" --v="0" --node-ip="172.31.20.76" --container-runtime="remote" --container-runtime-endpoint="unix:///var/run/containerd/containerd.sock" --cloud-provider="aws" --config="/root/cdk/kubelet/config.yaml" --pod-infra-container-image="rocks.canonical.com:443/cdk/pause:3.6" --register-with-taints="node-role.kubernetes.io/control-plane:NoSchedule"

As a test, I SSHed into the kubernetes-control-plane units, manually removed '--container-runtime="remote"' and '--cloud-provider="aws"' from /var/snap/kubelet/current/args, since both appear to be deprecated, and restarted the snap:

$ sudo snap restart kubelet.daemon

However, the charm overwrites the snap config and restarts the service, which then hits the same error again.
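
As a diagnostic aid, the check described above can be sketched as a small shell function that reports whether an args string still carries a flag this kubelet rejects. The function name find_removed_flags is hypothetical and is not part of the charm or the snap. The list holds only --container-runtime, on the assumption that it is the only flag the log above shows kubelet 1.27.1 refusing.

```shell
# Hypothetical helper (not part of the charm or the snap): print each flag
# from the removed list that appears in a kubelet args string.
# Assumption: only --container-runtime is listed, because it is the only
# flag the log above shows kubelet 1.27.1 rejecting.
find_removed_flags() {
  # Pad with spaces so a flag at the start or end of the string still matches.
  args=" $1 "
  for flag in --container-runtime; do
    case "$args" in
      # Match "--flag=" or "--flag " so that --container-runtime-endpoint
      # is not mistaken for --container-runtime.
      *" $flag="* | *" $flag "*) echo "$flag" ;;
    esac
  done
}

# On a unit, this could be called as:
#   find_removed_flags "$(cat /var/snap/kubelet/current/args)"
find_removed_flags '--v="0" --container-runtime="remote" --container-runtime-endpoint="unix:///var/run/containerd/containerd.sock"'
```

This only detects the flag. Editing the args file by hand does not last, because the charm rewrites it.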

summary: Restarting snap.kubelet.daemon service results in failed to parse
- kubelet flag: unknown flag: --container-runtim
+ kubelet flag: unknown flag: --container-runtime
description: updated
Revision history for this message
Kevin W Monroe (kwmonroe) wrote :

From Mattermost: the 1.26 charms were tracking 'latest/stable' snaps. Those snaps were refreshed last week with the 1.27 GA release. Installed snaps refresh automatically; deployed charms do not. It's recommended to use a named track for both the charm and the snap channel config to keep them in sync.

Two options:
- configure the charms to use 1.26/stable snaps vs latest/stable
- upgrade the charms to 1.27/stable which are compatible with 1.27/stable snaps
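
The two options above might translate into Juju commands along the following lines. This is a dry-run sketch that only prints the commands for review. It assumes that both charms expose a `channel` config option for the snap channel and that the Juju client in use provides `juju refresh --channel`; neither is confirmed in this report. The function names are hypothetical.

```shell
# Dry-run sketch: print the juju commands for each option instead of running them.
# Assumptions: both charms expose a `channel` config option for the snap
# channel, and the Juju client provides `juju refresh --channel`.
apps="kubernetes-control-plane kubernetes-worker"

# Option 1: keep the charms as deployed and pin the snaps to a named track.
print_snap_pin() {
  for app in $apps; do
    echo "juju config $app channel=$1/stable"
  done
}

# Option 2: move the charms to the track that matches the refreshed snaps.
print_charm_refresh() {
  for app in $apps; do
    echo "juju refresh $app --channel=$1/stable"
  done
}

print_snap_pin 1.26
print_charm_refresh 1.27
```

Printing the commands first makes it easy to check the application names and tracks against `juju status` before running anything.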

Changed in charm-kubernetes-master:
status: New → Triaged
Juan Pablo Noreña (jpablo-norena) wrote :

Thank you. This is Charmed Kubernetes on AWS, so I decided to configure the charms to use 1.26/stable snaps and keep the charms on 1.27/stable, since they support n-2 snap releases.

Kevin W Monroe (kwmonroe) wrote :

Ultimately, this is caused by a charm/snap mismatch. Charms can support n-2 snap versions, but not n+1. Docs need to call out the *charm* channel in addition to the snap channel config.
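
The compatibility rule above can be written as a small check. This is a sketch under stated assumptions: it reads "n-2 but not n+1" as "a charm on minor version n supports snap minors n, n-1 and n-2", it takes tracks in major.minor form (for example 1.27), and it assumes both share the same major version. The function name snap_track_supported is hypothetical.

```shell
# Hypothetical helper: succeed if a charm track supports a snap track.
# Assumed rule: charm minor n supports snap minors n, n-1 and n-2, never n+1.
# Arguments are major.minor tracks with the same major, e.g. 1.27 and 1.26.
snap_track_supported() {
  charm_minor="${1#*.}"   # "1.27" -> "27"
  snap_minor="${2#*.}"    # "1.26" -> "26"
  diff=$((charm_minor - snap_minor))
  [ "$diff" -ge 0 ] && [ "$diff" -le 2 ]
}

# The mismatch in this report: 1.26 charms running 1.27 snaps.
if snap_track_supported 1.26 1.27; then
  echo "supported"
else
  echo "unsupported"
fi
```

For the 1.26 charm and 1.27 snap pairing from this report the check prints "unsupported", while a 1.27 charm with 1.26 snaps passes.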

PR for that:
https://github.com/charmed-kubernetes/kubernetes-docs/pull/797

Changed in charm-kubernetes-master:
assignee: nobody → Kevin W Monroe (kwmonroe)
importance: Undecided → Low
milestone: none → 1.29
status: Triaged → In Progress