Comment 0 for bug 1763405

Bharat Kunwar (brtknr) wrote : FailedNodeAllocatableEnforcement warning under `kubectl describe nodes`

Steps to reproduce:

- Deploy Kubernetes with the latest stable/queens branch; `kubectl version` reports:

```
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"archive", BuildDate:"2018-02-13T11:42:06Z", GoVersion:"go1.10rc2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"archive", BuildDate:"2018-02-13T11:42:06Z", GoVersion:"go1.10rc2", Compiler:"gc", Platform:"linux/amd64"}
```

- Run `kubectl describe nodes` on the master node.
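The second step can be narrowed so only the offending event is shown rather than the full `describe` output. A minimal sketch; the sample line below is copied from this report and stands in for live output, since on a cluster you would pipe `kubectl describe nodes` through the same `grep`:

```shell
# Filter node output for the FailedNodeAllocatableEnforcement event.
# On a live cluster, replace the echo with:
#   kubectl describe nodes | grep FailedNodeAllocatableEnforcement
sample='Warning  FailedNodeAllocatableEnforcement  25m (x876 over 15h)  kubelet, k8s-fa27-mqf5nkvarpmq-minion-0  Failed to update Node Allocatable Limits'
echo "$sample" | grep -o 'FailedNodeAllocatableEnforcement'
```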

Expected result:

No warning

Actual result:

```
Warning FailedNodeAllocatableEnforcement 25m (x876 over 15h) kubelet, k8s-fa27-mqf5nkvarpmq-minion-0 Failed to update Node Allocatable Limits "": failed to set supported cgroup subsystems for cgroup : Failed to set config for supported subsystems : failed to write 135089913856 to memory.limit_in_bytes: write /rootfs/var/lib/containers/atomic/kubelet.0/rootfs/sys/fs/cgroup/memory/memory.limit_in_bytes: invalid argument
```

Discussion:

This is essentially a [kubernetes bug](https://github.com/kubernetes/kubernetes/issues/55867). The warning does not appear to change the overall behaviour of the cluster. According to the [docs](https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#enabling-qos-and-pod-level-cgroups), `cgroups-per-qos` is supposed to be enabled by default; however, that default is not in effect in this version of Kubernetes.
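One way to tell whether a node is affected is to check which flags the running kubelet was actually started with. A sketch under the assumption that you are on a node; here a sample command line stands in for the live process, which you would read from `/proc` as shown in the comment:

```shell
# Check the running kubelet's command line for the allocatable flags.
# On a real node, use the live process instead of this sample:
#   cmdline=$(tr '\0' ' ' < /proc/$(pgrep -o kubelet)/cmdline)
cmdline='/usr/bin/kubelet --cgroup-driver=systemd --pod-manifest-path=/etc/kubernetes/manifests'
case "$cmdline" in
  *--cgroups-per-qos=true*) echo "flags present" ;;
  *)                        echo "flags missing" ;;
esac
```

A node that prints "flags missing" is running without `--cgroups-per-qos=true` and is a candidate for the workaround below.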

As a workaround, I have attached a patch for magnum. If your cluster is already deployed, SSH into each worker node and append `--cgroups-per-qos=true --enforce-node-allocatable=pods` to the `KUBELET_ARGS` line in `/etc/kubernetes/kubelet` so it looks something like this:

```
KUBELET_ARGS="$(/etc/kubernetes/get_require_kubeconfig.sh) --pod-manifest-path=/etc/kubernetes/manifests --cadvisor-port=0 --kubeconfig /etc/kubernetes/kubelet-config.yaml --hostname-override=k8s-fa27-mqf5nkvarpmq-minion-0 --address=10.0.0.5 --port=10250 --read-only-port=0 --anonymous-auth=false --authorization-mode=Webhook --authentication-token-webhook=true --cluster_dns=10.254.0.10 --cluster_domain=cluster.local --pod-infra-container-image=gcr.io/google_containers/pause:3.0 --client-ca-file=/etc/kubernetes/certs/ca.crt --tls-cert-file=/etc/kubernetes/certs/kubelet.crt --tls-private-key-file=/etc/kubernetes/certs/kubelet.key --cgroup-driver=systemd --cgroups-per-qos=true --enforce-node-allocatable=pods"
```

Then run `sudo systemctl restart kubelet.service`.
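The manual edit above can also be scripted so it is safe to re-run across all worker nodes. A sketch that, for illustration, operates on a temporary copy of the file; on a real node you would set `KUBELET_CONF=/etc/kubernetes/kubelet` (with root privileges) and finish with the `systemctl restart` above:

```shell
# Idempotently append the two flags to the KUBELET_ARGS line.
# This demo writes a sample file; on a node, point KUBELET_CONF at
# /etc/kubernetes/kubelet and drop the echo.
KUBELET_CONF="$(mktemp)"
echo 'KUBELET_ARGS="--pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd"' > "$KUBELET_CONF"

EXTRA='--cgroups-per-qos=true --enforce-node-allocatable=pods'
if ! grep -q -- '--cgroups-per-qos=true' "$KUBELET_CONF"; then
  # Insert the flags just before the closing quote of the KUBELET_ARGS line.
  sed -i "s|^\(KUBELET_ARGS=\".*\)\"\$|\1 $EXTRA\"|" "$KUBELET_CONF"
fi
grep '^KUBELET_ARGS=' "$KUBELET_CONF"
```

The `grep` guard makes the script a no-op on nodes that already carry the flag, so it can be pushed to every node without double-appending.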