increase inotify limits for kubelet/cAdvisor

Bug #1828759 reported by Paul Collins
This bug affects 2 people
Affects: Kubernetes Worker Charm
Status: Fix Released
Importance: Medium
Assigned to: Mike Wilson
Milestone: 1.16

Bug Description

One of our k8s clusters, running v1.12.8, has a lot of cron jobs running, and therefore spawns a lot of pods. This seems to provoke an inotify leak somewhere in k8s that eventually causes our nodes to stop working and become NotReady. kubelet was logging this at the end of each attempt to start:

May 12 06:25:43 juju-66cffb-mojo-is-kubernetes-24 kubelet.daemon[14274]: E0512 06:25:43.547072 14274 raw.go:146] Failed to watch directory "/sys/fs/cgroup/blkio/system.slice/run-r21e036a699424d61aab9c6320782209e.scope": inotify_add_watch /sys/fs/cgroup/blkio/system.slice/run-r21e036a699424d61aab9c6320782209e.scope: no space left on device
May 12 06:25:43 juju-66cffb-mojo-is-kubernetes-24 kubelet.daemon[14274]: E0512 06:25:43.547173 14274 raw.go:146] Failed to watch directory "/sys/fs/cgroup/blkio/system.slice": inotify_add_watch /sys/fs/cgroup/blkio/system.slice/run-r21e036a699424d61aab9c6320782209e.scope: no space left on device
May 12 06:25:43 juju-66cffb-mojo-is-kubernetes-24 kubelet.daemon[14274]: F0512 06:25:43.547217 14274 kubelet.go:1344] Failed to start cAdvisor inotify_add_watch /sys/fs/cgroup/blkio/system.slice/run-r21e036a699424d61aab9c6320782209e.scope: no space left on device
May 12 06:25:45 juju-66cffb-mojo-is-kubernetes-24 systemd[1]: snap.kubelet.daemon.service: Main process exited, code=exited, status=255/n/a

fs.inotify.max_user_watches was set to 8192 when I investigated. After I changed it to 1048576, kubelet stayed up on the next restart attempt, the nodes became Ready, and our cron jobs started running again.
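
For anyone hitting this before a fixed release lands, the limit can be checked and raised by hand on an affected node with sysctl; the drop-in file name under /etc/sysctl.d below is only an example:

sysctl fs.inotify.max_user_watches                    # show the current value
sudo sysctl -w fs.inotify.max_user_watches=1048576    # raise it until the next reboot
echo 'fs.inotify.max_user_watches = 1048576' | sudo tee /etc/sysctl.d/99-inotify.conf
sudo sysctl --system                                  # load sysctl.d files; the drop-in keeps the value across reboots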

Various third parties have increased these limits:
  * https://github.com/kubermatic/machine-controller/pull/471/files
  * https://github.com/jetstack/tarmak/pull/757/files

It also seems that cAdvisor has fixed the leak, and the change may have made it into some versions of Kubernetes itself, although the precise status of the upstream issue is not entirely clear to me:
  * https://github.com/google/cadvisor/pull/1916
  * https://github.com/kubernetes/kubernetes/issues/63204

If the fixed cAdvisor has not yet made it to all current releases, then the Juju charm should probably bump the inotify limits in the meantime.

Tags: sts
Revision history for this message
Mike Wilson (knobby) wrote :

This will be configurable after the next stable release. It was fixed in https://github.com/charmed-kubernetes/layer-kubernetes-master-worker-base/pull/3.

The way it will work is `juju config kubernetes-worker sysctl="{ fs.inotify.max_user_watches=1048576 }"`

Changed in charm-kubernetes-worker:
assignee: nobody → Mike Wilson (knobby)
importance: Undecided → Medium
status: New → Fix Committed
Revision history for this message
Tom Haddon (mthaddon) wrote :

What will the default be? Surely CDK should set the defaults to be sensible enough that people won't run into this.

Revision history for this message
Mike Wilson (knobby) wrote :

Excellent point. Currently the defaults do not include inotify settings. Do you have any input on what the defaults should be?

Revision history for this message
Paul Collins (pjdc) wrote :

I've set them as follows on our spawn-heavy cluster:

fs.inotify.max_user_instances = 8192 # default 128
fs.inotify.max_user_watches = 1048576 # default 8192

and so far so good, although I don't know how large these structures are, so this could represent a surprisingly large amount of unpageable kernel memory. They don't appear to live in their own slabs, so /proc/slabinfo is no help. FWIW, the first two links in my original report used similar values.
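
A rough way to see how much of the limit is actually in use is to tally the watches from /proc, one line per inotify instance; this assumes the fdinfo format documented in proc(5), where each watch shows up as a line beginning with "inotify wd:", and it needs root to see every process:

for fdinfo in /proc/[0-9]*/fdinfo/*; do
    # grep -c exits non-zero when a file has no inotify lines; skip those fds
    n=$(grep -c '^inotify' "$fdinfo" 2>/dev/null) || continue
    pid=${fdinfo#/proc/}; pid=${pid%%/*}
    echo "$n watches on an inotify fd of pid $pid ($(cat /proc/$pid/comm 2>/dev/null))"
done | sort -rn | head

As for the memory cost, the figure I've seen quoted is on the order of 1 KiB of kernel memory per watch on 64-bit, which would put a fully consumed 1048576-watch limit at roughly 1 GiB per user, and then only if the watches are actually registered; I haven't verified that against a running kernel.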

Felipe Reyes (freyes)
tags: added: sts
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

It seems that this fix has been released.
Could somebody confirm whether it has?

Thanks

Revision history for this message
George Kraft (cynerva) wrote :

The sysctl config has been released to stable with these charm revisions:
cs:~containers/kubernetes-master-684
cs:~containers/kubernetes-worker-541

However, we have not addressed the second part of this issue: bumping the inotify limits by default. For now, you can work around it by using the kubernetes-worker charm's sysctl config to manually set the inotify limits.
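
For example, following the syntax shown earlier in this bug and the values Paul used (the exact key/value separator and quoting should be checked against the charm's sysctl option documentation):

juju config kubernetes-worker sysctl="{ fs.inotify.max_user_instances=8192, fs.inotify.max_user_watches=1048576 }"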

Removing Fix Committed status, since part of this issue is still unresolved.

Changed in charm-kubernetes-worker:
status: Fix Committed → Confirmed
Revision history for this message
Mike Wilson (knobby) wrote :
Changed in charm-kubernetes-worker:
status: Confirmed → In Progress
Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

The new fs.inotify defaults will be available in edge builds:
cs:~containers/kubernetes-master-714
cs:~containers/kubernetes-worker-562

Changed in charm-kubernetes-worker:
status: In Progress → Fix Committed
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Has this been reverted?

I was able to see this in 714 but not in 724.

Revision history for this message
George Kraft (cynerva) wrote :

This hasn't been reverted. The fix is available in edge.

kubernetes-master-714 was an edge build, so it makes sense that you saw the fix there.

kubernetes-master-724 was a candidate/stable build, and we have not backported this fix to stable, so it's expected that you won't see the fix there.

Changed in charm-kubernetes-worker:
milestone: none → 1.16
Changed in charm-kubernetes-worker:
status: Fix Committed → Fix Released