Memory leaking when running kubernetes cronjobs
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Triaged | High | Unassigned |
Bionic | Triaged | High | Unassigned |
Cosmic | Triaged | High | Unassigned |
linux-azure (Ubuntu) | Triaged | High | Unassigned |
Bug Description
We are using Kubernetes v1.8.15 with Docker 18.03.1-ce.
We schedule 50 Kubernetes cronjobs to run every 5 minutes. Each cronjob will create a simple busybox container, echo hello world, then terminate.
In the data attached to the bug I let this run for 1 hour, and in that time the available memory fell from 31256704 kB to 30461224 kB, a loss of 795480 kB (about 776 MiB). Previous, longer runs show that the available memory continues to drop.
There don't appear to be any processes left behind, or any growth in other processes, to explain where the memory has gone.
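The figures above can be checked with a few lines of arithmetic. This is only a sketch: the per-run estimate assumes that 50 cronjobs firing every 5 minutes for 1 hour amounts to 12 rounds (600 job runs), which the report does not state explicitly.

```python
# Available-memory figures quoted in the report, in kB.
available_before_kb = 31256704
available_after_kb = 30461224

lost_kb = available_before_kb - available_after_kb
lost_mib = lost_kb / 1024

# Assumption: 50 cronjobs every 5 minutes for 1 hour = 12 rounds of 50 runs.
job_runs = 50 * (60 // 5)
lost_per_run_kb = lost_kb / job_runs

print(f"lost: {lost_kb} kB ({lost_mib:.1f} MiB)")
print(f"approx. per job run: {lost_per_run_kb:.0f} kB")
```

Under that assumption the loss works out to roughly 1.3 MB per job run, which gives a rough yardstick for comparing kernels.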
echo 3 > /proc/sys/
We are currently running Ubuntu 4.15.0-
The leak was more severe on the Debian system, where investigation showed leaks in pcpu_get_vm_areas that were related to memory cgroups. Running kernel 4.17 on Debian showed a leak at a similar rate to what we now observe on Ubuntu 18.04. This leak is a problem for us because we need to run the cronjobs regularly and want the systems to stay up for months.
Kubernetes creates a new cgroup each time a cronjob runs, and removes it when the job completes (which takes a few seconds). With systemd-cgtop I don't see any increase in cgroups over time, but if I monitor /proc/cgroups over time I can see num_cgroups for the memory controller increasing.
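One way to watch the num_cgroups counter mentioned above is to parse /proc/cgroups directly. The sketch below is not the script used for this report; the function names `parse_num_cgroups`, `read_num_cgroups` and `watch` are made up for illustration. It relies only on the standard /proc/cgroups layout (columns subsys_name, hierarchy, num_cgroups, enabled).

```python
import time


def parse_num_cgroups(text, subsystem="memory"):
    """Return the num_cgroups value for one controller from /proc/cgroups text."""
    for line in text.splitlines():
        # Skip the "#subsys_name hierarchy num_cgroups enabled" header and blanks.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        if len(fields) >= 3 and fields[0] == subsystem:
            return int(fields[2])
    raise KeyError(subsystem)


def read_num_cgroups(path="/proc/cgroups", subsystem="memory"):
    """Read the current num_cgroups value for a controller from the live system."""
    with open(path) as f:
        return parse_num_cgroups(f.read(), subsystem)


def watch(interval_seconds=60, samples=10):
    """Print a timestamped num_cgroups reading every interval_seconds."""
    for _ in range(samples):
        print(int(time.time()), read_num_cgroups())
        time.sleep(interval_seconds)
```

Calling `watch()` while the cronjobs are running should show whether the memory controller's count keeps climbing even though the job cgroups have been removed.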
For the duration of the test I collected slabinfo, meminfo, vmallocinfo & cgroups - which I will attach to the bug. Each file is suffixed with the number of seconds since the start.
*.0 & *.600 were taken before the test was started. The test was stopped shortly after the *.4200 files were generated. I then left the system idle for 10 minutes. I then ran echo 3 > /proc/sys/
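For anyone wanting to reproduce the data collection, the following is a minimal sketch of how such snapshots could be taken. It is not the script used for the attached files; the names `snapshot` and `collect`, the output directory, and the 600-second sampling interval are assumptions for illustration. Reading /proc/slabinfo normally requires root.

```python
import time
from pathlib import Path

# The four files named in the report.
PROC_FILES = ("slabinfo", "meminfo", "vmallocinfo", "cgroups")


def snapshot(elapsed_seconds, out_dir, proc_dir="/proc"):
    """Copy each file to out_dir as <name>.<elapsed_seconds>; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name in PROC_FILES:
        dest = out / f"{name}.{elapsed_seconds}"
        # /proc files report a size of 0, so read the content explicitly.
        dest.write_bytes((Path(proc_dir) / name).read_bytes())
        written.append(dest)
    return written


def collect(duration=4200, interval=600, out_dir="leak-data"):
    """Take a snapshot every `interval` seconds, from 0 up to `duration`."""
    start = time.monotonic()
    for elapsed in range(0, duration + 1, interval):
        delay = start + elapsed - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        snapshot(elapsed, out_dir)
```

With these defaults, `collect()` would produce files such as `meminfo.0` through `meminfo.4200`, matching the suffix convention described above.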
Note, the data attached is from running on kernel 4.18.7-
*** Problem in linux-image-
The problem cannot be reported:
This report is about a package that is not installed.
So I switched back to 4.15.0-
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-
ProcVersionSign
Uname: Linux 4.15.0-32-generic x86_64
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Sep 13 08:55 seq
crw-rw---- 1 root audio 116, 33 Sep 13 08:55 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.2
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
Date: Thu Sep 13 08:55:46 2018
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd
Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Xen HVM domU
PciMultimedia:
ProcEnviron:
LANG=C.UTF-8
SHELL=/bin/bash
TERM=xterm
PATH=(custom, no user)
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=
RelatedPackageV
linux-
linux-
linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
WifiSyslog:
dmi.bios.date: 08/13/2018
dmi.bios.vendor: Xen
dmi.bios.version: 4.7.5-1.21
dmi.chassis.type: 1
dmi.chassis.vendor: Xen
dmi.modalias: dmi:bvnXen:
dmi.product.name: HVM domU
dmi.product.
dmi.sys.vendor: Xen
tags: | added: kernel-bug-reported-upstream |
Changed in linux (Ubuntu Bionic): | |
status: | New → Triaged |
importance: | Undecided → Medium |
Changed in linux (Ubuntu): | |
importance: | Medium → High |
Changed in linux (Ubuntu Bionic): | |
importance: | Medium → High |
tags: | added: kernel-key removed: kernel-da-key |
tags: | added: kernel-da-key removed: kernel-key |
Changed in linux-azure (Ubuntu): | |
status: | New → Triaged |
Changed in linux-azure (Ubuntu Bionic): | |
status: | New → Triaged |
Changed in linux-azure (Ubuntu Cosmic): | |
status: | New → Triaged |
no longer affects: | linux-azure (Ubuntu Cosmic) |
no longer affects: | linux-azure (Ubuntu Bionic) |
Changed in linux-azure (Ubuntu): | |
importance: | Undecided → High |
tags: | added: cscc |