Ubuntu 22.04 kernel 5.15.0-46-generic leaks kernel memory in kmalloc-2k slabs

Bug #1987430 reported by Chris Siebenmann
This bug affects 1 person
Affects         Status        Importance  Assigned to      Milestone
linux (Ubuntu)  In Progress   Medium      Matthew Ruffell
Jammy           Fix Released  Medium      Matthew Ruffell
Kinetic         Fix Released  Medium      Matthew Ruffell

Bug Description

Since updating to kernel 5.15.0-46-generic (package version 5.15.0-46.49), all of our Ubuntu 22.04 LTS servers are leaking kernel memory; our first server with 8 GB of RAM just fatally OOMed, causing us to detect this. Inspection of OOM reports, /proc/meminfo, and /proc/slabinfo says that it's mostly going to unreclaimable kmalloc-2k slabs:

        Aug 23 12:51:11 cluster kernel: [361299.864757] Unreclaimable slab info:
        Aug 23 12:51:11 cluster kernel: [361299.864757] Name Used Total
        [...]
        Aug 23 12:51:11 cluster kernel: [361299.864924] kmalloc-2k 6676584KB 6676596KB

Most of our machines appear to be leaking slab memory at a rate of around 20 to 40 Mbytes/hour, with some machines leaking much faster; the champions are leaking kernel memory at 145 Mbytes/hour and 237 Mbytes/hour.
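
As a rough cross-check of these rates, the OOM report quoted above can be turned into an average leak rate. This is only a sketch: it assumes the bracketed kernel timestamp (361299 seconds) approximates the time since boot and that the kmalloc-2k cache started out near zero. The function name is illustrative.

```python
def average_leak_rate_mib_per_hour(slab_kb: int, uptime_seconds: float) -> float:
    """Average slab growth in MiB/hour, assuming the cache started near zero."""
    hours = uptime_seconds / 3600.0
    return (slab_kb / 1024.0) / hours


# Figures taken from the OOM report quoted above:
# kmalloc-2k used 6676584KB at kernel timestamp 361299.864757.
rate = average_leak_rate_mib_per_hour(6676584, 361299.864757)
print(f"{rate:.1f} MiB/hour")
```

Under those assumptions the machine that OOMed averaged roughly 65 Mbytes/hour over about 100 hours of uptime.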

We aren't running any proprietary kernel modules and our only unusual kernel configuration is that we've disabled AppArmor with 'apparmor=0' on the kernel command line.

/proc/version_signature:
Ubuntu 5.15.0-46.49-generic 5.15.39

Full kernel command line from the Dell R240 system that fatally OOMed:
BOOT_IMAGE=/boot/vmlinuz-5.15.0-46-generic root=UUID=3165564f-a2dd-4b39-935b-114f3e23ff54 ro console=ttyS0,115200 console=tty0 apparmor=0

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1987430

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Chris Siebenmann (cks) wrote :

This is happening on a server and I cannot persuade it to run apport-collect in any useful way; it fails both with and without forwarded X (where lynx is started but apparently is incompatible with Launchpad for authorizing machines). I can attach specific information if necessary.

As an update, we have some systems without the kernel update but with our recent change to explicitly disable AppArmor on the kernel command line, and they also seem to show the symptoms of this problem. So this issue may be a general one with disabling AppArmor in the Ubuntu kernel.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Chris Siebenmann (cks) wrote :

I would be happy to attach logs or other information from this system or any of our other systems, many of which are still being affected by this bug until we reboot them.

Matthew Ruffell (mruffell) wrote :

Hi Chris,

I stood up a Jammy VM, and tried with and without apparmor=0 on the kernel command line, for a few hours at a time.

The /proc/slabinfo for kmalloc-2k and /proc/meminfo for unreclaimable slabs were stable and did not grow, so it must be related to your workload.

What sort of workloads are running on your servers?

If you stand up a fresh VM and minimally configure it, can you still see the kernel memory leak? Or do you need your workload to provoke it?

If you do have a set of commands to reproduce the issue, please list them here.

Thanks,
Matthew

Chris Siebenmann (cks) wrote :

We've seen this on a wide variety of workloads, including general user logins with NFS mounts, SLURM head and cluster nodes, a Prometheus/Grafana server, a Grafana Loki server, two Exim servers, a Samba server, LDAP servers, Matlab license servers, and a monitoring machine that just runs conserver. It seems to be correlated with the number of processes and the amount of activity on a machine, as the two machines that leaked the most are our primary general use login server and our Prometheus server (which is constantly running a churn of monitoring and probe activity). As a result, I don't currently have any particular commands that reproduce this.

It may be relevant that we are auditing some system calls. The generated /etc/audit/audit.rules on our servers has:
-D
-b 8192
-f 1
--backlog_wait_time 60000
-a exit,always -F arch=b64 -S execve
-a exit,always -F arch=b32 -S execve

We also have audit logs go only to files, by masking systemd-journald-audit.socket.

I will see if I can reproduce this in a VM by generating random activity (I'm going to try repeatedly compiling something over and over), first in our standard configuration and then in a more minimal one. It will likely take at least a day or two to know one way or another.

Chris Siebenmann (cks) wrote :

It appears that the combination of our audit rules being enabled and apparmor=0 is what triggers the leak. My test case is repeatedly compiling Go from source and running its self tests (cloning https://go.googlesource.com/go, then 'cd go/src; while true; do ./all.bash; done'). On a VM configured as one of our machines (with audit rules), this leaks visibly. On a basically stock 22.04 VM (and thus with no audit rules), this doesn't leak. If I disable auditd on our configuration, it stops leaking. On the stock configuration, if I install auditd and our rules, and enable auditd (and reboot), it immediately starts leaking with rapid growth in kmalloc-2k. Taken from slabtop on the stock VM + auditd:

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
 81360  81355  99%    2.00K   5085       16    162720K kmalloc-2k
 44010  39386  89%    1.15K   1630       27     52160K ext4_inode_cache
[...]

This is on a VM that's been up only 24 minutes so far; this is the top slab entry, far ahead of the second placed one I've also shown. And just in the process of writing this comment, it's grown to 206080K.
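
To track this kind of growth without reading slabtop by eye, its rows can be parsed. The following is a minimal sketch, assuming the eight-column layout shown above and a K suffix on the cache size; the SlabRow type and parse_slabtop_row function are illustrative names, not part of any tool.

```python
from typing import NamedTuple


class SlabRow(NamedTuple):
    objs: int
    active: int
    use_pct: int
    obj_size: str
    slabs: int
    obj_per_slab: int
    cache_size_kb: int
    name: str


def parse_slabtop_row(line: str) -> SlabRow:
    """Parse one slabtop data row (assumes 8 whitespace-separated fields)."""
    fields = line.split()
    if len(fields) != 8 or not fields[6].endswith("K"):
        raise ValueError(f"unexpected slabtop row: {line!r}")
    return SlabRow(
        objs=int(fields[0]),
        active=int(fields[1]),
        use_pct=int(fields[2].rstrip("%")),
        obj_size=fields[3],
        slabs=int(fields[4]),
        obj_per_slab=int(fields[5]),
        cache_size_kb=int(fields[6][:-1]),
        name=fields[7],
    )


# Row copied from the slabtop output above.
row = parse_slabtop_row(" 81360 81355 99% 2.00K 5085 16 162720K kmalloc-2k")

# Growth to the later reading of 206080K mentioned in this comment.
growth_kb = 206080 - row.cache_size_kb
print(f"{row.name} grew by {growth_kb}K ({growth_kb / 1024:.1f} MiB)")
```

As a sanity check on the parse, slabs times objects per slab (5085 x 16) equals the reported object count of 81360.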

Stopping auditd on the our-setup VM seems to stop further kmalloc-2k slab growth but doesn't reduce the current size.

Chris Siebenmann (cks) wrote :

I booted the "stock" VM with slub_nomerge and after two hours of uptime (and constant Go compilation and testing), the top counts in slabtop for active objects and memory use are:

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
536896 536869  99%    2.00K  33556       16   1073792K kmalloc-2k
536350 536350 100%    0.02K   3155      170     12620K audit_buffer
536320 536320 100%    0.25K  33520       16    134080K skbuff_head_cache
489099 489099 100%    0.10K  12541       39     50164K buffer_head
 78057  75309  96%    0.19K   3717       21     14868K dentry
 65110  59084  90%    0.02K    383      170      1532K lsm_inode_cache

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
537104 537097  99%    2.00K  33569       16   1074208K kmalloc-2k
536528 536528 100%    0.25K  33533       16    134132K skbuff_head_cache
465348 397547  85%    0.10K  11932       39     47728K buffer_head
 36936  36471  98%    1.15K   1368       27     43776K ext4_inode_cache
 76839  56841  73%    0.19K   3659       21     14636K dentry
 20988  19716  93%    0.62K   1749       12     13992K inode_cache
536520 536520 100%    0.02K   3156      170     12624K audit_buffer

It seems suggestive that the top three by count have almost the same count (and it's a large one).
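
The "almost the same count" observation can be checked mechanically. This is a small sketch using the ACTIVE column of the first listing above; the function name and the 1% tolerance are arbitrary choices made for illustration.

```python
def caches_with_matching_counts(
    active_objects: dict[str, int], reference: str, tolerance: float = 0.01
) -> list[str]:
    """Return caches whose active-object count is within `tolerance` of the reference cache."""
    ref = active_objects[reference]
    return [
        name
        for name, count in active_objects.items()
        if name != reference and abs(count - ref) <= tolerance * ref
    ]


# ACTIVE column from the first slabtop listing above.
active = {
    "kmalloc-2k": 536869,
    "audit_buffer": 536350,
    "skbuff_head_cache": 536320,
    "buffer_head": 489099,
    "dentry": 75309,
    "lsm_inode_cache": 59084,
}

matches = caches_with_matching_counts(active, "kmalloc-2k")
print(matches)  # ['audit_buffer', 'skbuff_head_cache']
```

Counts that track each other this closely would be consistent with one object in each of the three caches being leaked per event, although the numbers alone do not prove it.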

Matthew Ruffell (mruffell) wrote :

Hi Chris,

I have some good news to share. Thanks to your detailed comments, I can reproduce the issue easily in my lab. Here is my reproducer:

Start a fresh VM, either Jammy or Kinetic; it just needs to use the Ubuntu -generic kernel.

1. Edit /etc/default/grub and append apparmor=0 to GRUB_CMDLINE_LINUX_DEFAULT
2. sudo update-grub
3. sudo apt update
4. sudo apt install auditd
5. Append the following to /etc/audit/rules.d/audit.rules:
-a exit,always -F arch=b64 -S execve
-a exit,always -F arch=b32 -S execve
6. sudo reboot
7. sudo apt install stress-ng
8. stress-ng --exec $(nproc)
9. Check the following for memory leaks:
$ watch "sudo cat /proc/meminfo | grep SUnreclaim"
$ watch "sudo cat /proc/slabinfo | grep kmalloc-2k"
$ sudo slabtop

At this point SUnreclaim will grow rapidly, at a rate of 3 MB or so per second. If you leave it for a few minutes, it will consume hundreds of megabytes.
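
The growth rate can also be measured rather than eyeballed from watch. The following is a minimal sketch that reads the SUnreclaim line of /proc/meminfo twice and reports the difference. The function names and the 10 second interval are illustrative, and the sample text uses made-up values only to exercise the parser.

```python
import re
import time


def parse_sunreclaim_kb(meminfo_text: str) -> int:
    """Extract the SUnreclaim value (in kB) from /proc/meminfo contents."""
    match = re.search(r"^SUnreclaim:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("SUnreclaim line not found")
    return int(match.group(1))


def growth_mb_per_second(before_kb: int, after_kb: int, elapsed_seconds: float) -> float:
    """Convert two kB readings into a growth rate in MB/s."""
    return (after_kb - before_kb) / 1024.0 / elapsed_seconds


def sample_growth(interval_seconds: float = 10.0) -> float:
    """Read /proc/meminfo twice, interval_seconds apart (Linux only)."""
    with open("/proc/meminfo") as f:
        before = parse_sunreclaim_kb(f.read())
    time.sleep(interval_seconds)
    with open("/proc/meminfo") as f:
        after = parse_sunreclaim_kb(f.read())
    return growth_mb_per_second(before, after, interval_seconds)


# Made-up sample text, used only to show the parser working.
sample_text = "Slab:             500000 kB\nSReclaimable:     100000 kB\nSUnreclaim:       400000 kB\n"
print(parse_sunreclaim_kb(sample_text))
```

On a machine running the reproducer, calling sample_growth() while stress-ng runs should report a steadily positive rate if the leak is present.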

I have been doing some testing, and the Jammy 5.15.0-46-generic and Kinetic 5.19.0-15-generic kernels are affected.

I tried 5.15.0-25-generic as well, and it had the same issue.

I tried mainline 5.15 and 5.19 from the Ubuntu mainline repo, but they did not reproduce the issue at all.

Interesting. It currently looks like a custom Ubuntu SAUCE patch to either apparmor or audit is causing the memory leak. I'm going to start investigating this more deeply.

For now, I think that you should run with apparmor=1 on the kernel command line as a workaround while we root cause and get this fixed.

I'll keep you updated on what I find.

Thanks,
Matthew

Changed in linux (Ubuntu Jammy):
status: New → In Progress
Changed in linux (Ubuntu Kinetic):
status: Confirmed → In Progress
Changed in linux (Ubuntu Jammy):
importance: Undecided → Medium
Changed in linux (Ubuntu Kinetic):
importance: Undecided → Medium
Changed in linux (Ubuntu Jammy):
assignee: nobody → Matthew Ruffell (mruffell)
Changed in linux (Ubuntu Kinetic):
assignee: nobody → Matthew Ruffell (mruffell)
tags: added: jammy kinetic seg
JianlinLv (jianlinlv) wrote :

hi Matthew,
Any update about this issue?
This issue also happened with Ubuntu 22.04 kernel 5.15.0-26.
Does the latest tag Ubuntu-5.15.0-66.73 fix this issue?

JianlinLv (jianlinlv) wrote :

Attached is a patch to fix this issue.

tags: added: patch
Matthew Ruffell (mruffell) wrote :

Hi JianlinLv,

I tried the latest 5.15.0-60-generic kernel from -updates, and I couldn't reproduce the issue anymore.

Can you try 5.15.0-60-generic and let me know?

I bisected it down to 5.15.0-52-generic being broken, and it being fixed in 5.15.0-53-generic.

Still trying to find the commit which fixed the issue.

I had a read of your patch, and it fixes a part of the AppArmor LSM stacking patchset that Ubuntu carries out of tree. The upstream code has changed a bit from what is found in the 5.15.0-x-generic kernel in Jammy. https://<email address hidden>/

Let me know how 5.15.0-60-generic goes.

Thanks,
Matthew

JianlinLv (jianlinlv) wrote :

hi Matthew,
I did some investigation on 5.15.0-66-generic. There, lsm_multiple_contexts() returns 0, which makes audit_log_lsm() return before allocating any memory, thus avoiding the memory leak.

audit_log_lsm()
  -> if (!lsm_multiple_contexts())
         return;

I haven't found out which commit changed the behavior of lsm_multiple_contexts().

Jianlin

Jacob Martin (jacobmartin) wrote :

I am able to reproduce this issue on 5.15.0-52-generic. However, it seems to be hidden in 5.15.0-53-generic by this commit:

39cce16cfeed UBUNTU: SAUCE: LSM: Change Landlock from LSMBLOB_NEEDED to LSMBLOB_NOT_NEEDED

Applying this commit on its own on top of 5.15.0-52-generic stops the memory leak in the test case described by Matthew in #8. This is coincidental, since now with apparmor=0 no lsmblob slots are assigned. Thus as JianlinLv mentions in #12, lsm_multiple_contexts() will return false, and audit_log_lsm() will exit before any memory is allocated.

Before this commit, landlock was assigned 3 lsmblob slots that did not use the task_getsecid_obj hook (from dmesg with lsm.debug=1):
[ 0.155733] LSM: landlock assigned lsmblob slot 0
[ 0.155733] LSM: landlock assigned lsmblob slot 1
[ 0.155733] LSM: landlock assigned lsmblob slot 2

Thus, before 5.15.0-53, lsm_multiple_contexts() would return true and there would be no early exit before memory allocation. With apparmor disabled, the only LSM modules registered to use lsmblob slots would be ones that did not implement the task_getsecid_subj hook, so the localblob variable would not get set by anyone. Hence, there would be this other early exit (post-allocation) in audit_log_lsm()...

    if (blob == NULL) {
        security_task_getsecid_subj(current, &localblob);
        if (!lsmblob_is_set(&localblob))
            return;
        ...
    }

... which is one of the two locations addressed by the patch.
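
The shape of this bug can be illustrated outside the kernel. The following is a toy Python model, not kernel code: ToyAllocator, log_lsm_leaky and log_lsm_fixed are hypothetical names, and a counter stands in for the kernel allocator. It shows how an early return placed after an allocation leaves one object outstanding per call, and one way of avoiding that; the actual patch may be structured differently.

```python
class ToyAllocator:
    """Stand-in for a slab cache: only counts outstanding objects."""

    def __init__(self) -> None:
        self.outstanding = 0

    def alloc(self) -> object:
        self.outstanding += 1
        return object()

    def free(self, buf: object) -> None:
        self.outstanding -= 1


def log_lsm_leaky(allocator: ToyAllocator, blob_is_set: bool) -> None:
    buf = allocator.alloc()
    if not blob_is_set:
        return  # early exit after allocation: buf is never released
    # ... the record would be formatted and emitted here ...
    allocator.free(buf)


def log_lsm_fixed(allocator: ToyAllocator, blob_is_set: bool) -> None:
    buf = allocator.alloc()
    if not blob_is_set:
        allocator.free(buf)  # release before the early exit
        return
    # ... the record would be formatted and emitted here ...
    allocator.free(buf)


leaky, fixed = ToyAllocator(), ToyAllocator()
for _ in range(1000):
    # Models the case where no LSM sets the blob on any call.
    log_lsm_leaky(leaky, blob_is_set=False)
    log_lsm_fixed(fixed, blob_is_set=False)

print(leaky.outstanding, fixed.outstanding)  # 1000 0
```

In the toy model every call on the leaky path strands one buffer, which mirrors the steady, activity-proportional growth reported earlier in this bug.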

The above commit introduced in 5.15.0-53 does not fix the underlying problem, but the underlying problem is resolved by JianlinLv's patch. The patch has received its two ACKs on the SRU mailing list and is pending application.

Stefan Bader (smb)
Changed in linux (Ubuntu Jammy):
status: In Progress → Fix Committed
Jacob Martin (jacobmartin) wrote :

For Kinetic, this was resolved in 5.19.0-17 by the apparmor and LSM stacking patchset described by this LP bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1989983.

Specifically, the underlying issue was fixed in Kinetic by these two commits:

    4bd3e737cd45 Revert "UBUNTU: SAUCE: Audit: Add new record for multiple process LSM attributes"
    dd547c837f7c UBUNTU: SAUCE: lsm stacking v37: Audit: Add record for multiple task security contexts

Together, these commits removed the affected function and replaced it with an implementation that does not have this memory leak.

Changed in linux (Ubuntu Kinetic):
status: In Progress → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.15.0-68.75 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux verification-needed-jammy
Matthew Ruffell (mruffell) wrote :

Performing verification for Jammy.

I started a fresh Jammy VM, and installed 5.15.0-67-generic from -updates.

I appended apparmor=0 to GRUB_CMDLINE_LINUX_DEFAULT, updated grub, and rebooted.

From there, I installed auditd, and stress-ng. I edited /etc/audit/rules.d/audit.rules to include:

-a exit,always -F arch=b64 -S execve
-a exit,always -F arch=b32 -S execve

I then rebooted, and started stress-ng.

I checked the kmalloc-2k slab with:

$ watch "sudo cat /proc/meminfo | grep SUnreclaim"
$ watch "sudo cat /proc/slabinfo | grep kmalloc-2k"
$ sudo slabtop

Now, since this bug has been worked around by:

39cce16cfeed UBUNTU: SAUCE: LSM: Change Landlock from LSMBLOB_NEEDED to LSMBLOB_NOT_NEEDED

in 5.15.0-53-generic, the kmalloc-2k slab did not grow uncontrollably. That is
okay; the leak just isn't reachable with landlock not using the
task_getsecid_obj hook.

This means this verification is more of a smoke test than checking root cause.

I then enabled -proposed, and installed 5.15.0-68-generic, and rebooted.

I again ran stress-ng and checked the kmalloc-2k slab, and all was well, there
was no memory leak.

The core issue was fixed, but again, masked by the previous fix in
5.15.0-53-generic.

There was no smoke, and things ran as expected. Marking verified for Jammy.

tags: added: verification-done-jammy
removed: verification-needed-jammy
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.15.0-69.76

---------------
linux (5.15.0-69.76) jammy; urgency=medium

  * jammy/linux: 5.15.0-69.76 -proposed tracker (LP: #2012092)

  * NFS deathlock with last Kernel 5.4.0-144.161 and 5.15.0-67.74 (LP: #2009325)
    - NFS: Correct timing for assigning access cache timestamp

linux (5.15.0-68.75) jammy; urgency=medium

  * jammy/linux: 5.15.0-68.75 -proposed tracker (LP: #2008349)

  * Packaging resync (LP: #1786013)
    - debian/dkms-versions -- update from kernel-versions (main/2023.02.27)

  * Ubuntu 22.04 kernel 5.15.0-46-generic leaks kernel memory in kmalloc-2k
    slabs (LP: #1987430)
    - SAUCE: audit: fix memory leak of audit_log_lsm()

  * [EGS] Backport intel_idle support for Eagle Stream Ubuntu 22.04 release
    (LP: #2003267)
    - intel_idle: add SPR support
    - intel_idle: add 'preferred_cstates' module argument
    - intel_idle: add core C6 optimization for SPR
    - cpuidle: intel_idle: Drop redundant backslash at line end
    - intel_idle: Fix the 'preferred_cstates' module parameter
    - intel_idle: Fix SPR C6 optimization
    - intel_idle: make SPR C1 and C1E be independent

  * Fix speaker mute hotkey doesn't work on Dell G16 series (LP: #2003161)
    - platform/x86: dell-wmi: Add a keymap for KEY_MUTE in type 0x0010 table

  * Fix the ACPI _CPC not found error from kernel dmesg on some dynamic SSDT
    table loaded firmwares (LP: #2006077)
    - ACPI: bus: Avoid using CPPC if not supported by firmware
    - ACPI: bus: Set CPPC _OSC bits for all and when CPPC_LIB is supported
    - ACPI: CPPC: Only probe for _CPC if CPPC v2 is acked

  * rtcpie in timers from ubuntu_kernel_selftests randomly failing
    (LP: #1814234)
    - SAUCE: selftest: rtcpie: Force passing unreliable subtest

  * Jammy update: v5.15.87 upstream stable release (LP: #2007441)
    - usb: dwc3: qcom: Fix memory leak in dwc3_qcom_interconnect_init
    - cifs: fix oops during encryption
    - nvme-pci: fix doorbell buffer value endianness
    - nvme-pci: fix mempool alloc size
    - nvme-pci: fix page size checks
    - ACPI: resource: do IRQ override on LENOVO IdeaPad
    - ACPI: resource: do IRQ override on XMG Core 15
    - ACPI: resource: do IRQ override on Lenovo 14ALC7
    - block, bfq: fix uaf for bfqq in bfq_exit_icq_bfqq
    - ata: ahci: Fix PCS quirk application for suspend
    - nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition
    - nvmet: don't defer passthrough commands with trivial effects to the
      workqueue
    - fs/ntfs3: Validate BOOT record_size
    - fs/ntfs3: Add overflow check for attribute size
    - fs/ntfs3: Validate data run offset
    - fs/ntfs3: Add null pointer check to attr_load_runs_vcn
    - fs/ntfs3: Fix memory leak on ntfs_fill_super() error path
    - fs/ntfs3: Add null pointer check for inode operations
    - fs/ntfs3: Validate attribute name offset
    - fs/ntfs3: Validate buffer length while parsing index
    - fs/ntfs3: Validate resident attribute name
    - fs/ntfs3: Fix slab-out-of-bounds read in run_unpack
    - soundwire: dmi-quirks: add quirk variant for LAPBC710 NUC15
    - fs/ntfs3: Validate index root when initialize NTFS security
    - fs/ntfs3: Use __G...

Changed in linux (Ubuntu Jammy):
status: Fix Committed → Fix Released
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-azure/5.15.0-1036.43 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

tags: added: kernel-spammed-jammy-linux-azure verification-needed-jammy
removed: verification-done-jammy
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws/5.15.0-1034.38 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

tags: added: kernel-spammed-jammy-linux-aws
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-xilinx-zynqmp/5.15.0-1021.25 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

tags: added: kernel-spammed-jammy-linux-xilinx-zynqmp
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws-5.15/5.15.0-1046.51~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal-linux-aws-5.15' to 'verification-done-focal-linux-aws-5.15'. If the problem still exists, change the tag 'verification-needed-focal-linux-aws-5.15' to 'verification-failed-focal-linux-aws-5.15'.

tags: added: kernel-spammed-focal-linux-aws-5.15-v2 verification-needed-focal-linux-aws-5.15