Slab page exclusion issue on Linux 6.2-rc1

Bug #2038248 reported by Chengen Du
Affects                Status         Importance  Assigned to  Milestone
makedumpfile (Ubuntu)  Fix Released   Undecided   Unassigned
Jammy                  Fix Committed  Medium      Chengen Du
Lunar                  Fix Committed  Medium      Chengen Du

Bug Description

[Impact]

The kernel crash dumps generated by makedumpfile on kernel 6.2
(affects Lunar, and Jammy with the HWE kernel) might not open
in the crash utility, because makedumpfile does not account for kernel changes.

Kernel commit 130d4df57390 ("mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head"), included in Linux 6.2-rc1 and later, moved slab.slabs to the same offset as page.mapping.
As a side effect, makedumpfile with the -d 8 option, which is meant to exclude only user data, also excludes some slab pages, because it reads that field as page.mapping when classifying pages.
When a dump file generated this way is opened, the "crash" utility may fail to start a session with an error such as:

crash: page excluded: kernel virtual address: ffff0000e269d428 type: "xa_node shift"
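
For context, the -d option of makedumpfile takes a bitmask of page types to exclude, and 8 is the bit for user data. The sketch below decodes a dump level into the page types it selects. The function name `decode_dump_level` and the short labels are made up for illustration; the bit values follow the makedumpfile(8) man page.

```shell
#!/bin/sh
# Illustrative helper (not part of makedumpfile): print the page types
# that a given -d dump level excludes.
#   1 = zero pages, 2 = non-private cache, 4 = private cache,
#   8 = user data,  16 = free pages
decode_dump_level() {
    level=$1
    out=""
    [ $((level & 1)) -ne 0 ]  && out="$out zero"
    [ $((level & 2)) -ne 0 ]  && out="$out cache"
    [ $((level & 4)) -ne 0 ]  && out="$out cache-private"
    [ $((level & 8)) -ne 0 ]  && out="$out user"
    [ $((level & 16)) -ne 0 ] && out="$out free"
    # Strip the leading space before printing.
    echo "${out# }"
}
```

`decode_dump_level 8` prints `user`. Any level with that bit set, such as 31 (all five bits), goes through the same user-data exclusion path and can therefore hit this bug.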

[Fix]

An upstream fix is available.
==========
commit 5f17bdd2128998a3eeeb4521d136a192222fadb6
Author: Kazuhito Hagio <email address hidden>
Date: Wed Dec 21 11:06:39 2022 +0900

    [PATCH] Fix wrong exclusion of slab pages on Linux 6.2-rc1
==========

[Test Plan]

1. Install the required packages and then proceed to reboot the machine.
# sudo apt install crash linux-crashdump -y
# reboot

2. To check the status of kdump, use the `kdump-config show` command.
# kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_COREDIR: /var/crash
crashkernel addr: 0x64000000
   /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinuz-6.2.0-33-generic
kdump initrd:
   /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-6.2.0-33-generic
current state: ready to kdump

kexec command:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-6.2.0-33-generic root=UUID=3e72f5d5-870b-4b8e-9a0d-8ba920391379 ro console=tty1 console=ttyS0 reset_devices systemd.unit=kdump-tools-dump.service nr_cpus=1 irqpoll usbcore.nousb" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz

3. To trigger a crash dump forcefully, execute the `echo c | sudo tee /proc/sysrq-trigger` command.
4. Download the kernel .ddeb file, which will be used for analyzing the dump file.
# sudo -i
# cd /var/crash
# pull-lp-ddebs linux-image-unsigned-$(uname -r)
# dpkg-deb -x linux-image-unsigned-$(uname -r)-*.ddeb dbgsym-$(uname -r)
5. Utilize the "crash" utility to parse and analyze the dump file.
# crash dbgsym-$(uname -r)/usr/lib/debug/boot/vmlinux-$(uname -r) XXXX/dump.XXXX
...
please wait... (gathering task table data)
crash: page excluded: kernel virtual address: ffff0000e269d428 type: "xa_node shift"
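
The pass/fail criterion of step 5 can be scripted. This is a minimal sketch, assuming the output of the crash session has been saved to a text file; the function name `crash_log_ok` and the log file are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a saved crash session log does
# not contain the "page excluded" error reported in this bug.
crash_log_ok() {
    log=$1
    if grep -q 'crash: page excluded:' "$log"; then
        echo "FAIL: dump is missing pages that crash needs" >&2
        return 1
    fi
    echo "OK: no page-excluded error in $log"
}
```

With an unfixed makedumpfile the check is expected to fail on the output shown above; with the fix it should pass once crash reaches its prompt.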

[Where problems could occur]

The patch changes how makedumpfile excludes slab pages, to match the structural changes introduced in Linux 6.2-rc1.
This change is required for Linux kernel 6.2.
However, it affects the content of the dump file; in the worst case, the "crash" utility might be unable to parse the resulting dump.

Chengen Du (chengendu)
Changed in makedumpfile (Ubuntu Lunar):
assignee: nobody → Chengen Du (chengendu)
status: New → In Progress
Chengen Du (chengendu) wrote :

debdiff for Lunar

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "lp2038248-makedumpfile-lunar.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Mauricio Faria de Oliveira (mfo) wrote :

Hi Chengen,

Thanks for the detailed SRU template and debdiff!

I have only 2 minor fixes, which I already performed:
- Version: s/ubuntu1/ubuntu0.1/ (see doc [1])
- Maintainer: this is the first 'ubuntu' version, so run `update-maintainer` (see `debian/control` hunk).

The updated debdiff is attached for reference,
and I'll continue the work on sponsoring this.

cheers

[1] https://wiki.ubuntu.com/SecurityTeam/UpdatePreparation#Update_the_packaging

tags: added: se-sponsor-mfo
Mauricio Faria de Oliveira (mfo) wrote :

Hi Chengen,

The fix is included in 1.7.3 in mantic, so only lunar needs the fix.

We probably would like jammy as well, for compatibility with the 6.2+ HWE kernel (without regression to the 5.15 GA kernel).

Could you please check jammy for that too? (I'll add a task as Incomplete.)

 $ git describe --contains 5f17bdd2128998a3eeeb4521d136a192222fadb6
 1.7.3~6

 $ rmadison -a source makedumpfile
  makedumpfile | 1.5.5-2ubuntu1 | trusty | source
  makedumpfile | 1.5.5-2ubuntu1.6 | trusty-updates | source
  makedumpfile | 1:1.5.9-5~ubuntu14.04.1 | trusty-backports | source
  makedumpfile | 1:1.5.9-5 | xenial | source
  makedumpfile | 1:1.6.3-2~16.04.3 | xenial-updates | source
  makedumpfile | 1:1.6.3-2 | bionic | source
  makedumpfile | 1:1.6.5-1ubuntu1~18.04.7 | bionic-updates | source
  makedumpfile | 1:1.6.7-1ubuntu2 | focal | source
  makedumpfile | 1:1.6.7-1ubuntu2.4 | focal-updates | source
  makedumpfile | 1:1.7.0-1build1 | jammy | source
  makedumpfile | 1:1.7.2-1 | lunar | source
  makedumpfile | 1:1.7.3-1 | mantic | source

Packages verified with LXD VM and upstream crash for now (before bug 2038248). All good!

Uploaded to Lunar.

Thanks!

Details:
---

Setup:

 $ lxc launch --vm --config limits.memory=2GiB ubuntu:lunar mdf-l
 $ lxc shell mdf-l

 # apt update && apt install -y linux-image-generic linux-crashdump crash
 # apt remove -y $(dpkg -l | awk '$2 ~ /linux-.*kvm/ { print $2 }')

 # sed 's/crashkernel=[^ "]\+/crashkernel=512M/' -i /etc/default/grub.d/kdump-tools.cfg
 # update-grub
 # reboot
 # kdump-config show | grep state:
 current state: ready to kdump
 # echo c >/proc/sysrq-trigger
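
The sed expression in the setup above replaces whatever crashkernel= value is configured with a fixed 512M reservation. A small sketch of its effect, applied to standard input instead of editing the file in place; the function name and the sample line in the usage note are assumptions, not copied from kdump-tools.cfg.

```shell
#!/bin/sh
# Same substitution as in the setup, but reading from stdin.
# [^ "]\+ matches the old value up to the next space or double quote
# (\+ is a GNU sed extension).
set_crashkernel_512m() {
    sed 's/crashkernel=[^ "]\+/crashkernel=512M/'
}
```

For example, piping a line that ends in `crashkernel=2G-4G:320M,4G-32G:512M"` through `set_crashkernel_512m` leaves `crashkernel=512M"`, with the closing quote intact.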

Debug symbols:

 # wget https://launchpad.net/ubuntu/+archive/primary/+files/linux-image-unsigned-6.2.0-34-generic-dbgsym_6.2.0-34.34_amd64.ddeb
 # ar x linux-image-unsigned-6.2.0-34-generic-dbgsym_6.2.0-34.34_amd64.ddeb data.tar.xz
 # tar xvf data.tar.xz ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic
 ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic

Upstream crash (for now):

 # apt build-dep -y crash
 # git clone https://github.com/crash-utility/crash.git
 # cd crash
 # make

Original package:
---

 # ./crash /var/crash/202310072357/dump.202310072357 ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic
 ...
 please wait... (gathering task table data)
 crash: page excluded: kernel virtual address: ffff9b13c2b826c8 type: "xa_node shift"

Patched package:
---

 # wget https://launchpad.net/~mfo/+archive/ubuntu/test/+build/26759821/+files/makedumpfile_1.7.2-1ubuntu0.1_amd64.deb
 # apt install ./makedumpfile_1.7.2-1ubuntu0.1_amd64.deb

 # kdump-config reload
 # kdump-config show | grep state:
 current state: ready to kdump
 # echo c >/proc/sysrq-trigger

 # ./crash /var/crash/202310080054/dump.202310080054 ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic
 ...
       KERNEL: ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic
     DUMPFILE: /var/crash/202310080054/dump.202310080054 [PARTIAL DUMP]
 ...
 crash>

Changed in makedumpfile (Ubuntu Jammy):
status: New → Incomplete
assignee: nobody → Chengen Du (chengendu)
Changed in makedumpfile (Ubuntu Lunar):
importance: Undecided → Medium
Changed in makedumpfile (Ubuntu Jammy):
importance: Undecided → Medium
Changed in makedumpfile (Ubuntu):
status: New → Fix Released
Mauricio Faria de Oliveira (mfo) wrote :

(The package built correctly in a PPA on all architectures.)

Mauricio Faria de Oliveira (mfo) wrote :

Jammy is also affected for the 6.2 HWE kernel.

The patch for makedumpfile is the same, and applies cleanly.
It fixes the issue with the 6.2 HWE kernel, and causes no regression with the 5.15 GA kernel (ie, the dump file can still be opened in crash).
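
Whether a given machine runs a kernel with the new struct slab layout can be read from its release string. A minimal sketch, assuming the change is present exactly in kernels 6.2 and newer (it does not detect backports); the function name `kernel_needs_fix` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string (as printed
# by `uname -r`) is 6.2 or newer.
kernel_needs_fix() {
    release=$1
    major=${release%%.*}        # "6.2.0-34-generic" -> "6"
    rest=${release#*.}          # -> "2.0-34-generic"
    minor=${rest%%.*}           # -> "2"
    minor=${minor%%[!0-9]*}     # drop suffixes such as "-rc1"
    [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -ge 2 ]; }
}
```

`kernel_needs_fix "$(uname -r)"` succeeds on the 6.2 HWE kernel and fails on the 5.15 GA kernel, matching the two cases tested below.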

Details:
---

Setup:

 $ lxc launch --vm --config limits.memory=2GiB ubuntu:jammy mdf-j
 $ lxc shell mdf-j

 # apt update && apt install -y linux-image-generic-hwe-22.04 linux-crashdump crash
 # apt remove -y $(dpkg -l | awk '$2 ~ /linux-.*kvm/ { print $2 }')

 # sed 's/crashkernel=[^ "]\+/crashkernel=512M/' -i /etc/default/grub.d/kdump-tools.cfg
 # update-grub
 # reboot
 # kdump-config show | grep state:
 current state: ready to kdump
 # echo c >/proc/sysrq-trigger

Debug symbols:

 # wget https://launchpad.net/ubuntu/+archive/primary/+files/linux-image-unsigned-6.2.0-34-generic-dbgsym_6.2.0-34.34~22.04.1_amd64.ddeb
 # ar x linux-image-unsigned-6.2.0-34-generic-dbgsym_6.2.0-34.34~22.04.1_amd64.ddeb data.tar.xz
 # tar xvf data.tar.xz ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic
 ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic

Upstream crash (for now):

 # apt build-dep -y crash
 # git clone https://github.com/crash-utility/crash.git
 # cd crash
 # make

Original package:
---

 # ./crash ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic /var/crash/202310101134/dump.202310101134
 ...
 please wait... (gathering task table data)
 crash: page excluded: kernel virtual address: ffff9b13c2b826c8 type: "xa_node shift"

Patched package:
---

 $ ./crash ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic /var/crash/202310101206/dump.202310101206
 ...
      RELEASE: 6.2.0-34-generic
      VERSION: #34~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Sep 7 13:12:03 UTC 2
 ...
 crash>

Patched package & GA kernel:
---

 (upstream crash and Ubuntu crash, both work)

 $ ./crash ./usr/lib/debug/boot/vmlinux-5.15.0-86-generic /var/crash/202310101225/dump.202310101225
 ...
      RELEASE: 5.15.0-86-generic
      VERSION: #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023
 ...
 crash>

 $ crash ./usr/lib/debug/boot/vmlinux-5.15.0-86-generic /var/crash/202310101225/dump.202310101225
 ...
      RELEASE: 5.15.0-86-generic
      VERSION: #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023
 ...
 crash>

Changed in makedumpfile (Ubuntu Jammy):
status: Incomplete → Confirmed
Mauricio Faria de Oliveira (mfo) wrote :
description: updated
Mauricio Faria de Oliveira (mfo) wrote :

Uploaded to Jammy too.

The SRU template is now updated to reflect that.

Packages verified with LXD VM and upstream crash for the 6.2 kernel for now (before bug 2038248) and Ubuntu crash for the 5.15 GA kernel. All good!
(Details in comment #7.)

The package built correctly in a PPA on all architectures.

Changed in makedumpfile (Ubuntu Jammy):
status: Confirmed → In Progress
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Chengen, or anyone else affected,

Accepted makedumpfile into lunar-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/makedumpfile/1:1.7.2-1ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-lunar to verification-done-lunar. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-lunar. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in makedumpfile (Ubuntu Lunar):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-lunar
Changed in makedumpfile (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed-jammy
Timo Aaltonen (tjaalton) wrote :

Hello Chengen, or anyone else affected,

Accepted makedumpfile into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/makedumpfile/1:1.7.0-1ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Chengen Du (chengendu) wrote :

The package in -proposed has been successfully tested:
1:1.7.0-1ubuntu0.1 in Jammy, 1:1.7.2-1ubuntu0.1 in Lunar
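
To confirm that a machine has a fixed build installed, the package version can be compared against the versions named in this report. A sketch, assuming a Debian or Ubuntu system where dpkg and dpkg-query are available; both function names are hypothetical.

```shell
#!/bin/sh
# First makedumpfile version carrying the fix in each series,
# as stated in this report.
fixed_version_for() {
    case "$1" in
        jammy)  echo "1:1.7.0-1ubuntu0.1" ;;
        lunar)  echo "1:1.7.2-1ubuntu0.1" ;;
        mantic) echo "1:1.7.3-1" ;;
        *)      return 1 ;;
    esac
}

# Succeed if the installed makedumpfile is at least the fixed version.
# Returns 2 for an unknown series or if the package is not installed.
makedumpfile_is_fixed() {
    series=$1
    want=$(fixed_version_for "$series") || return 2
    have=$(dpkg-query -W -f='${Version}' makedumpfile) || return 2
    dpkg --compare-versions "$have" ge "$want"
}
```

A possible invocation is `makedumpfile_is_fixed "$(lsb_release -cs)" && echo fixed`, which relies on lsb_release printing the series codename.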

tags: added: verification-done verification-done-jammy verification-done-lunar
removed: verification-needed verification-needed-jammy verification-needed-lunar