kdump cannot generate coredump file on bluefield with 5.4 kernel

Bug #2021930 reported by Tony Duan
Affects: linux-bluefield (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

After following the instructions at https://ubuntu.com/server/docs/kernel-crash-dump, no coredump file is generated.

The BlueField is running a 5.4 kernel:
 bf2:~$ uname -a
 Linux sw-mtx-008-bf2 5.4.0-1060-bluefield #66-Ubuntu SMP PREEMPT Mon Mar 27 15:52:50 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

The crashkernel parameter is configured and memory is reserved:
 bf2:~$ cat /proc/cmdline
 BOOT_IMAGE=/boot/vmlinuz-5.4.0-1060-bluefield root=UUID=52ddbe2c-ee4f-48d4-b7d4-ab76e264e438 ro console=hvc0 console=ttyAMA0 earlycon=pl011,0x01000000 fixrtc net.ifnames=0 biosdevname=0 iommu.passthrough=1 crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M
 bf2:~$ dmesg | grep -i crash
 [ 0.000000] crashkernel reserved: 0x00000000cfe00000 - 0x00000000efe00000 (512 MB)
 [ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-1060-bluefield root=UUID=52ddbe2c-ee4f-48d4-b7d4-ab76e264e438 ro console=hvc0 console=ttyAMA0 earlycon=pl011,0x01000000 fixrtc net.ifnames=0 biosdevname=0 iommu.passthrough=1 crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M
 [ 8.070921] pstore: Using crash dump compression: deflate
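
For reference, the 512 MB reservation reported by dmesg is consistent with the 4G-32G:512M entry of the crashkernel parameter, and the reserved address range spans exactly 512 MiB. The sketch below illustrates how the range syntax resolves to a size (range start inclusive, range end exclusive). The helper names to_mib, pick_crashkernel and reserved_mib are hypothetical and not part of kdump-tools, and the 16384 MiB RAM figure is an assumed example, since the report does not state how much memory is installed.

```shell
# Sketch only: hypothetical helpers, not part of kdump-tools or the kernel.

# Convert a size such as 320M or 4G to MiB.
to_mib() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# Print the reservation that a "range:size,range:size,..." spec yields
# for a given amount of RAM in MiB; return 1 if no range matches.
pick_crashkernel() {
    local spec ram_mib entry range size lo hi
    spec=$1
    ram_mib=$2
    local IFS=,
    for entry in $spec; do
        range=${entry%%:*}
        size=${entry#*:}
        lo=$(to_mib "${range%%-*}")
        hi=${range#*-}
        if [ -z "$hi" ]; then
            # Open-ended range such as "128G-"
            if [ "$ram_mib" -ge "$lo" ]; then echo "$size"; return 0; fi
        else
            hi=$(to_mib "$hi")
            if [ "$ram_mib" -ge "$lo" ] && [ "$ram_mib" -lt "$hi" ]; then
                echo "$size"; return 0
            fi
        fi
    done
    return 1
}

# Size in MiB of a reserved region given its start and end addresses.
reserved_mib() {
    echo $(( ($2 - $1) / 1048576 ))
}

SPEC='2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M'
pick_crashkernel "$SPEC" 16384                        # assumed 16 GiB of RAM, prints 512M
reserved_mib 0x00000000cfe00000 0x00000000efe00000    # addresses from dmesg, prints 512
```

Any RAM size from 4 GiB up to (but not including) 32 GiB selects the same 512M entry, so the example value only needs to fall in that bracket.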

The kdump-config output is as follows:
 bf2:~$ kdump-config show
 DUMP_MODE: kdump
 USE_KDUMP: 1
 KDUMP_SYSCTL: kernel.panic_on_oops=1
 KDUMP_COREDIR: /var/crash
 crashkernel addr: 0x
  /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinuz-5.4.0-1060-bluefield
 kdump initrd:
  /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-5.4.0-1060-bluefield
 current state: ready to kdump

 kexec command:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-5.4.0-1060-bluefield root=UUID=52ddbe2c-ee4f-48d4-b7d4-ab76e264e438 ro console=hvc0 console=ttyAMA0 earlycon=pl011,0x01000000 fixrtc net.ifnames=0 biosdevname=0 iommu.passthrough=1 reset_devices systemd.unit=kdump-tools-dump.service nr_cpus=1" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz

sysrq:
 bf2:/# cat /proc/sys/kernel/sysrq
 176
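
The value 176 is a bitmask. According to the kernel's SysRq documentation it enables sync (16), remount read-only (32) and reboot/poweroff (128) for keyboard-invoked SysRq. As far as that documentation states, the mask only filters keyboard invocation, so a write to /proc/sysrq-trigger by root, as used in this report, is not blocked by it. A minimal sketch for decoding the mask, where decode_sysrq is a hypothetical helper name:

```shell
# Sketch only: decode a /proc/sys/kernel/sysrq value into the functions
# it enables. decode_sysrq is a hypothetical helper, not a system tool.
decode_sysrq() {
    local mask bit
    mask=$1
    # 0 and 1 are special values rather than bit combinations.
    if [ "$mask" -eq 0 ]; then echo "0: sysrq disabled"; return 0; fi
    if [ "$mask" -eq 1 ]; then echo "1: all functions enabled"; return 0; fi
    for bit in 2 4 8 16 32 64 128 256; do
        [ $(( $mask & $bit )) -ne 0 ] || continue
        case $bit in
            2)   echo "2: control of console logging level" ;;
            4)   echo "4: control of keyboard (SAK, unraw)" ;;
            8)   echo "8: debugging dumps of processes etc." ;;
            16)  echo "16: sync command" ;;
            32)  echo "32: remount read-only" ;;
            64)  echo "64: signalling of processes (term, kill, oom-kill)" ;;
            128) echo "128: reboot/poweroff" ;;
            256) echo "256: nicing of all RT tasks" ;;
        esac
    done
}

decode_sysrq 176
# prints:
# 16: sync command
# 32: remount read-only
# 128: reboot/poweroff
```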

After triggering the crash manually with "echo c > /proc/sysrq-trigger", the system could not come up because of OOM kills. After changing the crashkernel reservation to 1024M, it still hangs.
 With the default 512M, it hangs at "Killed process 674":
  [ 8.718188] systemd-journald[368]: File /var/log/journal/8244d38b2f804fc692f3f2dbf8206f57/system.journal corrupted or uncleanly shut down, renaming and re.
  [ 30.252513] Out of memory: Killed process 651 (systemd-resolve) total-vm:24380kB, anon-rss:3812kB, file-rss:1828kB, shmem-rss:0kB, UID:101 pgtables:80kB o0
  ...
  [ 34.651927] Out of memory: Killed process 674 (dbus-daemon) total-vm:7884kB, anon-rss:552kB, file-rss:1380kB, shmem-rss:0kB, UID:103 pgtables:52kB oom_sco0
 With 1024M, it hangs at the following line:
  [ 8.733323] systemd-journald[369]: File /var/log/journal/8244d38b2f804fc692f3f2dbf8206f57/system.journal corrupted or uncleanly shut down, renaming and re.

After a soft reboot of the BlueField, no coredump file has been generated:
 bf2:~$ ls /var/crash/ -la
 total 52
 drwxrwxrwt 3 root root 4096 May 31 01:43 .
 drwxr-xr-x 14 root root 4096 Apr 30 11:26 ..
 drwxrwxr-x 2 ubuntu ubuntu 4096 May 31 01:43 202305310143
 -rw-r----- 1 root root 34307 May 31 01:18 _usr_share_netplan_netplan.script.0.crash
 -rw-r--r-- 1 root root 0 May 31 03:47 kdump_lock
 -rw-r--r-- 1 root root 358 May 31 03:48 kexec_cmd
 bf2:~$ ls /var/crash/202305310143/ -la
 total 8
 drwxrwxr-x 2 ubuntu ubuntu 4096 May 31 01:43 .
 drwxrwxrwt 3 root root 4096 May 31 01:43 ..

This issue also happens on the 5.4.0-1049-bluefield kernel.
