Kdump triggered manually after cpu offline operation fails to collect dump
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux (Ubuntu) | Fix Released | Medium | Unassigned | |
Utopic | Fix Released | Medium | Chris J Arges | |
Bug Description
SRU Justification:
[Impact]
Kdump triggered manually after cpu offline operation fails to collect dump
[Test Case]
See Steps to Reproduce below.
[Fix]
$ git describe --contains c1caae3de46a072
v3.19-rc3~1^2~1
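For readers unfamiliar with the output format: `git describe --contains` prints the nearest tag that contains the commit, followed by an ancestry suffix built from `~` and `^`. A minimal sketch for pulling out just the tag name follows; the helper name `describe_tag` is made up for illustration and is not part of git.

```shell
# Hypothetical helper: strip the ancestry suffix (everything from the first
# '~' or '^') from a `git describe --contains` result, leaving the tag name.
describe_tag() {
    printf '%s\n' "$1" | sed 's/[~^].*$//'
}
```

Calling `describe_tag 'v3.19-rc3~1^2~1'` prints `v3.19-rc3`, the first upstream tag that carries the fix.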
--
---Problem Description---
Kdump triggered manually after cpu offline operation fails to collect dump
---uname output---
Linux ubuntu 3.18.0-9-generic #10-Ubuntu SMP Mon Jan 12 21:35:28 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = P8
---System Hang---
We have to reboot the LPAR to regain access to the machine.
---Steps to Reproduce---
Install Ubuntu 15.04 on a PowerVM LPAR from the ISO, using a virtual DVD.
Then take one of the machine's CPUs offline.
root@ubuntu:~# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 8
Core(s) per socket: 1
Socket(s): 2
NUMA node(s): 2
Model: IBM,8284-22A
Hypervisor vendor: pHyp
Virtualization type: para
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0-15
NUMA node2 CPU(s):
root@ubuntu:~# chcpu -d 15
CPU 15 disabled
root@ubuntu:~# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-14
Off-line CPU(s) list: 15
Thread(s) per core: 7
Core(s) per socket: 1
Socket(s): 2
NUMA node(s): 2
Model: IBM,8284-22A
Hypervisor vendor: pHyp
Virtualization type: para
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0-14
NUMA node2 CPU(s):
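When scripting this step, the online CPU count can be checked without parsing the full lscpu output. The sketch below counts the CPUs in a kernel-style list such as `0-14`; the function name `count_cpus` is an assumption made for illustration, and the list format is assumed to be comma-separated single numbers or `N-M` ranges.

```shell
# Hypothetical helper: count the CPUs named in a list such as "0-14"
# or "0-3,8-11". An empty list counts as zero CPUs.
count_cpus() {
    list=$1
    total=0
    old_ifs=$IFS
    IFS=,
    # Each comma-separated piece is either a single CPU "N" or a range "N-M".
    for range in $list; do
        case $range in
            *-*) start=${range%-*}; end=${range#*-} ;;
            *)   start=$range; end=$range ;;
        esac
        total=$((total + end - start + 1))
    done
    IFS=$old_ifs
    echo "$total"
}
```

On a live system the list can be taken from sysfs, for example `count_cpus "$(cat /sys/devices/system/cpu/online)"`. For the on-line list `0-14` reported after `chcpu -d 15`, it prints 15.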
Configure and enable kdump on the LPAR.
root@ubuntu:~# /etc/init.
current state : ready to kdump
root@ubuntu:~# kdump-config load
Modified cmdline:
segment[
segment[
segment[
segment[
segment[
segment[
* loaded kdump kernel
root@ubuntu:~#
root@ubuntu:~# kdump-config show
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.
KDUMP_COREDIR: /var/crash
crashkernel addr:
current state: ready to kdump
kexec command:
/sbin/kexec -p --args-linux --command-
root@ubuntu:~# kdump-config status
current state : ready to kdump
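The "ready to kdump" state can also be confirmed directly from the kernel before triggering a crash. The sketch below reads `/sys/kernel/kexec_crash_loaded`, which holds `1` when a crash kernel is loaded. The function name `crash_kernel_loaded` is hypothetical, and the path is a parameter so the check can be tried against an ordinary file.

```shell
# Hypothetical helper: succeed (exit 0) only if a crash kernel is loaded.
# Returns 1 if the state file reads anything other than "1",
# and 2 if the state file cannot be read at all.
crash_kernel_loaded() {
    state_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$state_file" ] || return 2
    [ "$(cat "$state_file")" = "1" ]
}
```

A reproduction script might guard the crash trigger with `crash_kernel_loaded || exit 1`, so that a failed `kdump-config load` is caught before the system is taken down.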
root@ubuntu:~# sysctl -w kernel.sysrq=1
kernel.sysrq = 1
root@ubuntu:~# cat /proc/sys/
1
Trigger the crash manually using sysrq-trigger.
root@ubuntu:~# echo c > /proc/sysrq-trigger
root@ubuntu:~# [ 311.088315] SysRq : Trigger a crash
[ 311.088331] Unable to handle kernel paging request for data at address 0x00000000
[ 311.088336] Faulting instruction address: 0xc0000000005f9094
[ 311.088341] Oops: Kernel access of bad area, sig: 11 [#1]
[ 311.088344] SMP NR_CPUS=2048 NUMA pSeries
[ 311.088349] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_
[ 311.088372] CPU: 14 PID: 1705 Comm: bash Not tainted 3.18.0-9-generic #10-Ubuntu
[ 311.088377] task: c00000027773e470 ti: c0000002782d0000 task.ti: c0000002782d0000
[ 311.088381] NIP: c0000000005f9094 LR: c0000000005fa12c CTR: c0000000005f9060
[ 311.088385] REGS: c0000002782d39d0 TRAP: 0300 Not tainted (3.18.0-9-generic)
[ 311.088389] MSR: 8000000000009033 <SF,EE,
[ 311.088401] CFAR: c0000000000084d8 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
GPR00: c0000000005fa12c c0000002782d3c50 c000000001426890 0000000000000063
GPR04: c000000001b85c28 c000000001b965e0 00000000000000ff c0000000015e71f0
GPR08: c000000000e76890 0000000000000001 0000000000000000 0000000000000001
GPR12: c0000000005f9060 c000000007b37e00 0000000000000000 0000000022000000
GPR16: 000000001016d6e8 0000010000088208 0000000010143eb8 00000000100c9390
GPR20: 0000000000000000 000000001017b008 0000000010143d18 0000000000000000
GPR24: 0000000010156c00 0000000010178868 c0000000013756a8 0000000000000004
GPR28: 0000000000000063 c00000000133f598 c000000001375a68 0000000000000000
[ 311.088459] NIP [c0000000005f9094] sysrq_handle_
[ 311.088463] LR [c0000000005fa12c] __handle_
[ 311.088467] Call Trace:
[ 311.088470] [c0000002782d3c50] [c000000000056604] ht64_call_
[ 311.088476] [c0000002782d3c70] [c0000000005fa12c] __handle_
[ 311.088481] [c0000002782d3d10] [c0000000005fa928] write_sysrq_
[ 311.088488] [c0000002782d3d40] [c000000000345a10] proc_reg_
[ 311.088494] [c0000002782d3d90] [c0000000002b954c] vfs_write+
[ 311.088499] [c0000002782d3de0] [c0000000002ba0ec] SyS_write+
[ 311.088504] [c0000002782d3e30] [c00000000000927c] syscall_
[ 311.088508] Instruction dump:
[ 311.088511] 3842d830 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d22001b 39491cdc
[ 311.088519] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
[ 311.088528] ---[ end trace 8543f2d87847eab7 ]---
[ 311.090822]
[ 311.090851] Sending IPI to other CPUs
[ 311.091870] IPI complete
[ 312.466826] Kernel panic - not syncing: Could not enable big endian exceptions
root@ubuntu:~# which kdump
/sbin/kdump
root@ubuntu:~# dpkg -S /sbin/kdump
kexec-tools: /sbin/kdump
root@ubuntu:~# dpkg --list | grep kexec
ii kexec-tools 1:2.0.7-5ubuntu1 ppc64el tools to support fast kexec reboots
ii pxe-kexec 0.2.4-3 ppc64el Fetch PXE configuration file and netboot using kexec
The fix is available upstream:
https:/
Thanks
Hari
tags: added: architecture-ppc64le bugnameltc-120463 severity-critical targetmilestone-inin1504
affects: ubuntu → linux (Ubuntu)
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: New → In Progress
assignee: nobody → Chris J Arges (arges)
importance: Undecided → Medium
Changed in linux (Ubuntu):
status: New → Triaged
Changed in linux (Ubuntu Utopic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu):
status: Triaged → Fix Committed
Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Vivid will pick up this fix when it rebases to 3.19. SRU'ing the fix for 3.16.