[Hyper-V] 16.04 kexec-tools doesn't match linux-azure
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
crash (Ubuntu) | Fix Committed | High | Marcelo Cerri |
Xenial | Fix Released | High | Marcelo Cerri |
kexec-tools (Ubuntu) | Fix Released | High | Marcelo Cerri |
Xenial | Fix Released | High | Marcelo Cerri |
linux-azure (Ubuntu) | Fix Committed | High | Marcelo Cerri |
Xenial | Fix Released | High | Marcelo Cerri |
Bug Description
[Impact]
Currently it's not possible to use the kdump functionality in xenial when running the linux-azure kernel. The problem is actually caused by several factors:
1. kexec fails to parse /proc/kcore and thus fails to load the crash kernel. That's similar to bug #1713940 and it's related to 4.10+ kernels.
2. When the crash kernel boots, a bug in KASLR causes it to crash at a very early stage. To the user, it looks as if the system simply rebooted after the crash.
3. Currently in azure, crashkernel=128M is not enough to boot and run the dump procedure with 4.11+ kernels. That value needs to be increased in order for kdump to succeed.
4. After the vmcore is dumped, the current version of crash in xenial is not able to parse it. All the necessary fixes are already upstream and can be backported.
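As a rough aid for item 3, the sketch below works out how much memory a given `crashkernel=` parameter requests, so the value can be compared against what the crash kernel needs. It is only an illustration: the function names are hypothetical, it handles just the `size[KMG][@offset]` and `start-[end]:size,...` forms, and it assumes the last `crashkernel=` on the command line is the one that takes effect.

```python
import re

# Multipliers for the K/M/G suffixes accepted in crashkernel= sizes.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '128M' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def crashkernel_bytes(cmdline, ram_bytes):
    """Return the reservation requested by crashkernel=, or None if absent.

    ram_bytes is the total system memory, used to pick the matching
    entry when the range form (e.g. '384M-:128M') is given.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            # Assumption: the last occurrence wins.
            value = token[len("crashkernel="):]
    if value is None:
        return None

    value = value.split("@", 1)[0]  # drop the optional @offset
    if ":" not in value:
        return parse_size(value)    # simple form: crashkernel=128M

    for entry in value.split(","):
        mem_range, size = entry.split(":", 1)
        start, _, end = mem_range.partition("-")
        low = parse_size(start)
        high = parse_size(end) if end else None  # open-ended range
        if ram_bytes >= low and (high is None or ram_bytes < high):
            return parse_size(size)
    return 0  # no range matched, so nothing is reserved
```

On a running system the first argument would typically be the contents of `/proc/cmdline`; the total memory has to be supplied by the caller.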
[Test Case]
1. Install the linux-azure kernel in an azure instance (although it's possible to run linux-azure on bare metal or in kvm, the KASLR issue is only triggered in azure).
2. Follow the instructions in https:/
The vmcore must be generated and it should be possible to inspect it using crash.
3. Perform these same tests for the linux-generic kernel, on each supported architecture.
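The two observable outcomes of the test case (the crash kernel is loaded before the crash, and a vmcore exists afterwards) can be checked with a small helper along these lines. This is a sketch under assumptions: the function names are hypothetical, `/sys/kernel/kexec_crash_loaded` is the kernel's indicator for a loaded crash kernel, and the `/var/crash` location and `dump.` file prefix are assumed to match what kdump-tools writes on the system under test.

```python
import os


def crash_kernel_loaded(path="/sys/kernel/kexec_crash_loaded"):
    """Return True if the kernel reports that a crash kernel is loaded."""
    try:
        with open(path) as handle:
            return handle.read().strip() == "1"
    except OSError:
        # File missing or unreadable: treat as "not loaded".
        return False


def find_dumps(crash_dir="/var/crash"):
    """Return the sorted paths of files named dump.* under crash_dir."""
    dumps = []
    for root, _dirs, files in os.walk(crash_dir):
        for name in files:
            # Assumption: dump files are named with a 'dump.' prefix.
            if name.startswith("dump."):
                dumps.append(os.path.join(root, name))
    return sorted(dumps)
```

Before triggering the crash, `crash_kernel_loaded()` should return `True`; after the reboot, `find_dumps()` should list at least one file, which can then be opened with crash.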
[Regression Potential]
Since both kexec-tools and crash are being changed to support 4.10+ kernels, it's very important that they continue to handle 4.4 kernels properly.
The same steps above can be used to test linux-generic for regressions.
[Other Info]
Original description:
--8<--
Because the linux-azure kernel is based on 4.11, kexec on 16.04 gives the following error:
kdump-tools[1436]: ELF core (kcore) parse failed
Perhaps the artful kexec-tools should be backported?
--8<--
Changed in linux-azure (Ubuntu): | |
status: | New → Confirmed |
Changed in linux-azure (Ubuntu): | |
importance: | Undecided → High |
assignee: | nobody → Marcelo Cerri (mhcerri) |
Changed in linux-azure (Ubuntu): | |
status: | Confirmed → In Progress |
Changed in linux-azure (Ubuntu Xenial): | |
assignee: | nobody → Marcelo Cerri (mhcerri) |
status: | New → In Progress |
importance: | Undecided → High |
Changed in kexec-tools (Ubuntu): | |
status: | New → Confirmed |
Changed in kexec-tools (Ubuntu Xenial): | |
status: | New → Confirmed |
Changed in makedumpfile (Ubuntu Xenial): | |
status: | New → Confirmed |
Changed in makedumpfile (Ubuntu): | |
status: | New → Confirmed |
no longer affects: | makedumpfile (Ubuntu) |
no longer affects: | makedumpfile (Ubuntu Xenial) |
Changed in kexec-tools (Ubuntu Xenial): | |
assignee: | nobody → Marcelo Cerri (mhcerri) |
Changed in crash (Ubuntu): | |
status: | New → Confirmed |
importance: | Undecided → High |
assignee: | nobody → Marcelo Cerri (mhcerri) |
Changed in crash (Ubuntu): | |
status: | Confirmed → In Progress |
description: | updated |
Changed in kexec-tools (Ubuntu): | |
status: | In Progress → Fix Committed |
Changed in linux-azure (Ubuntu Xenial): | |
status: | In Progress → Fix Committed |
Changed in linux-azure (Ubuntu): | |
status: | In Progress → Fix Committed |
Changed in crash (Ubuntu): | |
status: | In Progress → Fix Released |
Changed in kexec-tools (Ubuntu): | |
status: | Fix Committed → Fix Released |
tags: | added: verification-done verification-done-xenial removed: verification-needed verification-needed-xenial |
Changed in crash (Ubuntu): | |
status: | Fix Released → Fix Committed |
Changed in crash (Ubuntu Xenial): | |
importance: | Undecided → High |
assignee: | nobody → Marcelo Cerri (mhcerri) |
Porting the artful kexec-tools to xenial fixes the kcore parse failure and doesn't cause any regressions when used with the regular xenial kernel.
However, although "kdump-config load" no longer fails, the dump is still not generated when using linux-azure. The artful kexec-tools also requires kdump-tools to be ported from artful. To keep the same kdump-tools version, I instead backported the necessary fixes into the xenial kexec-tools. The result was the same as with the artful kexec-tools/kdump-tools: the dump wasn't generated when using linux-azure.
Further investigation is still necessary.