Kernel sometimes panics during early boot if CPU microcode archive prepended to initramfs
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux-hwe (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
As part of my response to the recent Meltdown and Spectre security issues, I've started deploying the intel-microcode package (initially, version 3.20170707.
This has caused machine boots to sometimes fail, though the behaviour does not appear deterministic.
The error reported by the kernel is:
initramfs unpacking failed: junk in compressed archive
This then immediately leads to the kernel panicking, as the initramfs is needed for mounting the local root filesystem.
(Fortunately, I have set the panic=300 kernel command-line option, so physical machines that panic in this way will auto-reboot after 5 minutes, and can thus be rescued from afar via network boot.)
I've seen these failures on two different varieties of desktop (one HP/Compaq, one Dell), and also on VMs hosted by VMware. I believe that this problem is a non-deterministic race condition during the machine's early boot sequence (probably in the kernel), as the same machine with the same disk contents can exhibit either working or failing behaviour on subsequent boot attempts.
Unfortunately, this particular error message appears in three different places in init/initramfs.c, so it's not precisely clear what specific problem is occurring.
This problem has been difficult to reproduce reliably. Machines that are affected by this issue typically present it on most boot attempts, but this cannot be relied on.
Attempting to gather more information from the kernel via the 'debug' command-line option produces more data, but this is difficult to capture. Attempting to also add "console=ttyS0" on a VM that was reliably presenting this problem caused the error to stop triggering, presumably due to changed timing.
The intel-microcode package works by prepending a prepared initramfs image with a CPIO archive that contains microcode files, with predictable names, for early application by the kernel.
Removing the intel-microcode package, and thus regenerating initramfs files without any CPIO archive prepended to them, appears to prevent this issue from triggering. My suspicion is that the kernel is failing to handle this compound archive structure in a reliable way.
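To help check whether the initramfs file on disk is itself well-formed, the compound layout described above can be inspected from userspace. The following is only a minimal sketch, assuming the leading archive is an uncompressed CPIO in the "newc" format (magic 070701) and that the main image follows it after optional NUL padding. The function names and the table of compression magics are my own illustration and are not taken from the kernel or from the intel-microcode package.

```python
import sys

HEADER_LEN = 110  # newc header: 6-byte magic + 13 fields of 8 hex digits

# Leading bytes of formats that may follow the microcode archive (illustrative list).
MAGICS = {
    b"\x1f\x8b": "gzip",
    b"\xfd7zXZ\x00": "xz",
    b"BZh": "bzip2",
    b"\x02\x21\x4c\x18": "lz4 (legacy)",
    b"\x28\xb5\x2f\xfd": "zstd",
    b"070701": "cpio (newc, uncompressed)",
}


def align4(n):
    """Round n up to the next multiple of 4, as the newc format requires."""
    return (n + 3) & ~3


def walk_leading_cpio(data):
    """Walk the uncompressed newc archive at offset 0.

    Returns (member_names, offset_just_past_the_trailer_entry).
    Raises ValueError if a header is missing or malformed.
    """
    names = []
    pos = 0
    while True:
        header = data[pos:pos + HEADER_LEN]
        if len(header) < HEADER_LEN or header[:6] not in (b"070701", b"070702"):
            raise ValueError(f"no newc header at offset {pos}")
        fields = [int(header[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        filesize, namesize = fields[6], fields[11]
        name_start = pos + HEADER_LEN
        # namesize counts the terminating NUL, which is dropped here.
        name = data[name_start:name_start + namesize - 1].decode("ascii", "replace")
        data_start = align4(name_start + namesize)
        pos = align4(data_start + filesize)
        if name == "TRAILER!!!":
            return names, pos
        names.append(name)


def describe_next_segment(data, offset):
    """Skip NUL padding from offset and identify what starts there."""
    while offset < len(data) and data[offset] == 0:
        offset += 1
    if offset >= len(data):
        return offset, "end of file"
    for magic, label in MAGICS.items():
        if data.startswith(magic, offset):
            return offset, label
    return offset, "unknown"


if __name__ == "__main__":
    # Usage: python3 this_script.py <path to an initramfs image>
    with open(sys.argv[1], "rb") as f:
        blob = f.read()
    members, end = walk_leading_cpio(blob)
    print("leading cpio members:", members)
    print("next segment: offset %d, type %s" % describe_next_segment(blob, end))
```

If this walk succeeds on the on-disk file and the second segment starts with a recognised magic, the file as stored is at least structurally plausible, which would point towards the loading or unpacking stage rather than towards the generated image. It does not verify the compressed payload itself.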
However, it's conceivable that this problem is not in the Linux kernel, but in the GRUB2 bootloader in use on these machines. As I understand things, it is the responsibility of the GRUB2 bootloader to read the kernel and initramfs files from disk, and execute them both together. It's thus conceivable that the defect does not lie in the kernel, but that the GRUB2 bootloader is instead failing to reliably parse the btrfs root filesystem data-structures, and thus the kernel is correctly rejecting an invalid initramfs payload being passed to it.
However, given I've been successfully using GRUB2 and btrfs in this way without issue for some years with a variety of kernels and initramfs configurations, this strikes me as being less likely.
I have no reason to believe that this issue is limited to this (major) version of the kernel.
This exact same issue has happened to me as well. I attempted to run a live CD of Ubuntu 19.04 on an HP g72 notebook PC and came across this exact error.