[Impact]
This bug fixes the root problem reported in bug 1648449, so its description can be mostly reused here:
On an AWS instance that has NVMe drives, the drives fail to initialize and so are unusable by the system. If one of them contains the root filesystem, the instance won't boot.
[Test Case]
Boot an AWS instance with multiple NVMe drives. On an unpatched kernel, all drives except the first will fail to initialize, and errors will appear in the system log (if the system boots at all). With a patched kernel, all NVMe drives are initialized, enumerated, and work properly.
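The check above can be partly automated. The following is a minimal sketch, not part of the original report: it walks sysfs for PCI devices whose class code is 0x010802 (NVMe controller) and lists those that have no driver bound. It assumes that a controller which failed to initialize is left without a bound driver; if that does not hold on a given kernel, the system log remains the authoritative check. The function name and the sysfs_root parameter are made up for illustration.

```python
from pathlib import Path

# PCI class code for an NVMe controller:
# mass storage (0x01), non-volatile memory (0x08), NVMe interface (0x02).
NVME_PCI_CLASS = "0x010802"


def unbound_nvme_controllers(sysfs_root="/sys"):
    """Return PCI addresses of NVMe controllers that have no driver bound.

    Hypothetical helper: sysfs_root is a parameter only so the function
    can be pointed at a fake tree for testing.
    """
    devices_dir = Path(sysfs_root, "bus", "pci", "devices")
    if not devices_dir.is_dir():
        return []

    unbound = []
    for dev in sorted(devices_dir.iterdir()):
        class_file = dev / "class"
        if not class_file.is_file():
            continue
        if class_file.read_text().strip() != NVME_PCI_CLASS:
            continue
        # sysfs exposes a "driver" link only while a driver is bound.
        if not (dev / "driver").exists():
            unbound.append(dev.name)
    return unbound


if __name__ == "__main__":
    missing = unbound_nvme_controllers()
    if missing:
        print("NVMe controllers without a bound driver:", ", ".join(missing))
    else:
        print("All NVMe controllers have a driver bound (or none are present).")
```

On a patched kernel the script would be expected to report no unbound controllers; on an affected kernel it should list every controller after the first, under the assumption stated above.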
[Regression Potential]
Patching the Xen MSI setup function may cause problems with other PCI devices using MSI/MSI-X interrupts on a Xen guest.
[Other Info]
The patch from bug 1648449 was only a workaround that changed the NVMe driver so it would not trigger this Xen bug. However, bug 1626894 contains reports of that patch causing non-Xen systems with NVMe drives to stop working. So the best course is to revert the workaround patch (and its regression fix patch from bug 1651602), restoring the original NVMe driver code, and apply the real Xen patch to fix the problem. That should restore functionality for non-Xen systems and allow Xen systems with multiple NVMe controllers to work.