VM using Mellanox VF fails to reboot
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
kunpeng920 | Fix Released | Undecided | Unassigned |
Ubuntu-18.04-hwe | Fix Released | Undecided | Unassigned |
Upstream-kernel | Fix Released | Undecided | Unassigned |
Bug Description
linux-hwe 5.0.0-2.3~18.04.1
1) Instantiate a Mellanox VF, e.g.:
echo 3 | sudo tee /sys/class/
2) Pass the newly instantiated VF into a virtual machine
3) Bring up the interface of the VF in the guest
The guest will be unable to reboot:
[ OK ] Stopped Monitoring of LVM2 mirrors,…sing dmeventd or progress polling.
Stopping LVM2 metadata daemon...
[ OK ] Stopped LVM2 metadata daemon.
[ OK ] Deactivated swap /swapfile.
[ OK ] Reached target Unmount All Filesystems.
[ OK ] Stopped Remount Root and Kernel File Systems.
[ OK ] Reached target Shutdown.
[ OK ] Reached target Final Step.
Starting Reboot...
[ 88.003995] mlx5_core 0000:04:00.0: mlx5_enter_
[ 88.004629] mlx5_core 0000:04:00.0: mlx5_enter_
[ 88.012056] reboot: Restarting system
This issue no longer exists in Ubuntu. Kernel bisection shows that it impacted upstream kernels between v4.20 and v5.3.
Bisection was a little complicated because there are two overlapping issues: the reboot hang, and a separate issue that causes the host Mellanox driver to crash when you pass a VF through to a guest. So I bisected the Mellanox driver crash first, then manually applied that fix while bisecting the reboot hang.
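A rough sketch of how one step of that second bisection could look. It assumes the goal is to find the commit that fixed the hang (hence the custom "broken"/"fixed" terms) and takes the v4.20 and v5.3 endpoints from the summary above. The helper names prepare_step and finish_step are made up for illustration. The cherry-pick may not apply cleanly, or may be unnecessary, on commits that predate the passthrough crash.

```shell
# Hypothetical helpers for bisecting the reboot hang while the
# passthrough crash fix is temporarily applied at every step.
FIX_COMMIT=8af23fad6261   # "iommu/dma: Handle MSI mappings separately"

prepare_step() {
    # Apply the crash fix on top of the commit git bisect checked out,
    # without committing, so the bisection history stays untouched.
    git cherry-pick --no-commit "$FIX_COMMIT"
}

finish_step() {
    # $1 is the verdict for this step: "broken" (guest hangs on reboot)
    # or "fixed" (guest reboots normally).
    git reset --hard      # drop the temporarily applied fix
    git bisect "$1"
}

# Usage, run by hand from a kernel git tree:
#   git bisect start --term-old=broken --term-new=fixed
#   git bisect broken v4.20
#   git bisect fixed v5.3
#   prepare_step   # then build, boot the host, test a guest reboot
#   finish_step broken
```

Resetting the tree before recording the verdict keeps the temporary fix from interfering with the next checkout.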
Here's a chronological set of the relevant commits (annotation will
require a fixed-width font):
v4.19
975bb8b4dc93 PCI/IOV: Use VF0 cached config space size for other VFs ------------------+
v4.20-rc1                                                                              |
b61d271e59d7 iommu/dma: Move domain lookup into __iommu_dma_{map,unmap} --------+      +---- Reboot hangs
76bf6a8634a1 Revert "PCI/IOV: Use VF0 cached config space size for other VFs" --|------+
v5.3-rc1                                                                        +--- Passthrough crashes
8af23fad6261 iommu/dma: Handle MSI mappings separately -------------------------+
v5.3-rc5
As you can see, the reboot hang problem started in upstream v4.20-rc1,
and was fixed in v5.3-rc1. So 4.15 was not impacted, and all 5.4
kernels already have the fix.
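As a quick way to apply that range to a given kernel, here is a minimal sketch. The function name is_affected is hypothetical and not part of any Ubuntu or kernel tooling. It assumes a sort that supports -V (e.g. GNU coreutils) and plain upstream release versions such as 5.0.0, not -rc tags or full Ubuntu package versions.

```shell
# Hypothetical helper: prints "affected" if the upstream release
# version $1 is >= 4.20 and < 5.3, otherwise "not affected".
is_affected() {
    ver=$1
    first_bad=4.20      # first release series with the reboot hang
    first_fixed=5.3     # first release series with the fix

    # sort -V orders version strings; head picks the lower of the pair.
    lowest=$(printf '%s\n%s\n' "$first_bad" "$ver" | sort -V | head -n 1)
    if [ "$lowest" != "$first_bad" ]; then
        echo "not affected"     # ver is older than 4.20
        return
    fi

    lowest=$(printf '%s\n%s\n' "$first_fixed" "$ver" | sort -V | head -n 1)
    if [ "$lowest" = "$first_fixed" ]; then
        echo "not affected"     # ver is 5.3 or newer
        return
    fi

    echo "affected"
}

is_affected 5.0.0   # prints "affected"
is_affected 4.15    # prints "not affected"
is_affected 5.4     # prints "not affected"
```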