[Hyper-V] PCI: hv: Fix 2 hang issues in hv_compose_msi_msg
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Fix Released | Undecided | Unassigned | | |
Xenial | Invalid | Undecided | Unassigned | | |
Bionic | Fix Released | Undecided | Marcelo Cerri | | |
linux-azure (Ubuntu) | Fix Released | High | Marcelo Cerri | | |
Xenial | Fix Released | Undecided | Marcelo Cerri | | |
Bionic | Fix Released | High | Marcelo Cerri | | |
linux-azure-edge (Ubuntu) | Invalid | Undecided | Unassigned | | |
Xenial | Fix Released | Critical | Marcelo Cerri | | |
Bionic | Invalid | Undecided | Unassigned | | |
Bug Description
We've identified some issues in recent testing against upstream 4.15 SR-IOV and DPDK. The following commits are in Lorenzo's PCI tree on their way into 4.16 and stable:
Tree: https:/
PCI: hv: Only queue new work items in hv_pci_devices_present() if necessary
If there is pending work in hv_pci_
the new dr entry into the dr_list. Add a check to detect pending work
items and update the code to skip queuing work if pending work items
are detected.
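The "skip queuing if work is already pending" idea can be illustrated with a small userspace model. This is only a sketch, not the driver code: `struct bus_state`, `struct dr_entry`, `bus_add_entry()` and `bus_drain()` are hypothetical names, the `work_queued` counter stands in for scheduling a real work item, and the locking a real driver would need around the list is left out because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the pci-hyperv structures. */
struct dr_entry {
    int id;
    struct dr_entry *next;
};

struct bus_state {
    struct dr_entry *head;  /* pending entries, oldest first */
    struct dr_entry *tail;
    int work_queued;        /* how many times the worker was scheduled */
};

/*
 * Append an entry. Schedule the worker only if the list was empty before
 * the append; otherwise the already scheduled worker will pick this entry
 * up as well. Returns true if this call scheduled the worker.
 * (A real driver would hold a lock around the list update.)
 */
static bool bus_add_entry(struct bus_state *bus, struct dr_entry *dr)
{
    bool pending = (bus->head != NULL);

    assert(dr != NULL);
    dr->next = NULL;
    if (bus->tail)
        bus->tail->next = dr;
    else
        bus->head = dr;
    bus->tail = dr;

    if (pending)
        return false;       /* work is already pending: skip queuing */

    bus->work_queued++;     /* stands in for queuing a work item */
    return true;
}

/* Worker: drain every pending entry in one pass; returns how many it saw. */
static int bus_drain(struct bus_state *bus)
{
    int handled = 0;

    while (bus->head) {
        bus->head = bus->head->next;
        handled++;
    }
    bus->tail = NULL;
    return handled;
}
```

In this model, adding two entries back to back schedules the worker once, and a single `bus_drain()` call handles both.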
PCI: hv: Remove the bogus test in hv_eject_device_work()
When kernel is executing hv_eject_
be hv_pcichild_
therefore replace the bogus check with an explicit WARN_ON() on the
condition failure detection.
PCI: hv: Fix a comment typo in _hv_pcifront_read_config()
Comment in _hv_pcifront_
No functional change.
PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()
1. With the patch "x86/vector/msi: Switch to global reservation mode",
the recent v4.15 and newer kernels always hang for 1-vCPU Hyper-V VM
with SR-IOV. This is because when we reach hv_compose_
request_irq() -> request_
-> __irq_startup() -> irq_domain_
msi_domain_
disabled in __setup_irq().
Note: when we reach hv_compose_
pci_enable_
hv_compose_
hv_compose_
With interrupts disabled, a UP VM always hangs in the busy loop in
the function, because the interrupt callback hv_pci_
cannot be called.
We can do nothing but work around it by polling the channel. This
is ugly, but we don't have any other choice.
2. If the host is ejecting the VF device before we reach
hv_compose_
forever, because at this time the host doesn't respond to the
CREATE_INTERRUPT request. This issue has existed since the first day the
pci-hyperv driver appeared in the kernel.
Luckily, this can also be worked around by polling the channel
for the PCI_EJECT message and hpdev->state, and by checking the
PCI vendor ID.
Note: actually the above 2 issues also happen to a SMP VM, if
"hbus->
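The polling workaround described for hv_compose_msi_msg can be pictured with a userspace model. This is a sketch under stated assumptions, not the actual patch: `struct fake_channel`, `channel_poll()` and `wait_for_reply()` are hypothetical, the host is simulated by simple counters, and the `max_polls` bound exists only to keep the example finite. The point is that the waiter drives the channel itself instead of relying on an interrupt callback, and gives up as soon as it sees that the device is being ejected.

```c
#include <assert.h>
#include <stdbool.h>

enum wait_result { WAIT_DONE, WAIT_EJECTED, WAIT_TIMEOUT };

/* Hypothetical model of the host side of the channel. */
struct fake_channel {
    int reply_at;     /* poll number at which the reply shows up; -1 = never */
    int eject_at;     /* poll number at which an eject shows up; -1 = never */
    int polls;        /* how many times the channel has been polled */
    bool reply_seen;
    bool ejected;
};

/* One manual poll: the work an interrupt callback would normally do. */
static void channel_poll(struct fake_channel *ch)
{
    ch->polls++;
    if (ch->polls == ch->reply_at)
        ch->reply_seen = true;
    if (ch->polls == ch->eject_at)
        ch->ejected = true;
}

/*
 * Wait for the reply without depending on interrupts: poll the channel in
 * a loop, and stop early if the device is being ejected, since the host
 * will then never answer the request.
 */
static enum wait_result wait_for_reply(struct fake_channel *ch, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        channel_poll(ch);
        if (ch->ejected)
            return WAIT_EJECTED;
        if (ch->reply_seen)
            return WAIT_DONE;
    }
    return WAIT_TIMEOUT;  /* assumption of this sketch only */
}
```

A caller would treat `WAIT_EJECTED` as an error and bail out rather than spin, which is the case the second hang describes.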
PCI: hv: Serialize the present and eject work items
When we hot-remove the device, we first receive a PCI_EJECT message and
then receive a PCI_BUS_RELATIONS message with bus_rel-
The first message is offloaded to hv_eject_
is offloaded to pci_devices_
list_del(
system_wq can run them concurrently.
The patch eliminates the race condition.
Since access to present/eject work items is serialized, we do not need the
hbus->enum_sem anymore, so remove it.
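Why serializing the two work items removes the race can be shown with a toy model. The sketch below is an illustration only: `struct ordered_queue`, `oq_submit()`, `oq_run()`, `struct device_model` and both handlers are hypothetical, and it assumes (as the description hints) that the eject and present handlers can each end up unlinking the same child device. A single-consumer FIFO stands in for a queue that runs one item at a time, in submission order.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK 8

typedef void (*work_fn)(void *ctx);

struct work_item {
    work_fn fn;
    void *ctx;
};

/* Hypothetical single-consumer FIFO: items never run concurrently. */
struct ordered_queue {
    struct work_item items[MAX_WORK];
    size_t head;
    size_t tail;
};

static int oq_submit(struct ordered_queue *q, work_fn fn, void *ctx)
{
    assert(fn != NULL);
    if (q->tail == MAX_WORK)
        return -1;                 /* queue full */
    q->items[q->tail].fn = fn;
    q->items[q->tail].ctx = ctx;
    q->tail++;
    return 0;
}

/* Run pending items strictly in submission order; returns how many ran. */
static size_t oq_run(struct ordered_queue *q)
{
    size_t ran = 0;

    while (q->head < q->tail) {
        struct work_item *w = &q->items[q->head++];
        w->fn(w->ctx);
        ran++;
    }
    return ran;
}

/* Toy device: present is 1 while it is linked on the child list. */
struct device_model {
    int present;
    int removals;                  /* times the device was unlinked */
};

/* Eject handler: unlink the device if it is still linked. */
static void eject_work(void *ctx)
{
    struct device_model *d = ctx;

    if (d->present) {
        d->present = 0;
        d->removals++;
    }
}

/* Present handler for an empty relations list: drop any remaining device. */
static void present_work(void *ctx)
{
    struct device_model *d = ctx;

    if (d->present) {
        d->present = 0;
        d->removals++;
    }
}
```

Because the second handler only starts after the first has finished, it observes `present == 0` and the device is unlinked exactly once; with two concurrent workers, both could pass the check before either clears it.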
All 4.15-based kernels need these fixes, as does any kernel that picked up:
Fixes: 4900be83602b ("x86/vector/msi: Switch to global reservation mode")
The race condition fixed by the serialization patch applies to all kernels with PCI passthrough on Hyper-V:
Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") (the catch-all for PCI passthrough)
no longer affects: | linux-azure (Ubuntu Xenial) |
no longer affects: | linux-azure-edge (Ubuntu Bionic) |
Changed in linux-azure-edge (Ubuntu Xenial): | |
status: | New → In Progress |
Changed in linux-azure (Ubuntu Bionic): | |
assignee: | nobody → Marcelo Cerri (mhcerri) |
Changed in linux-azure-edge (Ubuntu Xenial): | |
assignee: | nobody → Marcelo Cerri (mhcerri) |
importance: | Undecided → Critical |
Changed in linux-azure (Ubuntu Bionic): | |
importance: | Undecided → High |
Changed in linux-azure-edge (Ubuntu Xenial): | |
status: | In Progress → Fix Committed |
Changed in linux-azure (Ubuntu Bionic): | |
status: | New → Fix Committed |
Changed in linux-azure (Ubuntu Xenial): | |
status: | In Progress → Fix Committed |
Changed in linux (Ubuntu Xenial): | |
status: | New → Invalid |
Changed in linux (Ubuntu Bionic): | |
status: | New → In Progress |
assignee: | nobody → Marcelo Cerri (mhcerri) |
Changed in linux (Ubuntu Bionic): | |
status: | In Progress → Fix Committed |
tags: |
added: verification-done-bionic removed: verification-needed-bionic |
This bug was fixed in the package linux-azure-edge - 4.15.0-1005.5
---------------
linux-azure-edge (4.15.0-1005.5) xenial; urgency=medium
* linux-azure-edge: 4.15.0-1005.5 -proposed tracker (LP: #1759923)
* [Hyper-V] hv_netvsc: enable multicast if necessary (LP: #1759885)
- hv_netvsc: fix filter flags
- SAUCE: hv_netvsc: enable multicast if necessary
linux-azure-edge (4.15.0-1004.4) xenial; urgency=medium
* linux-azure-edge: 4.15.0-1004.4 -proposed tracker (LP: #1759673)
* [Hyper-V][linux-azure] Change config for MLX4 and MLX5 (LP: #1759656)
- [Config] azure: CONFIG_MLX{4,5}_INFINIBAND=y
* [Hyper-V] Improvements for UDP on SRIOV (LP: #1756414)
- SAUCE: hv_netvsc: avoid retry on send during shutdown
- SAUCE: hv_netvsc: only wake transmit queue if link is up
- SAUCE: hv_netvsc: fix error unwind handling if vmbus_open fails
- SAUCE: hv_netvsc: cancel subchannel setup before halting device
- SAUCE: hv_netvsc: fix race in napi poll when rescheduling
- SAUCE: hv_netvsc: use napi_schedule_irqoff
- SAUCE: hv_netvsc: defer queue selection to VF
- SAUCE: hv_netvsc: filter multicast/broadcast
- SAUCE: hv_netvsc: propagate rx filters to VF
* [Hyper-V] PCI: hv: Fix 2 hang issues in hv_compose_msi_msg (LP: #1758378)
- SAUCE: PCI: hv: Serialize the present and eject work items
- SAUCE: PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()
- SAUCE: PCI: hv: Fix a comment typo in _hv_pcifront_read_config()
- SAUCE: PCI: hv: Remove the bogus test in hv_eject_device_work()
- SAUCE: PCI: hv: Only queue new work items in hv_pci_devices_present() if necessary
linux-azure-edge (4.15.0-1003.3) xenial; urgency=medium
* linux-azure-edge: 4.15.0-1003.3 -proposed tracker (LP: #1755769)
* linux-azure: 4.15.0-1003.3 -proposed tracker (LP: #1757167)
* Enable secure boot on linux-azure (LP: #1754042)
- Revert "UBUNTU: [debian] azure: do not build uefi signed binary"
* [Hyper-v] Set CONFIG_I2C_PIIX4 to "n" (LP: #1752999)
- [Config] azure: CONFIG_I2C_PIIX4=n
* [Hyper-V] set config: CONFIG_EDAC_DECODE_MCE=y (LP: #1751123)
- [Config] azure: CONFIG_EDAC_DECODE_MCE=y
* Miscellaneous Ubuntu changes
- [Config] updateconfigs after rebase to Ubuntu-4.15.0-13.14
- [Config] fix up retpoline abi files
[ Ubuntu: 4.15.0-13.14 ]
* linux: 4.15.0-13.14 -proposed tracker (LP: #1756408)
* devpts: handle bind-mounts (LP: #1755857)
- SAUCE: devpts: hoist out check for DEVPTS_SUPER_MAGIC
- SAUCE: devpts: resolve devpts bind-mounts
- SAUCE: devpts: comment devpts_mntget()
- SAUCE: selftests: add devpts selftests
* [bionic][arm64] d-i: add hisi_sas_v3_hw to scsi-modules (LP: #1756103)
- d-i: add hisi_sas_v3_hw to scsi-modules
* [Bionic][ARM64] enable ROCE and HNS3 driver support for hip08 SoC
(LP: #1756097)
- RDMA/hns: Refactor eq code for hip06
- RDMA/hns: Add eq support of hip08
- RDMA/hns: Add detailed comments for mb() call
- RDMA/hns: Add rq inline data support for hip08 RoCE
- RDMA/hns: Update the usage of sr_max and rr_max field
- RDMA/hns: Set access flags of hip08 RoCE
- RDMA/hns: Filter for zero l...