Fix broken PCIe device after migration for Q35 machines
Bug #2033193 reported by
Matthew Heler
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
qemu (Ubuntu) | Fix Released | Undecided | Unassigned | |
Jammy | Incomplete | Undecided | Unassigned | |
Bug Description
A bug introduced in the QEMU 6.1 and 6.2 Q35 machine types causes hot-plugged PCIe devices to go offline during a migration. This was fixed in QEMU 7.0 and backported by other vendors in their downstream releases.
The bug is here, https://<email address hidden>
I can hit this bug currently when running OpenStack on top of Ubuntu 22.04, and using the provided libvirt and qemu-kvm packages from that release.
Is it possible we can get this fix backported to Jammy?
Thank you,
Thanks for the report.
Given the versions, the only affected release is Jammy; Lunar and later are already fixed.
Regarding the patches you referred to:
- #1 & #2 got agreed
- #3 got challenged and not accepted (https://<email address hidden>/msg872376.html)
- none of them ever landed upstream; they seem to have only ever made it into the downstreams
Because of that, I am not so sure about "fixed in qemu 7.0". Maybe it went through a lot of changes and I could not spot it easily. Since you mentioned "fixed in Qemu 7.0", do you have a pointer to the change that actually went into qemu 7.0?
I've spotted it in the downstream though:
https://git.centos.org/rpms/qemu-kvm/blob/24c15060b5a0a922f6472e49b2087ad174ffb63d/f/SOURCES/kvm-pci-expose-TYPE_XIO3130_DOWNSTREAM-name.patch
https://git.centos.org/rpms/qemu-kvm/blob/24c15060b5a0a922f6472e49b2087ad174ffb63d/f/SOURCES/kvm-acpi-pcihp-pcie-set-power-on-cap-on-parent-slot.patch
And I agree that it was dropped there in the move to 7.0, but there was no reference to what replaced it.
I have not hit the same issue in Jammy when migrating instances.
Therefore I think "can hit this bug currently when running OpenStack on top of Ubuntu 22.04" is a bit too generic.
Here are a few questions that should help with the later SRU [1] processing:
1. Is there any special device or anything else that needs to be exposed to the guest for you to see the issue?
2. What does your guest XML look like?
3. Bug https://bugzilla.redhat.com/show_bug.cgi?id=2053584 speaks of a virtio-serial-port, is that something you are using as well?
4. Is the symptom for you also a guest soft lockup, or something else?
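A minimal sketch for gathering the details asked for in questions 1 to 3 from a libvirt guest definition. It assumes the XML was obtained with something like `virsh dumpxml <guest>`, and that affected guests carry a machine type whose name contains `q35-6.1` or `q35-6.2` (an assumption; distribution-specific machine type aliases may be named differently). The sample XML and the `summarize_guest` helper are hypothetical and only illustrate what to look for.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real definition would come from `virsh dumpxml <guest>`.
SAMPLE_XML = """
<domain type='kvm'>
  <name>demo</name>
  <os>
    <type arch='x86_64' machine='pc-q35-6.2'>hvm</type>
  </os>
  <devices>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'/>
    <controller type='virtio-serial' index='0'/>
  </devices>
</domain>
"""

# Assumption: machine type names containing these markers are the ones to check.
SUSPECT_MACHINE_MARKERS = ("q35-6.1", "q35-6.2")


def summarize_guest(xml_text):
    """Return the machine type and the devices relevant to this bug report."""
    root = ET.fromstring(xml_text)
    type_el = root.find("./os/type")
    machine = type_el.get("machine", "") if type_el is not None else ""
    controllers = root.findall("./devices/controller")
    return {
        "machine": machine,
        "suspect_machine": any(m in machine for m in SUSPECT_MACHINE_MARKERS),
        # PCIe root ports are where hot-plugged PCIe devices attach on Q35.
        "pcie_root_ports": sum(
            1
            for c in controllers
            if c.get("type") == "pci" and c.get("model") == "pcie-root-port"
        ),
        "has_virtio_serial": any(
            c.get("type") == "virtio-serial" for c in controllers
        ),
    }


if __name__ == "__main__":
    print(summarize_guest(SAMPLE_XML))
```

Attaching the full guest XML is still preferable; the summary only helps to spot the relevant parts quickly.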
Note: The upstream bug https://bugzilla.redhat.com/show_bug.cgi?id=2053584 has repro ideas that could be useful as well.
One further question: if someone provided you with a test PPA of Jammy plus those fixes, could you easily test it in your environment?
[1]: https://wiki.ubuntu.com/StableReleaseUpdates