qemu segfaults after re-attaching ceph volume to instance
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu Cloud Archive | New | Undecided | Unassigned |
qemu (Ubuntu) | Fix Released | Undecided | Unassigned |
Xenial | Incomplete | Undecided | Unassigned |
Artful | Incomplete | Undecided | Unassigned |
Bug Description
I have OpenStack compute nodes with qemu-system-x86, using Ceph as the storage backend for base disks and volumes (no local storage).
When I create a new volume on Ceph and attach it to an instance, everything works.
When I detach the volume and re-attach it again, I am able to crash my instance within a limited number of repeats. Sometimes it happens on the second try, sometimes on the 6th or 9th; in most cases the instance does not survive 10 cycles.
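The same attach/detach cycle can also be driven at the OpenStack layer instead of through virsh; a minimal sketch, assuming the openstack CLI is available and using placeholder server/volume names (the actual reproduction below uses virsh directly):
SERVER=test-instance
VOLUME=test-volume
while true; do
  # attach the Ceph-backed volume, give qemu a moment, then detach it again
  openstack server add volume "$SERVER" "$VOLUME"
  sleep 5
  openstack server remove volume "$SERVER" "$VOLUME"
  sleep 5
done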
Steps to reproduce:
- create an instance
- create a volume in Ceph
- define the volume in disk.xml: http:// (a sketch of such a disk definition follows the loop below)
- now run a loop:
while true; do
  virsh attach-device instance-000022e8 disk.xml;
  sleep 5;
  virsh detach-disk instance-000022e8 vdb --live;
  sleep 5;
done
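Since the disk.xml link above is truncated, here is a minimal sketch of what such an RBD disk definition could look like, written out via a shell heredoc. The pool/volume name, cephx user, secret UUID and monitor address are placeholders, not values from this report:
cat > disk.xml <<'EOF'
<disk type='network' device='disk'>
  <!-- raw RBD-backed volume; cache mode is only an example setting -->
  <driver name='qemu' type='raw' cache='writeback'/>
  <!-- cephx auth: user and secret UUID are placeholders -->
  <auth username='cinder'>
    <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
  </auth>
  <!-- pool/volume name and monitor address are placeholders -->
  <source protocol='rbd' name='volumes/volume-test'>
    <host name='192.168.0.10' port='6789'/>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>
EOF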
After a few iterations, the instance crashes.
Logs:
kernel: [3866704.245319] traps: qemu-system-
or
kernel: [7252748.718834] qemu-system-
Ubuntu Xenial 16.04.3 with cloud-archive@Ocata repositories
kernel: 4.4.0-109-generic
qemu-system-x86 1:2.8+dfsg-
libvirt-bin 2.5.0-3ubuntu5.
ceph/rados: 10.2.10-1xenial
@Corey / James - I have no Ceph environment around at all; also, this is reported against a cloud-archive qemu (Ocata, if I read it correctly).
Can you confirm this issue, and if so, are there further insights on how to handle it?
@Crazik - to what extent could you try, on your existing setup, different qemu & libvirt versions such as those of Ubuntu Cloud Archive Pike (2.10) and Queens (2.11) from [1]?
If you can, it might also be worth updating the storage nodes (Ceph) independently of the compute-node qemu/libvirt; that way we might more easily get a feeling for which area a potential fix would be in.
[1]: https://wiki.ubuntu.com/OpenStack/CloudArchive
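A minimal sketch of how one of those Cloud Archive pockets could be enabled on a Xenial compute node for such a test; Pike is used here only as an example, and the package names are assumed to match the existing Ocata setup:
# enable the Pike Ubuntu Cloud Archive and pull in the newer qemu/libvirt
sudo add-apt-repository cloud-archive:pike
sudo apt update
sudo apt install qemu-system-x86 libvirt-bin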