With libvirt/images_type = rbd, ephemeral instances silently ignore hw_qemu_guest_agent=yes
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Confirmed | Low | Unassigned |
Bug Description
Description
===========
If nova-compute is configured with libvirt/images_type = rbd, then instances booted off images with hw_qemu_
Steps to reproduce
==================
The steps to verify whether or not the FIFREEZE and FITHAW ioctls are received by a guest are described in:
http://
https:/
Expected result
===============
When you perform the actions described above on an instance running on a compute node that does *not* set libvirt/images_type = rbd, the FIFREEZE and FITHAW events are received as expected when the snapshot is created. This occurs irrespective of whether the instance uses boot-from-image or boot-from-volume.
Actual result
=============
When you perform the actions described above on an instance running on a compute node that *does* set libvirt/images_type = rbd, *and* the instance is set to boot from an image, no qemu-ga events are received at all during snapshots.
The reason appears to be this direct_snapshot() call:
This is defined in
https:/
and it uses RBD functionality only. Importantly, it never interacts with
qemu-ga, so the filesystem is apparently never frozen before the snapshot is taken.
This problem was apparently introduced in https:/
However, the qemu-guest-agent calls *are* received correctly if the instance is configured to boot from volume.
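The behaviour described above can be sketched as a toy model. This is not nova's code: the `Instance` class, the function names other than direct_snapshot, and the event strings are all hypothetical, and the model assumes only what this report states, namely that the RBD fast path is tried first and never contacts qemu-ga, while the generic path freezes and thaws the filesystem.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical stand-in for an instance; not nova's Instance object."""
    images_type: str   # value of libvirt/images_type on the compute node
    guest_agent: bool  # image was booted with hw_qemu_guest_agent=yes
    events: list = field(default_factory=list)


def direct_snapshot(instance):
    """Model of the RBD-only fast path: it never talks to qemu-ga."""
    if instance.images_type != "rbd":
        raise NotImplementedError("direct snapshot needs an RBD-backed disk")
    instance.events.append("rbd-clone")


def generic_snapshot(instance):
    """Model of the generic path: freeze, copy, thaw when the agent is on."""
    if instance.guest_agent:
        instance.events.append("FIFREEZE")
    instance.events.append("copy-disk")
    if instance.guest_agent:
        instance.events.append("FITHAW")


def snapshot(instance):
    """Try the fast path first and fall back only if it is unavailable."""
    try:
        direct_snapshot(instance)
    except NotImplementedError:
        generic_snapshot(instance)
    return instance.events
```

In this model, an image-booted instance on an rbd compute node records only the clone step, which matches the missing FIFREEZE and FITHAW events reported here.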
Environment
===========
1. OpenStack release: Rocky (but this issue is present in current master).
2. Hypervisor: libvirt/KVM
3. Storage type: Ceph RBD
4. Networking: Neutron/ML2/OVS
Additional information
======================
A detailed discussion of the issue is available at:
https://<email address hidden>
summary:
- With libvirt/images_type = rbd, instances ignore hw_qemu_guest_agent=yes
+ With libvirt/images_type = rbd, ephemeral instances silently ignore hw_qemu_guest_agent=yes
description: updated
Boot from volume isn't affected because it doesn't go through the libvirt Rbd image backend class. Instead, the compute API makes a direct RPC call to the compute service to quiesce the instance (if it's ACTIVE and configured to support that):
https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/compute/api.py#L3075
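A minimal sketch of that condition, for illustration only. The function name, the `rpc` callable, and the call labels are hypothetical and are not nova's API. The matching unquiesce step after the snapshot is an assumption that this comment does not spell out.

```python
def snapshot_volume_backed(vm_state, quiesce_supported, rpc):
    """Toy model: quiesce over RPC only for ACTIVE instances that support it.

    `rpc` is any callable that takes a call label; it stands in for the
    RPC client and is purely illustrative.
    """
    quiesced = False
    if vm_state == "active" and quiesce_supported:
        rpc("quiesce")
        quiesced = True
    try:
        rpc("snapshot-volumes")
    finally:
        # Assumption: a quiesced instance is always thawed again, even if
        # the volume snapshot fails.
        if quiesced:
            rpc("unquiesce")
```

Passing `calls.append` for a list `calls` as the `rpc` argument records the order of the calls, which makes it easy to see that a stopped instance is snapshotted without any quiesce step.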
The direct snapshot functionality introduced in the referenced patch is optional; it was added to make snapshots with rbd-backed images significantly faster. As a workaround, you can disable the fast snapshot support by disabling direct URLs in glance.
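To check whether the workaround is in place, something like the following could be used. It is a sketch under stated assumptions: the comment above does not name the option, so this assumes the relevant Glance setting is show_image_direct_url in the [DEFAULT] section of glance-api.conf, and that an unset option means direct URLs are disabled. The helper name is hypothetical.

```python
import configparser


def direct_urls_enabled(conf_text):
    """Return True if the given glance-api.conf text enables direct URLs.

    Assumption: the option is show_image_direct_url under [DEFAULT], and
    it is treated as disabled when absent.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.getboolean(
        "DEFAULT", "show_image_direct_url", fallback=False
    )
```

For example, `direct_urls_enabled(open("/etc/glance/glance-api.conf").read())` would report the current setting on a node where the file lives at that (assumed) path.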
Obviously it would be nice to have the best of both worlds, but I'm not sure what the fix for that looks like at this time.