Comment 3 for bug 1757190

Revision history for this message
Matt Riedemann (mriedem) wrote :

I can tell that during migrate_disk_and_power_off from the virt driver on the source host, that the bdm.connection_info has the multiattach flag set because we get this message when disconnecting the volume from the first instance on the first host:

http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz#_Mar_19_19_48_12_057443

Mar 19 19:48:12.057443 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: INFO nova.virt.libvirt.driver [None req-f044373a-53dd-4de7-b881-ff511ecb3a70 tempest-AttachVolumeMultiAttachTest-1944074929 tempest-AttachVolumeMultiAttachTest-1944074929] [instance: 0eed0237-245e-4a18-9e30-9e72accd36c6] Detected multiple connections on this host for volume: 652600d5-f6dc-4089-ba95-d71d7640cafa, skipping target disconnect.

Which comes from the _should_disconnect_target code.

Then I eventually see _terminate_volume_connections which creates a new empty attachment between the server and volume:

http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz#_Mar_19_19_48_12_307926

Mar 19 19:48:12.307926 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: DEBUG cinderclient.v3.client [None req-f044373a-53dd-4de7-b881-ff511ecb3a70 tempest-AttachVolumeMultiAttachTest-1944074929 tempest-AttachVolumeMultiAttachTest-1944074929] REQ: curl -g -i -X POST http://198.72.124.95/volume/v3/774fc170bdb24f20959a551dd666ef06/attachments -H "Accept: application/json" -H "User-Agent: python-cinderclient" -H "OpenStack-API-Version: volume 3.44" -H "X-Auth-Token: {SHA1}6ba4939df9b73a2d082ac56165bd1e04d7f33e6d" -H "Content-Type: application/json" -H "X-OpenStack-Request-ID: req-f044373a-53dd-4de7-b881-ff511ecb3a70" -d '{"attachment": {"instance_uuid": "0eed0237-245e-4a18-9e30-9e72accd36c6", "connector": null, "volume_uuid": "652600d5-f6dc-4089-ba95-d71d7640cafa"}}' {{(pid=27735) _http_log_request /usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py:372}}
...
Mar 19 19:48:12.412178 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: RESP BODY: {"attachment": {"status": "reserved", "detached_at": "", "connection_info": {}, "attached_at": "", "attach_mode": null, "instance": "0eed0237-245e-4a18-9e30-9e72accd36c6", "volume_id": "652600d5-f6dc-4089-ba95-d71d7640cafa", "id": "c33fe43e-ac88-48df-93bd-3293746213b6"}}

And deletes the existing attachment:

Mar 19 19:48:12.418026 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: DEBUG cinderclient.v3.client [None req-f044373a-53dd-4de7-b881-ff511ecb3a70 tempest-AttachVolumeMultiAttachTest-1944074929 tempest-AttachVolumeMultiAttachTest-1944074929] REQ: curl -g -i -X DELETE http://198.72.124.95/volume/v3/774fc170bdb24f20959a551dd666ef06/attachments/448fc228-0e4c-49b4-b0a9-00f3b17468dd -H "OpenStack-API-Version: volume 3.44" -H "User-Agent: python-cinderclient" -H "X-OpenStack-Request-ID: req-f044373a-53dd-4de7-b881-ff511ecb3a70" -H "Accept: application/json" -H "X-Auth-Token: {SHA1}6ba4939df9b73a2d082ac56165bd1e04d7f33e6d" {{(pid=27735) _http_log_request /usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py:372}}

Later once we're in _finish_resize, we call _update_volume_attachments. I can see that start happening here because we're getting the host connector from the virt driver:

http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz#_Mar_19_19_48_13_521171

Mar 19 19:48:13.521171 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: DEBUG os_brick.utils [None req-f044373a-53dd-4de7-b881-ff511ecb3a70 tempest-AttachVolumeMultiAttachTest-1944074929 tempest-AttachVolumeMultiAttachTest-1944074929] ==> get_connector_properties: call u"{'execute': None, 'my_ip': '198.72.124.95', 'enforce_multipath': True, 'host': 'ubuntu-xenial-inap-mtl01-0003062768', 'root_helper': 'sudo nova-rootwrap /etc/nova/rootwrap.conf', 'multipath': False}" {{(pid=27735) trace_logging_wrapper /usr/local/lib/python2.7/dist-packages/os_brick/utils.py:146}}

And then we update the new attachment with the host connector:

Mar 19 19:48:13.536842 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: DEBUG cinderclient.v3.client [None req-f044373a-53dd-4de7-b881-ff511ecb3a70 tempest-AttachVolumeMultiAttachTest-1944074929 tempest-AttachVolumeMultiAttachTest-1944074929] REQ: curl -g -i -X PUT http://198.72.124.95/volume/v3/774fc170bdb24f20959a551dd666ef06/attachments/c33fe43e-ac88-48df-93bd-3293746213b6 -H "Accept: application/json" -H "User-Agent: python-cinderclient" -H "OpenStack-API-Version: volume 3.44" -H "X-Auth-Token: {SHA1}6ba4939df9b73a2d082ac56165bd1e04d7f33e6d" -H "Content-Type: application/json" -H "X-OpenStack-Request-ID: req-f044373a-53dd-4de7-b881-ff511ecb3a70" -d '{"attachment": {"connector": {"initiator": "iqn.1993-08.org.debian:01:d44f672dfb2a", "ip": "198.72.124.95", "platform": "x86_64", "host": "ubuntu-xenial-inap-mtl01-0003062768", "do_local_attach": false, "mountpoint": "/dev/vdb", "os_type": "linux2", "multipath": false}}}' {{(pid=27735) _http_log_request /usr/local/lib/python2.7/dist-packages/keystoneauth1/session.py:372}}

I think I now see the problem.

_finish_resize in the compute manager calls this code:

        block_device_info = self._get_instance_block_device_info(
            context, instance, refresh_conn_info=True, bdms=bdms)

refresh_conn_info=True eventually gets us here:

https://github.com/openstack/nova/blob/f80b4e50093002f84b43ff245a605fbe44d34711/nova/virt/block_device.py#L639

I see the GET call to attachments here:

http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz#_Mar_19_19_48_15_652636

Mar 19 19:48:15.684627 ubuntu-xenial-inap-mtl01-0003062768 nova-compute[27735]: RESP BODY: {"attachment": {"status": "attaching", "detached_at": "", "connection_info": {"auth_password": "5GDZJC3pBdPEJUgh", "attachment_id": "c33fe43e-ac88-48df-93bd-3293746213b6", "target_discovered": false, "encrypted": false, "driver_volume_type": "iscsi", "qos_specs": null, "target_iqn": "iqn.2010-10.org.openstack:volume-652600d5-f6dc-4089-ba95-d71d7640cafa", "target_portal": "198.72.124.95:3260", "volume_id": "652600d5-f6dc-4089-ba95-d71d7640cafa", "target_lun": 1, "access_mode": "rw", "auth_username": "67J6MYJni5Yu5VEEw8Q8", "auth_method": "CHAP"}, "attached_at": "2018-03-19T19:48:16.000000", "attach_mode": "rw", "instance": "0eed0237-245e-4a18-9e30-9e72accd36c6", "volume_id": "652600d5-f6dc-4089-ba95-d71d7640cafa", "id": "c33fe43e-ac88-48df-93bd-3293746213b6"}}

And that connection_info gets updated in the BlockDeviceMapping object, which we use later to set the shareable flag when connecting the volume on the destination host. And that's where I see the guest xml doesn't have the shareable flag because the BDM.connection_info doesn't have the multiattach flag set:

http://logs.openstack.org/17/554317/1/check/nova-multiattach/8e97832/logs/screen-n-cpu.txt.gz#_Mar_19_19_48_15_706213

block_device_info={'swap': None, 'root_device_name': u'/dev/vda', 'ephemerals': [], 'block_device_mapping': [{'guest_format': None, 'boot_index': None, 'mount_device': u'/dev/vdb', 'connection_info': {'status': u'attaching', 'detached_at': u'', u'volume_id': u'652600d5-f6dc-4089-ba95-d71d7640cafa', 'attach_mode': u'rw', 'driver_volume_type': u'iscsi', 'instance': u'0eed0237-245e-4a18-9e30-9e72accd36c6', 'attached_at': u'2018-03-19T19:48:16.000000', 'serial': u'652600d5-f6dc-4089-ba95-d71d7640cafa', 'data': {u'auth_password': u'***', u'target_discovered': False, u'encrypted': False, u'qos_specs': None, u'target_iqn': u'iqn.2010-10.org.openstack:volume-652600d5-f6dc-4089-ba95-d71d7640cafa', u'target_portal': u'198.72.124.95:3260', u'volume_id': u'652600d5-f6dc-4089-ba95-d71d7640cafa', u'target_lun': 1, u'access_mode': u'rw', u'auth_username': u'67J6MYJni5Yu5VEEw8Q8', u'auth_method': u'CHAP'}}, 'disk_bus': 'virtio', 'device_type': 'disk', 'attachment_id': 'c33fe43e-ac88-48df-93bd-3293746213b6', 'delete_on_termination': False}]}

I have no idea why this didn't fail the resize tests before with older libvirt and qemu, but maybe there was something different in those versions that preserved the shareable flag or something, but this is definitely obviously broken just by code inspection.