Seen in a recent gate job on the stable/ussuri branch [1] where requests to create a volume get stuck in status 'creating' and don't progress further, example test output:
Traceback (most recent call last):
File "/opt/stack/tempest/tempest/common/utils/__init__.py", line 89, in wrapper
return f(*func_args, **func_kwargs)
File "/opt/stack/tempest/tempest/api/compute/admin/test_volumes_negative.py", line 100, in test_multiattach_rw_volume_update_failure
vol1 = self.create_volume(multiattach=True)
File "/opt/stack/tempest/tempest/api/compute/base.py", line 528, in create_volume
volume['id'], 'available')
File "/opt/stack/tempest/tempest/common/waiters.py", line 280, in wait_for_volume_resource_status
raise lib_exc.TimeoutException(message)
tempest.lib.exceptions.TimeoutException: Request timed out
Details: volume 75bbdc7b-8a79-45ed-8883-6389b6837af8 failed to reach available status (current creating) within the required time (196 s).
In this example, we see the create_volume action time out because the volume doesn't reach available status and stays in 'creating'.
Looking at the c-vol logs for volume 75bbdc7b-8a79-45ed-8883-6389b6837af8, we see the following messages for lvcreate (the last messages referencing this volume) [2]:
Jan 14 19:44:41.822811 ubuntu-bionic-vexxhost-ca-ymq-1-0022542876 sudo[9486]: stack : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/local/bin/cinder-rootwrap /etc/cinder/rootwrap.conf env LC_ALL=C lvcreate -T -V 1g -n volume-75bbdc7b-8a79-45ed-8883-6389b6837af8 stack-volumes-lvmdriver-1/stack-volumes-lvmdriver-1-pool
Jan 14 19:44:42.011173 ubuntu-bionic-vexxhost-ca-ymq-1-0022542876 cinder-volume[28292]: INFO cinder.volume.flows.manager.create_volume [None req-85d09fb4-356f-4333-9eb5-c79a1ac1a0ea tempest-UpdateMultiattachVolumeNegativeTest-2119626783 None] Volume 75bbdc7b-8a79-45ed-8883-6389b6837af8: being created as raw with specification: {'status': 'creating', 'volume_name': 'volume-75bbdc7b-8a79-45ed-8883-6389b6837af8', 'volume_size': 1}
Jan 14 19:44:42.011722 ubuntu-bionic-vexxhost-ca-ymq-1-0022542876 cinder-volume[28292]: DEBUG oslo_concurrency.processutils [None req-85d09fb4-356f-4333-9eb5-c79a1ac1a0ea tempest-UpdateMultiattachVolumeNegativeTest-2119626783 None] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf env LC_ALL=C lvcreate -T -V 1g -n volume-75bbdc7b-8a79-45ed-8883-6389b6837af8 stack-volumes-lvmdriver-1/stack-volumes-lvmdriver-1-pool {{(pid=28393) execute /usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:379}}
Note that there is no matching "returned <code>" message.
Here is what normal/good output looks like for lvcreate, this example is from earlier in the same job run [3]:
Jan 14 19:39:06.220923 ubuntu-bionic-vexxhost-ca-ymq-1-0022542876 cinder-volume[28292]: INFO cinder.volume.flows.manager.create_volume [None req-d37cf946-69e6-4e0a-bceb-900b6392c88e tempest-AttachSCSIVolumeTestJSON-236492775 None] Volume e24aa42f-be18-4f29-9696-1dcf9796c67e: being created as raw with specification: {'status': 'creating', 'volume_name': 'volume-e24aa42f-be18-4f29-9696-1dcf9796c67e', 'volume_size': 1}
Jan 14 19:39:06.221386 ubuntu-bionic-vexxhost-ca-ymq-1-0022542876 cinder-volume[28292]: DEBUG oslo_concurrency.processutils [None req-d37cf946-69e6-4e0a-bceb-900b6392c88e tempest-AttachSCSIVolumeTestJSON-236492775 None] Running cmd (subprocess): sudo cinder-rootwrap /etc/cinder/rootwrap.conf env LC_ALL=C lvcreate -T -V 1g -n volume-e24aa42f-be18-4f29-9696-1dcf9796c67e stack-volumes-lvmdriver-1/stack-volumes-lvmdriver-1-pool {{(pid=28393) execute /usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:379}}
Jan 14 19:39:06.235119 ubuntu-bionic-vexxhost-ca-ymq-1-0022542876 sudo[5649]: stack : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/local/bin/cinder-rootwrap /etc/cinder/rootwrap.conf env LC_ALL=C lvcreate -T -V 1g -n volume-e24aa42f-be18-4f29-9696-1dcf9796c67e stack-volumes-lvmdriver-1/stack-volumes-lvmdriver-1-pool
Jan 14 19:39:11.441601 ubuntu-bionic-vexxhost-ca-ymq-1-0022542876 cinder-volume[28292]: DEBUG oslo_concurrency.processutils [None req-d37cf946-69e6-4e0a-bceb-900b6392c88e tempest-AttachSCSIVolumeTestJSON-236492775 None] CMD "sudo cinder-rootwrap /etc/cinder/rootwrap.conf env LC_ALL=C lvcreate -T -V 1g -n volume-e24aa42f-be18-4f29-9696-1dcf9796c67e stack-volumes-lvmdriver-1/stack-volumes-lvmdriver-1-pool" returned: 0 in 5.220s {{(pid=28393) execute /usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py:416}}
Jan 14 19:39:11.463393 ubuntu-bionic-vexxhost-ca-ymq-1-0022542876 cinder-volume[28292]: INFO cinder.volume.flows.manager.create_volume [None req-d37cf946-69e6-4e0a-bceb-900b6392c88e tempest-AttachSCSIVolumeTestJSON-236492775 None] Volume volume-e24aa42f-be18-4f29-9696-1dcf9796c67e (e24aa42f-be18-4f29-9696-1dcf9796c67e): created successfully
Here ^ we see the command run, session opened, session closed, and the "returned <code>" message.
After an lvcreate hangs the way it did in the first example, no subsequent attempts to create a volume succeed. All result in a hang of the lvcreate command. In this example job run, after the first lvcreate command hangs, 4 more lvcreate commands hang trying to create various other volumes for other tests.
I found this old bug while googling about this lvcreate command hang behavior:
https://bugs.launchpad.net/cinder/+bug/1366125
and that looks to have been solved by a "later" ubuntu version (with the previous version being 14.04 at the time).
This job on the stable/ussuri branch is running ubuntu 18.04:
2021-01-14 18:34:23.176194 | localhost | # Node Information
2021-01-14 18:34:23.176292 | localhost | Inventory Hostname: controller
2021-01-14 18:34:23.176381 | localhost | Hostname: ubuntu-bionic-vexxhost-ca-ymq-1-0022542876
2021-01-14 18:34:23.176470 | localhost | Username: zuul
2021-01-14 18:34:23.176563 | localhost | Distro: Ubuntu 18.04
2021-01-14 18:34:23.176660 | localhost | Provider: vexxhost-ca-ymq-1
2021-01-14 18:34:23.176748 | localhost | Label: ubuntu-bionic
2021-01-14 18:34:23.176924 | localhost | Interface IP: 162.253.55.157
[1] https://zuul.opendev.org/t/openstack/build/20e8210fd0df40669c4b208df7f5de24
[2] https://zuul.opendev.org/t/openstack/build/20e8210fd0df40669c4b208df7f5de24/log/controller/logs/screen-c-vol.txt#1338
[3] https://zuul.opendev.org/t/openstack/build/20e8210fd0df40669c4b208df7f5de24/log/controller/logs/screen-c-vol.txt#950
Here's the matching good lvcreate command from syslog [1]:
Jan 14 19:39:06 ubuntu- bionic- vexxhost- ca-ymq- 1-0022542876 sudo[5649]: stack : TTY=unknown ; PWD=/ ; USER=root ; COMMAND= /usr/local/ bin/cinder- rootwrap /etc/cinder/ rootwrap. conf env LC_ALL=C lvcreate -T -V 1g -n volume- e24aa42f- be18-4f29- 9696-1dcf9796c6 7e stack-volumes- lvmdriver- 1/stack- volumes- lvmdriver- 1-pool bionic- vexxhost- ca-ymq- 1-0022542876 sudo[5649]: pam_unix( sudo:session) : session opened for user root by (uid=0) bionic- vexxhost- ca-ymq- 1-0022542876 sudo[5649]: pam_unix( sudo:session) : session closed for user root
Jan 14 19:39:06 ubuntu-
Jan 14 19:39:11 ubuntu-
and the matching first bad lvcreate command [2]:
Jan 14 19:44:41 ubuntu- bionic- vexxhost- ca-ymq- 1-0022542876 sudo[9486]: stack : TTY=unknown ; PWD=/ ; USER=root ; COMMAND= /usr/local/ bin/cinder- rootwrap /etc/cinder/ rootwrap. conf env LC_ALL=C lvcreate -T -V 1g -n volume- 75bbdc7b- 8a79-45ed- 8883-6389b6837a f8 stack-volumes- lvmdriver- 1/stack- volumes- lvmdriver- 1-pool bionic- vexxhost- ca-ymq- 1-0022542876 sudo[9486]: pam_unix( sudo:session) : session opened for user root by (uid=0)
Jan 14 19:44:41 ubuntu-
[1] https:/ /zuul.opendev. org/t/openstack /build/ 20e8210fd0df406 69c4b208df7f5de 24/log/ controller/ logs/syslog. txt#4361- 4363 /zuul.opendev. org/t/openstack /build/ 20e8210fd0df406 69c4b208df7f5de 24/log/ controller/ logs/syslog. txt#4522- 4523
[2] https:/