Creating many volumes results in a few volumes failing to create and become ready for instances
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Cinder Charm | In Progress | Medium | Unassigned |
Bug Description
In a focal-ussuri cloud, when I created 100 cirros tiny instances, 3 of them failed.
The cinder-volume logs show (https:/
|__Flow 'volume_
The nova logs show the typical "too many retries" error, and there is nothing relevant in the glance logs.
The other instance/volume failures have a similar message with a different device value (dm-21 and dm-28).
There is no Ceph in this environment, only Pure Storage, and glance is a single instance with local file storage.
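For reference, something along these lines reproduces the scenario (the image/flavor/network names below are placeholders, not the exact ones used here):

    # Boot 100 instances, each from a new 1 GB boot volume created from the image
    openstack server create \
        --image cirros \
        --flavor m1.tiny \
        --network private \
        --boot-from-volume 1 \
        --min 100 --max 100 \
        volume-stress-test

    # Afterwards, list the boot volumes that never became available
    openstack volume list --status error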
When I turned on debug, it revealed a new error around 6 minutes before the failed instance/volume creation:
https:/
ERROR oslo_service.
T_FOUND - no exchange 'cinder-
This looks like some kind of RabbitMQ error. I was not able to reproduce this error message again, though.
Also, when I tried to create 50 focal small instances, 4 of them failed, and I found this error (https:/
os_brick.
This could be related to os-brick.
Changed in charm-cinder:
assignee: nobody → Nishant Dash (dash3)
status: Triaged → In Progress

Changed in charm-cinder:
assignee: Nishant Dash (dash3) → nobody
Okay, after some further investigation - the Pure Storage backend used in this scenario should not be taking the slow-path volume copy when an instance is booted from an image backed by a cinder volume.
The efficient path, which clones the image's backing volume directly, is only attempted when allowed_direct_url_schemes in cinder.conf is set and includes 'cinder' [0]. That option is not set by default, and that logic was not added as part of the Cinder backend for Glance enablement, so even backends that can do an efficient clone (Pure Storage can) fall back to the slow copy. I do not think it hurts to include this option by default.
[0] - https://opendev.org/openstack/cinder/src/branch/stable/yoga/cinder/volume/flows/manager/create_volume.py#L1077
    # Try and clone the image if we have it set as a glance location.
    if not cloned and 'cinder' in CONF.allowed_direct_url_schemes:
        model_update, cloned = self._clone_image_volume(context,
                                                        volume,
                                                        image_location,
                                                        image_meta)
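As a possible interim workaround ahead of any charm change, the option could probably be injected via the charm's generic config-flags setting (assuming the deployed cinder charm revision exposes config-flags and renders it into the [DEFAULT] section of cinder.conf; the application name "cinder" is a placeholder for whatever it is called in the model). Untested on my side:

    # Hypothetical workaround: let cinder-volume attempt the efficient clone
    # for cinder-backed glance image locations instead of the slow image copy
    juju config cinder config-flags="allowed_direct_url_schemes=cinder"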