Cinder-volume may fail to start properly during deployment
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Cinder Charm | New | Undecided | Unassigned |
Bug Description
This issue seems to happen specifically when deploying cinder-volume on separate units.
The topology of Cinder is the following:
- Cinder API and Scheduler on LXD units
- Cinder Volume on baremetal units (for access to multiple iSCSI backends)
When deploying a bundle, one or more cinder-volume units may end up in 'blocked' status with a message complaining that the 'cinder-volume' process isn't running, which is exactly the issue.
In terms of versions:
- MaaS 3.1
- Juju 2.9.28
- Cinder charm from Charmhub's stable channel: revision 530
So far I've seen this happening from time to time on:
- Focal Wallaby and Focal Xena with a PowerStore iSCSI backend.
- Focal Ussuri with PureStorage as the iSCSI backend.
The workaround is simply to run 'sudo systemctl restart cinder-volume' on the unit, after which the deployment can finish properly.
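For reference, a targeted form of that workaround on a single blocked unit looks like this (the unit number is just an example):

    juju ssh cinder-volume/0 sudo systemctl restart cinder-volume
    juju ssh cinder-volume/0 systemctl is-active cinder-volume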
Looking at the logs, cinder-volume fails to find a working backend and terminates itself, which is normal behavior since it happens while the deployment is ongoing and the local/subordinate charms may not have finished installing yet.
I can see that the systemd unit is configured to restart the cinder-volume service if it fails to start, but for some reason it seems to stop retrying at some point (see attached journalctl log).
The most interesting part of both log files is between 17:01:00 and 17:07:23 (the time when I manually restarted the service with systemctl).
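My guess, and it is only an assumption since I haven't checked the shipped unit file, is that systemd's start-rate limiting kicks in: after too many failed starts in a short window the unit is left in a failed state and is no longer retried. The following can be run on an affected unit to check the restart policy and pull the relevant window from the journal:

    # show the restart policy and start-rate-limit settings of the service
    systemctl show cinder-volume.service -p Restart -p StartLimitBurst -p StartLimitIntervalUSec
    # check whether the unit gave up (failed state / start limit hit)
    systemctl status cinder-volume.service
    # extract the window mentioned above from the journal
    journalctl -u cinder-volume --since "17:00:00" --until "17:08:00"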
I've seen this with my current deployment.
Using the "enabled-services" option, we've got scheduler and api deployed in control containers and volume deployed on the bare-metal nodes, because we are using PureStorage iSCSI backends.
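For context, that split is roughly what the following would produce (the application names and values here are assumptions about our deployment, not copied from the bundle):

    # API and scheduler in the control containers, volume service on the bare-metal units
    juju config cinder enabled-services=api,scheduler
    juju config cinder-volume enabled-services=volume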
The cinder-volume units often seem to fail to start the cinder-volume service. I am not sure why, but it looks like it may be related to the backend not being ready yet (at first).
We left the deployment for several hours (from about 5pm until about 9am) and the services never restarted on their own.
Regardless of *why* it fails at first, it eventually succeeds when I manually start the cinder-volume service. Could the charm be more proactive and restart services that should be running but aren't?
As suggested, this work-around gets us past this:
juju run -a cinder-volume sudo systemctl restart cinder-volume
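A slightly more targeted variant, which only restarts the service where it is not already active, could look like this (just a sketch of the same workaround, not something the charm currently does):

    juju run -a cinder-volume 'systemctl is-active --quiet cinder-volume || sudo systemctl restart cinder-volume'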