During a recent maintenance window, a workaround for https://bugs.launchpad.net/cinder/+bug/2023073 was put into place, which involved editing /etc/lvm.conf to set "wipe_signatures_when_zeroing_new_lvs = 0".
Unfortunately, at one point the file was rewritten incorrectly, setting that line to instead read "wipe_signatures_when_zeroing_new_lvs = ", i.e. no value for the key.
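For reference, the intended edit versus what actually ended up on disk would have looked roughly like this (my reconstruction, assuming the setting sits in the usual "allocation" section of lvm.conf):

    # Intended edit:
    allocation {
        wipe_signatures_when_zeroing_new_lvs = 0
    }

    # What the file ended up containing (no value after the "="):
    allocation {
        wipe_signatures_when_zeroing_new_lvs =
    }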
This caused certain charmhelpers functions to return incorrect results about whether the disks were already initialized for LVM, which in turn led the charm to re-initialize them.
I'm attaching an excerpt from the Juju unit logs for cinder-lvm which shows the problem; here is a summary of what happened (sketched in code just after this list):
* reactive/layer_openstack.py:101:run_storage_backend is entered, and its code ends up calling CinderLVMCharm.cinder_configuration(), which in turn calls configure_block_devices().
* configure_block_devices() calls configure_lvm_storage().
* In configure_lvm_storage(), is_lvm_physical_volume(device) returns False, has_partition_table(device) also returns False, and thus prepare_volume(device) gets called.
* prepare_volume(device) calls clean_storage(device), which in turn calls zap_disk(device).
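As a condensed paraphrase of that decision path - my own reconstruction for readability, not the charm's literal code (has_partition_table() and prepare_volume() here stand in for the charm's own helpers):

    from charmhelpers.contrib.storage.linux.lvm import is_lvm_physical_volume

    def configure_lvm_storage_sketch(block_devices, overwrite=False):
        for device in block_devices:
            if is_lvm_physical_volume(device):
                # Already a PV: the device is reused, not wiped.
                continue
            if has_partition_table(device) and not overwrite:
                # An existing MBR/GPT table blocks re-initialization.
                continue
            # Neither guard fired in our case, so the destructive path ran:
            prepare_volume(device)   # clean_storage() -> zap_disk()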
See the attachment for a detailed traceback.
Re: the has_partition_table() check: I confirmed that the configuration option "overwrite" is set to False, and that the configured block-devices all lacked MBR/GPT partition tables, so the has_partition_table() check wouldn't have blocked this.
This leaves the is_lvm_physical_volume() check as the only applicable protection against accidental re-initialization in this case. And in this particular charm version, that check is implemented in the charmhelpers code as follows:
    try:
        check_output(['pvdisplay', block_device])
        return True
    except CalledProcessError:
        return False

Basically, anything that causes the above command to fail - like, perhaps, a misconfigured /etc/lvm.conf - can make this check falsely report that a device is *not* an LVM physical volume, resulting in it getting re-initialized.
In summary: I believe this is a critical bug in charmhelpers, one that also critically impacts the cinder-lvm charm, since it risks blowing away data on configured LVM devices whenever /etc/lvm.conf is misconfigured.
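For illustration only (a rough sketch, not a tested patch): a more defensive variant of this check could distinguish "pvdisplay examined the device and it is not a PV" from "pvdisplay itself failed to run", and refuse to guess in the latter case rather than reporting the device as safe to initialize. Something along these lines:

    from subprocess import CalledProcessError, STDOUT, check_output

    def is_lvm_physical_volume_defensive(block_device):
        # Hypothetical variant, not the current charmhelpers implementation.
        # First confirm the LVM tooling itself works; if even 'pvs' cannot
        # run (e.g. due to a malformed /etc/lvm.conf), raise instead of
        # silently answering "not a physical volume".
        try:
            check_output(['pvs'], stderr=STDOUT)
        except CalledProcessError as exc:
            raise RuntimeError('LVM tooling unusable, refusing to guess: %s'
                               % exc.output.decode(errors='replace'))
        try:
            check_output(['pvdisplay', block_device], stderr=STDOUT)
            return True
        except CalledProcessError:
            return False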