2023-06-29 05:16:37
Paul Goins
description
During a recent maintenance window, a workaround for https://bugs.launchpad.net/cinder/+bug/2023073 was put in place by editing /etc/lvm/lvm.conf to set "wipe_signatures_when_zeroing_new_lvs = 0".
Unfortunately, at one point the file was rewritten incorrectly, so that the line instead read "wipe_signatures_when_zeroing_new_lvs = ", i.e. no value for the key.
This caused certain charmhelpers checks to return incorrect answers about whether disks were already initialized for LVM, resulting in the charm re-initializing those disks.
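For reference, the intended vs. broken configuration looked roughly like the following; the placement under the "allocation" section is my assumption, and the effect of the empty value (LVM's config parser rejecting the file, so LVM commands exit non-zero) is inferred from the behaviour described below:

    # Intended workaround in /etc/lvm/lvm.conf (section placement assumed):
    allocation {
        wipe_signatures_when_zeroing_new_lvs = 0
    }

    # What the file ended up containing after the bad rewrite - an empty
    # value, which (as far as I can tell) the LVM config parser rejects,
    # causing tools like pvdisplay to exit non-zero:
    allocation {
        wipe_signatures_when_zeroing_new_lvs =
    }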
I'm attaching an excerpt from the Juju unit logs for cinder-lvm which shows the problem; a summary of what happened follows (with a rough sketch of the call chain after the list):
* reactive/layer_openstack.py:101:run_storage_backend is entered, and its code ends up calling CinderLVMCharm.cinder_configuration(), which in turn calls configure_block_devices().
* configure_block_devices() calls configure_lvm_storage().
* In configure_lvm_storage(), is_lvm_physical_volume(device) returns False, has_partition_table(device) also returns False, and thus prepare_volume(device) gets called.
* prepare_volume(device) calls clean_storage(device), which in turn calls zap_disk(device).
See the attachment for a detailed traceback.
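For clarity, here is a rough Python sketch of that call chain as I understand it from the logs. This is not the actual charm or charmhelpers source; the function names match the ones mentioned above, but the structure and the exact guard conditions (in particular how "overwrite" interacts with has_partition_table()) are my reconstruction:

    def configure_lvm_storage(block_devices, volume_group, overwrite=False):
        # Approximate flow only; see the attached traceback for the real path.
        for device in block_devices:
            if is_lvm_physical_volume(device):
                # Device already initialized for LVM: left alone.
                continue
            if has_partition_table(device) and not overwrite:
                # MBR/GPT present and overwrite is False: left alone.
                continue
            # In this incident both guards returned False, so execution
            # fell through to the destructive path:
            prepare_volume(device)

    def prepare_volume(device):
        clean_storage(device)  # calls zap_disk(device), wiping the device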
Re: the has_partition_table() check: I confirmed that the "overwrite" configuration option is set to False, and the configured block-devices all lacked MBR/GPT partition tables, so that check would not have blocked this.
This leaves the is_lvm_physical_volume() check as the only protection against accidental re-initialization that applies in this case. In this particular charm version, it is implemented in charmhelpers as follows:
    from subprocess import check_output, CalledProcessError

    def is_lvm_physical_volume(block_device):
        try:
            check_output(['pvdisplay', block_device])
            return True
        except CalledProcessError:
            return False
In effect, anything that causes the above pvdisplay command to fail - like, perhaps, a misconfigured /etc/lvm/lvm.conf - can cause this check to falsely report that a device is *not* an LVM physical volume, resulting in it getting re-initialized.
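To illustrate why that check is fragile, here is a sketch (purely illustrative on my part, not a proposed patch) of a more conservative version that only returns False when pvdisplay clearly reports the device is not a PV, and refuses to guess on any other failure such as an unparsable lvm.conf. The exact error wording matched on is an assumption and may vary between LVM versions:

    from subprocess import run

    def is_lvm_physical_volume_conservative(block_device):
        # Sketch only: return True/False only when pvdisplay gives a clear
        # answer; any other failure raises instead of silently reporting
        # "not a PV" and allowing the disk to be zapped.
        result = run(['pvdisplay', block_device], capture_output=True, text=True)
        if result.returncode == 0:
            return True
        output = result.stdout + result.stderr
        # Wording assumed: pvdisplay typically prints "Failed to find
        # physical volume" for a device that simply is not a PV.
        if 'Failed to find physical volume' in output:
            return False
        raise RuntimeError('pvdisplay failed unexpectedly for %s: %s'
                           % (block_device, output.strip()))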
In summary: I believe this is a critical bug in charmhelpers which also critically impacts the cinder-lvm charm, with the risk of blowing away data on configured LVM devices in the case of a misconfigured /etc/lvm/lvm.conf.