I cannot delete a volume snapshot using either Horizon or the CLI. Cinder fails to remove the snapshot but does not report any error: the snapshot simply remains after the deletion attempt, in state "error_deleting", and the CLI command returns 0. See below for a detailed trace:
[root@phoenix ~]# openstack volume snapshot list
+--------------------------------------+------------------------+-------------+--------+------+
| ID | Name | Description | Status | Size |
+--------------------------------------+------------------------+-------------+--------+------+
| 3df6ef28-7f8d-45bc-a9e2-d67703c020ae | Debian Bullseye LVM 8G | | error | 8 |
+--------------------------------------+------------------------+-------------+--------+------+
[root@phoenix ~]# openstack volume snapshot delete 3df6ef28-7f8d-45bc-a9e2-d67703c020ae; echo $?
0
[root@phoenix ~]# openstack volume snapshot list
+--------------------------------------+------------------------+-------------+----------------+------+
| ID | Name | Description | Status | Size |
+--------------------------------------+------------------------+-------------+----------------+------+
| 3df6ef28-7f8d-45bc-a9e2-d67703c020ae | Debian Bullseye LVM 8G | | error_deleting | 8 |
+--------------------------------------+------------------------+-------------+----------------+------+
[root@phoenix ~]# #openstack volume snapshot delete --force 3df6ef28-7f8d-45bc-a9e2-d67703c020ae; echo $?
[root@phoenix ~]# openstack volume snapshot set --state error 3df6ef28-7f8d-45bc-a9e2-d67703c020ae; echo $?
0
[root@phoenix ~]# openstack volume snapshot delete --force 3df6ef28-7f8d-45bc-a9e2-d67703c020ae; echo $?
0
[root@phoenix ~]# openstack volume snapshot list
+--------------------------------------+------------------------+-------------+----------------+------+
| ID | Name | Description | Status | Size |
+--------------------------------------+------------------------+-------------+----------------+------+
| 3df6ef28-7f8d-45bc-a9e2-d67703c020ae | Debian Bullseye LVM 8G | | error_deleting | 8 |
+--------------------------------------+------------------------+-------------+----------------+------+
[root@phoenix ~]# vim /etc/cinder/cinder.conf
[root@phoenix ~]# openstack volume snapshot set --state error 3df6ef28-7f8d-45bc-a9e2-d67703c020ae; echo $?
0
[root@phoenix ~]# openstack volume snapshot delete --force 3df6ef28-7f8d-45bc-a9e2-d67703c020ae; echo $?
0
[root@phoenix ~]# openstack volume snapshot list
+--------------------------------------+------------------------+-------------+----------------+------+
| ID | Name | Description | Status | Size |
+--------------------------------------+------------------------+-------------+----------------+------+
| 3df6ef28-7f8d-45bc-a9e2-d67703c020ae | Debian Bullseye LVM 8G | | error_deleting | 8 |
+--------------------------------------+------------------------+-------------+----------------+------+
[root@phoenix ~]#
[root@phoenix ~]# openstack volume snapshot delete --force 3df6ef28-7f8d-45bc-a9e2-d67703c020ae; echo $?
0
[root@phoenix ~]# openstack volume snapshot list
+--------------------------------------+------------------------+-------------+----------------+------+
| ID | Name | Description | Status | Size |
+--------------------------------------+------------------------+-------------+----------------+------+
| 3df6ef28-7f8d-45bc-a9e2-d67703c020ae | Debian Bullseye LVM 8G | | error_deleting | 8 |
+--------------------------------------+------------------------+-------------+----------------+------+
journalctl does not yield any output, even though I enabled full logging in /etc/cinder/cinder.conf:
#log_config_append = <None>
...
default_log_levels = amqp=DEBUG,amqplib=DEBUG,boto=DEBUG,qpid=DEBUG,sqlalchemy=DEBUG,suds=DEBUG,oslo.messaging=DEBUG,oslo_messaging=DEBUG,iso8601=DEBUG,requests.packages.urllib3.connectionpool=DEBUG,urllib3.connectionpool=DEBUG,websocket=DEBUG,requests.packages.urllib3.util.retry=DEBUG,urllib3.util.retry=DEBUG,keystonemiddleware=DEBUG,routes.middleware=DEBUG,stevedore=DEBUG,taskflow=DEBUG,keystoneauth=DEBUG,oslo.cache=DEBUG,oslo_policy=DEBUG,dogpile.core.dogpile=DEBUG
But errors are reported in /var/log/cinder/volume.log:
var/log/cinder/volume.log:103148:2023-03-09 15:18:20.274 3152 ERROR oslo_messaging.rpc.server cinder.exception.RemoteFSInvalidBackingFile: File /media/os-stores/cinder/mnt/ea003e1a1bee1aaf573aafc40cd188ff/volume-e67af022-0d5b-4253-9981-c8b34622e058.3df6ef28-7f8d-45bc-a9e2-d67703c020ae has invalid backing file /media/os-stores/cinder/mnt/ea003e1a1bee1aaf573aafc40cd188ff/volume-e67af022-0d5b-4253-9981-c8b34622e058
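For reference, a line like the one above can be located by searching the Cinder logs for the snapshot ID. This is only a minimal sketch, assuming the logs live under /var/log/cinder as on this host; `find_snapshot_errors` is a hypothetical helper name, not part of any OpenStack tool.

```shell
# Hypothetical helper: print ERROR log lines that mention a snapshot ID.
# Usage: find_snapshot_errors <snapshot-id> [log-dir]
find_snapshot_errors() {
    snap_id=$1
    log_dir=${2:-/var/log/cinder}   # assumed default log location
    # -r: recurse, -n: show line numbers, -F: match the ID as a fixed string;
    # the second grep keeps only lines logged at ERROR level.
    grep -rnF -- "$snap_id" "$log_dir" | grep -F ' ERROR '
}

# Example:
#   find_snapshot_errors 3df6ef28-7f8d-45bc-a9e2-d67703c020ae
```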
I tried to create a new qcow2 image to replace /media/os-stores/cinder/mnt/ea003e1a1bee1aaf573aafc40cd188ff/volume-e67af022-0d5b-4253-9981-c8b34622e058, and a snapshot based on this new image that I put in place of /media/os-stores/cinder/mnt/ea003e1a1bee1aaf573aafc40cd188ff/volume-e67af022-0d5b-4253-9981-c8b34622e058.3df6ef28-7f8d-45bc-a9e2-d67703c020ae. I then tried to remove the snapshot again using Horizon, but the deletion still fails and the same trace is produced in the logs.
As far as I can see, the only option left to me is to manually delete the snapshot and the volume, in a way similar to https://wiki.infn.it/progetti/cloud-areapd/operations/manual_delete_cinder_volumes. This is far from ideal though, as the risk of leaving the database in an inconsistent state is high.
Can you suggest a safe solution to remove the sticky snapshot?
> ERROR oslo_messaging.rpc.server cinder.exception.RemoteFSInvalidBackingFile: File /media/os-stores/cinder/mnt/ea003e1a1bee1aaf573aafc40cd188ff/volume-e67af022-0d5b-4253-9981-c8b34622e058.3df6ef28-7f8d-45bc-a9e2-d67703c020ae has invalid backing file /media/os-stores/cinder/mnt/ea003e1a1bee1aaf573aafc40cd188ff/volume-e67af022-0d5b-4253-9981-c8b34622e058
I think this failure occurs because the backing file should just be "volume-e67af022-0d5b-4253-9981-c8b34622e058" without a path specified. Do you know what sequence of operations led to this state?
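To check what the snapshot file actually records, something like the following might help. It is a rough sketch, not a verified fix: the paths are copied from the error message, `relative_backing_name` is a hypothetical helper, and it assumes `qemu-img` is installed on the host that mounts the share.

```shell
# Hypothetical helper: the name a backing-file reference would have
# if it carried no directory component.
relative_backing_name() {
    basename -- "$1"
}

# Paths taken from the error message above.
mnt=/media/os-stores/cinder/mnt/ea003e1a1bee1aaf573aafc40cd188ff
volume=${mnt}/volume-e67af022-0d5b-4253-9981-c8b34622e058
snapshot=${volume}.3df6ef28-7f8d-45bc-a9e2-d67703c020ae

echo "expected backing file: $(relative_backing_name "$volume")"

# Only inspect the image if qemu-img and the file are actually present.
if command -v qemu-img >/dev/null 2>&1 && [ -f "$snapshot" ]; then
    # The "backing file:" line shows what the qcow2 header records.
    qemu-img info "$snapshot" | grep -i 'backing file' || true
fi
```

If the header does record an absolute path, `qemu-img rebase -u -b <name> -F <format> <snapshot file>` can rewrite the reference without touching the data. I have not verified that this alone is enough for Cinder to accept the file, so back up both files before trying it.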