Bypass the dirty BDM entry no matter how it is produced
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | In Progress | Medium | Unassigned |
Bug Description
Sometimes the following dirty BDM entry (1. row) can be seen in the database: multiple BDMs with the same volume_id and instance_uuid.
mysql> select * from block_device_mapping ...
*************************** 1. row ***************************
(output truncated in this copy: two rows share the same instance_uuid and volume_id; the visible fields include delete_on_termination, connection_info and destination_type, and connection_info is NULL in row 1)
This then causes the volume detach to fail with the following error, since connection_info of row 1 is NULL.
2017-03-23 13:28:05.360 1865733 TRACE oslo_messaging. ...
(oslo.messaging traceback truncated in this copy)
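For illustration, here is a minimal, self-contained sketch of what this duplicate looks like and how it can be detected; the schema and data below are simplified stand-ins for the real block_device_mapping table, not copied from Nova:

```python
import sqlite3

# Simplified stand-in for Nova's block_device_mapping table; the real
# schema has many more columns. The data is made up to mirror the
# situation described in this report.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE block_device_mapping (
        id INTEGER PRIMARY KEY,
        instance_uuid TEXT,
        volume_id TEXT,
        connection_info TEXT,
        deleted INTEGER DEFAULT 0
    )
""")
# Row 1: the dirty entry left behind by a failed attach -- note the
# NULL connection_info that later breaks the detach.
conn.execute("INSERT INTO block_device_mapping VALUES (1, 'inst-1', 'vol-1', NULL, 0)")
# Row 2: the healthy entry written by the attach that succeeded.
conn.execute("INSERT INTO block_device_mapping VALUES "
             "(2, 'inst-1', 'vol-1', '{\"driver_volume_type\": \"iscsi\"}', 0)")

# Detect (instance_uuid, volume_id) pairs with more than one live row.
dupes = conn.execute("""
    SELECT instance_uuid, volume_id, COUNT(*)
    FROM block_device_mapping
    WHERE deleted = 0
    GROUP BY instance_uuid, volume_id
    HAVING COUNT(*) > 1
""").fetchall()
print(dupes)  # [('inst-1', 'vol-1', 2)]
```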
This kind of dirty data can be produced when we happen to fail to run this line _attach_ [1], for example if we:

1. lose the database during the volume_ operation, or
2. lose the MQ connection, or hit an RPC timeout, during the volume_ operation.
If you lose the database during any operation, things are going to be bad, so in general I'm not sure how realistic guarding against that case is. Losing an MQ connection or an RPC timing out is probably more realistic. The fix [2] seems to be trying to solve point 2.
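As a rough illustration of that failure window, here is a hedged sketch; all names are hypothetical stand-ins, not the actual Nova attach path. The point is only the ordering: the BDM row is written first, and connection_info is filled in by a later step that can fail.

```python
class RPCTimeout(Exception):
    """Stand-in for an MQ/RPC failure (point 2 above)."""

bdm_table = []  # stand-in for the block_device_mapping table

def attach_volume(instance_uuid, volume_id, fail_rpc=False):
    # Step 1: the BDM row exists before the attach has finished,
    # with connection_info still unset.
    row = {"instance_uuid": instance_uuid, "volume_id": volume_id,
           "connection_info": None}
    bdm_table.append(row)
    # Step 2: fetching connection info can fail; if it does, the row
    # from step 1 survives with connection_info NULL -- the dirty entry.
    if fail_rpc:
        raise RPCTimeout("MQ connection lost / RPC timed out")
    row["connection_info"] = '{"driver_volume_type": "iscsi"}'

try:
    attach_volume("inst-1", "vol-1", fail_rpc=True)  # first attempt fails
except RPCTimeout:
    pass
attach_volume("inst-1", "vol-1")  # the retry succeeds

# Two rows now exist for the same (instance_uuid, volume_id) pair,
# one of them with connection_info = None.
print(len(bdm_table), [r["connection_info"] for r in bdm_table])
```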
However, I'm wondering whether we can bypass the dirty BDM entry, using the condition that its connection_info is NULL, no matter how it was produced.
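A minimal sketch of that idea, assuming we can filter candidate BDMs before detaching; pick_bdm_for_detach and the row dicts are hypothetical, not Nova's detach code:

```python
def pick_bdm_for_detach(bdms, instance_uuid, volume_id):
    candidates = [b for b in bdms
                  if b["instance_uuid"] == instance_uuid
                  and b["volume_id"] == volume_id]
    # Proposed bypass: ignore dirty rows whose connection_info is NULL,
    # regardless of how they were produced.
    usable = [b for b in candidates if b["connection_info"] is not None]
    if not usable:
        raise LookupError("no usable BDM found for detach")
    return usable[0]

rows = [
    {"instance_uuid": "inst-1", "volume_id": "vol-1", "connection_info": None},
    {"instance_uuid": "inst-1", "volume_id": "vol-1",
     "connection_info": '{"driver_volume_type": "iscsi"}'},
]
print(pick_bdm_for_detach(rows, "inst-1", "vol-1"))  # returns the healthy row
```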
[1] https:/
[2] https:/
Changed in nova: status: Confirmed → In Progress
Yeah, we did attempt to fix this with [2] but couldn't find a reasonable way to handle more than one BDM with the same instance_uuid and volume_id.
I don't think connection_info being NULL is the correct way to avoid this, as all newly created BDMs would meet this criterion, making it impossible for us to find the BDM later when calling initialize_connection etc.
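For illustration, with made-up rows: a BDM that is mid-attach (created, but initialize_connection not yet called) carries the same NULL connection_info as the stale leftover, so the proposed filter cannot tell them apart.

```python
dirty_leftover = {"instance_uuid": "inst-1", "volume_id": "vol-1",
                  "connection_info": None}  # stale row from a failed attach
fresh_attach = {"instance_uuid": "inst-1", "volume_id": "vol-1",
                "connection_info": None}  # brand new row, attach in progress

print(dirty_leftover["connection_info"] is None
      and fresh_attach["connection_info"] is None)  # True: indistinguishable
```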
Should we mark this as a duplicate of bug #1427060 and continue there?