Swap volume of multiattached volume will corrupt data
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Cinder | New | High | Unassigned |
OpenStack Compute (nova) | Fix Released | High | Matt Riedemann |
Queens | Fix Committed | High | Matt Riedemann |
Rocky | Fix Committed | Undecided | Matt Riedemann |
Stein | Fix Committed | Undecided | Matt Riedemann |
Bug Description
We currently permit the following:
Create multiattach volumes a and b
Create servers 1 and 2
Attach volume a to servers 1 and 2
swap_volume(server 1, volume a, volume b)
In fact, we have a tempest test which tests exactly this sequence: api.compute.
The problem is that writes from server 2 during the copy operation on server 1 will continue to hit the underlying storage, but as server 1 doesn't know about them they won't be reflected on the copy on volume b. This will lead to an inconsistent copy, and therefore data corruption on volume b.
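The race described above can be illustrated with a toy model. This is only a sketch, not how nova or the hypervisor performs the copy: volumes are modelled as Python lists of blocks, and the function name and the `foreign_writes` parameter are hypothetical. It shows that a write from the second server landing on an already-copied block never reaches the destination.

```python
def copy_while_other_server_writes(source, foreign_writes):
    """Copy `source` block by block, as server 1 would during a swap.

    `foreign_writes` maps a copy step to an (index, data) write that a
    second server issues directly against the source at that step. The
    copier is not told about these writes (hypothetical toy model).
    """
    dest = []
    for step in range(len(source)):
        # Server 1 copies the next block to the destination volume.
        dest.append(source[step])
        # Server 2 writes straight to the shared source volume.
        if step in foreign_writes:
            index, data = foreign_writes[step]
            source[index] = data
    return dest


volume_a = ["a0", "a1", "a2", "a3"]
# Server 2 overwrites block 0 after it has already been copied.
volume_b = copy_while_other_server_writes(volume_a, {2: (0, "server2-write")})

print(volume_a)  # source contains the write from server 2
print(volume_b)  # copy still holds the stale block 0
```

In this model `volume_b` ends up holding the old contents of block 0 while `volume_a` holds the new write, which is the inconsistency the report describes.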
Also, this whole flow makes no sense for a multiattached volume because even if we managed a consistent copy all we've achieved is forking our data between the 2 volumes. The purpose of this call is to allow the operator to move volumes. We need a fundamentally different approach for multiattached volumes.
In the short term we should at least prevent data corruption by preventing swap volume of a multiattached volume. This would also cause the above tempest test to fail, but as I don't believe it's possible to implement the test safely this would be correct.
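A minimal sketch of such a short-term guard follows. It is not the actual nova or cinder change: the function and exception names are hypothetical, and the volume is assumed to be a plain dict carrying a `multiattach` flag and an `attachments` list.

```python
class SwapVolumeRejected(Exception):
    """Hypothetical error raised when a swap would be unsafe."""


def ensure_swap_volume_is_safe(volume):
    """Reject swap volume when the volume is attached more than once.

    `volume` is assumed to be a dict with a boolean "multiattach" key
    and an "attachments" list with one entry per attached server.
    """
    attachments = volume.get("attachments", [])
    if volume.get("multiattach", False) and len(attachments) > 1:
        raise SwapVolumeRejected(
            "volume %s has %d attachments; swap volume is not allowed"
            % (volume.get("id"), len(attachments))
        )


single = {"id": "vol-a", "multiattach": True,
          "attachments": [{"server_id": "server-1"}]}
shared = {"id": "vol-a", "multiattach": True,
          "attachments": [{"server_id": "server-1"},
                          {"server_id": "server-2"}]}

ensure_swap_volume_is_safe(single)  # passes: only one attachment
try:
    ensure_swap_volume_is_safe(shared)
except SwapVolumeRejected as exc:
    print(exc)
```

The check deliberately still allows a multiattach-capable volume that has a single attachment, since no second writer exists in that case; whether a real fix should be stricter is a design decision this sketch does not settle.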
Changed in cinder:
importance: Undecided → High
Changed in cinder:
assignee: nobody → Douglas (viroel)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Lee Yarwood (lyarwood)
Changed in cinder:
assignee: Douglas Viroel (dviroel) → nobody
Changed in nova:
assignee: Lee Yarwood (lyarwood) → Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Lee Yarwood (lyarwood)
Changed in nova:
assignee: Lee Yarwood (lyarwood) → Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Lee Yarwood (lyarwood)
Changed in nova:
assignee: Lee Yarwood (lyarwood) → Stephen Finucane (stephenfinucane)
Changed in nova:
assignee: Stephen Finucane (stephenfinucane) → Matt Riedemann (mriedem)
Changed in cinder:
assignee: nobody → Jun Chen (loongstore)
assignee: Jun Chen (loongstore) → nobody
As mentioned in the mailing list, I think this is also something to be controlled in Cinder during retype or volume live migration since that would be a fast fail for this scenario:
http://lists.openstack.org/pipermail/openstack-dev/2018-June/131234.html
Otherwise cinder calls swap volume in nova, the call fails back to cinder, and cinder then has to roll back; it's simpler to fail fast in the cinder API.