Images in inconsistent state when calls to registry fail during image deletion
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Glance | Invalid | Critical | Prateek Goel | |
Juno | New | Undecided | Unassigned | |
Kilo | New | Undecided | Unassigned | |
Liberty | New | Undecided | Unassigned | |
OpenStack Security Advisory | Won't Fix | Undecided | Unassigned | |
Bug Description
[0] shows a sample image that was left in an inconsistent state when a call to registry failed during image deletion.
Glance v1 API makes two registry calls when deleting an image.
The first call [1] is made to set the status of an image to deleted/
The other [2] deletes the rest of the metadata, which sets the 'deleted_at' and 'deleted' fields in the db.
If the first call fails, the image deletion request fails and the image is left intact in its previous status.
However, if the first call succeeds and the second one fails, the image is left in an inconsistent state where its status is set to pending_
If delayed delete is turned on, these images are never collected by the scrubber: because their 'deleted' field is not set, they do not appear as deleted images. As a result, they continue to occupy storage in the backend.
Also, further attempts at deleting these images will fail with a 404 because the status is already set to pending_
[0]: http://
[1]: https:/
[2]: https:/
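To make the failure mode above concrete, here is a minimal in-memory sketch. It is not Glance code: `FakeRegistry`, `delete_image`, `scrubber_candidates`, and the `INTERMEDIATE_STATUS` placeholder are all hypothetical names invented for illustration, and the real status value and registry client calls are the ones referenced in [1] and [2].

```python
class RegistryError(Exception):
    """Stands in for a failed call from the API node to the registry."""


# Placeholder for the status set by the first call (assumption, not the real value).
INTERMEDIATE_STATUS = "intermediate-delete-status"


class FakeRegistry:
    """Hypothetical in-memory stand-in for the registry; not the Glance client."""

    def __init__(self, fail_on_delete=False):
        self.fail_on_delete = fail_on_delete
        self.images = {}

    def add_image(self, image_id):
        self.images[image_id] = {
            "status": "active",
            "deleted": False,
            "deleted_at": None,
        }

    def update_status(self, image_id, status):
        # First registry call: only the status changes.
        self.images[image_id]["status"] = status

    def delete_metadata(self, image_id):
        # Second registry call: sets 'deleted' and 'deleted_at'.
        if self.fail_on_delete:
            raise RegistryError("registry unreachable")
        self.images[image_id]["deleted"] = True
        self.images[image_id]["deleted_at"] = "some-timestamp"


def delete_image(registry, image_id):
    """Two separate calls, as in the description; nothing undoes the first."""
    registry.update_status(image_id, INTERMEDIATE_STATUS)
    registry.delete_metadata(image_id)


def scrubber_candidates(registry):
    """Assumed scrubber filter: only images whose 'deleted' field is set."""
    return [i for i, img in registry.images.items() if img["deleted"]]
```

With `FakeRegistry(fail_on_delete=True)`, calling `delete_image` raises `RegistryError` after the status has already changed, so the image keeps the intermediate status while `scrubber_candidates` returns an empty list for it.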
description: | updated |
Changed in glance: | |
status: | New → Triaged |
importance: | Undecided → Critical |
milestone: | none → mitaka-1 |
assignee: | nobody → Hemanth Makkapati (hemanth-makkapati) |
information type: | Public Security → Public |
Changed in glance: | |
assignee: | Hemanth Makkapati (hemanth-makkapati) → Prateek Goel (prateek.goel) |
status: | Triaged → In Progress |
1. I agree, the image deletion operation should be atomic.
2. Image data left behind means there is a potential risk of filling up the storage quota and causing a DoS. Be mindful that this is a denial of service but NOT an exploit, as it depends on the operators' failure scenarios for g-api <-> reg communication.
3. The original description covers failure scenarios for v1 only, so v2 needs to be checked as well, as applicable.
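One possible way to approach point 1, sketched under assumptions: if the second call fails, make a best-effort attempt to restore the previous status before re-raising. The function name and the `set_status` / `delete_metadata` callables are hypothetical and are not part of the Glance API. This is a compensating rollback rather than true atomicity, since the restore call can itself fail.

```python
def delete_image_with_rollback(set_status, delete_metadata, image_id,
                               previous_status, intermediate_status):
    """Hypothetical sketch: undo the status change if the second call fails.

    set_status(image_id, status) and delete_metadata(image_id) are assumed
    callables standing in for the two registry calls.
    """
    set_status(image_id, intermediate_status)
    try:
        delete_metadata(image_id)
    except Exception:
        # Best effort only: if this restore also fails, the image is still
        # left inconsistent, so this narrows the window but does not close it.
        set_status(image_id, previous_status)
        raise
```

A caller would pass the image's status as read before the delete started, so a failed delete leaves the image in that status and a later retry has a chance of succeeding.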