Instances can be stuck in BACKUP status
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack DBaaS (Trove) | In Progress | Low | Yang Youseok |
Bug Description
If a Backup is issued for an instance and the backup hangs, or the GuestAgent never processes the message, the Backup can remain in the NEW state. This prevents any other Backups from occurring on that instance and leaves the Instance in a BACKUP state from then on.
How we report instance state:
### Check if there is a backup running for this instance
if Backup.
return InstanceStatus.
Option 1:
- Immediately set the Backup State to be BUILDING on the BackupAgent -- which means that _only_ the API ever sets a backup to NEW
- If the Backup state is NEW and the time since its created date exceeds some configured limit, the Guest was not able to consume the message, so a periodic poll should set the Backup to FAILED
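The periodic poll in Option 1 could be sketched roughly as follows. This is only an illustration, not Trove code: the `Backup` dataclass, the state constants, and `fail_unconsumed_backups` are hypothetical names, and the age limit is assumed to come from configuration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical state names mirroring those used in this report.
NEW, BUILDING, FAILED = "NEW", "BUILDING", "FAILED"


@dataclass
class Backup:
    # Hypothetical stand-in for a backup record; not the Trove model.
    id: str
    state: str
    created: datetime


def fail_unconsumed_backups(backups, max_new_age, now):
    """Mark backups still in NEW after max_new_age as FAILED.

    Under Option 1 the guest moves a backup to BUILDING as soon as it
    picks up the message, so an old NEW backup was never consumed.
    Returns the ids of the backups that were marked FAILED.
    """
    failed_ids = []
    for backup in backups:
        if backup.state == NEW and now - backup.created > max_new_age:
            backup.state = FAILED
            failed_ids.append(backup.id)
    return failed_ids
```

Backups already in BUILDING are left alone here, since under Option 1 that state shows the guest did receive the request.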
Option 2:
- Implement a periodic task that finds all Backups in NEW or BUILDING state that have exceeded the backup duration window and marks them as failed.
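A minimal sketch of the Option 2 task, under the assumption that the backup duration window is a single configured value measured from the created date. `BackupRecord`, `IN_FLIGHT_STATES`, and `reap_stale_backups` are hypothetical names for illustration, not part of Trove.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# States treated as "still running" in Option 2 (names taken from the report).
IN_FLIGHT_STATES = {"NEW", "BUILDING"}


@dataclass
class BackupRecord:
    # Hypothetical stand-in for a backup record; not the Trove model.
    id: str
    state: str
    created: datetime


def reap_stale_backups(backups, duration_window, now):
    """Mark NEW or BUILDING backups older than duration_window as FAILED.

    Returns the ids of the backups that were changed.
    """
    stale = [
        b for b in backups
        if b.state in IN_FLIGHT_STATES and now - b.created > duration_window
    ]
    for backup in stale:
        backup.state = "FAILED"
    return [b.id for b in stale]
```

Unlike Option 1, a single window covers both states, so it has to be long enough for the slowest legitimate backup.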
Changed in trove: | |
importance: | Undecided → Critical |
Changed in trove: | |
assignee: | nobody → Nikhil Manchanda (slicknik) |
milestone: | none → icehouse-2 |
Changed in trove: | |
milestone: | icehouse-2 → next |
Changed in trove: | |
importance: | Critical → Wishlist |
importance: | Wishlist → Medium |
Changed in trove: | |
importance: | Medium → Low |
Questions about Opt. #2:
- Who will set the duration window?
- Wouldn't running two periodic tasks put extra load on VM memory?
I think it would be better not to add another periodic task; Opt. #1 looks easier.