OpenStack DBaaS (Trove)

Instances can be stuck in BACKUP status

Bug #1252897 reported by Vipul Sabhaya on 2013-11-19

This bug affects 2 people

Affects                  Status       Importance  Assigned to   Milestone
OpenStack DBaaS (Trove)  In Progress  Low         Yang Youseok  OpenStack DBaaS (Trove) next

Bug Description

If a Backup is issued for an instance, and the backup hangs or the GuestAgent never processes the message, the Backup can remain in NEW state indefinitely. This prevents any other Backups from occurring on that instance, and leaves the Instance in BACKUP state from then on.

How we report instance state:

        ### Check if there is a backup running for this instance
        if Backup.running(self.id):
            return InstanceStatus.BACKUP
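
To make the failure mode concrete, here is a minimal in-memory sketch. It is not Trove's actual code: the `BackupState` values, the `Backup` class, the `"ACTIVE"` fallback status, and the assumption that both NEW and BUILDING count as running are all illustrative. Only the idea of a "backup running" check that yields a BACKUP status comes from the snippet above.

```python
from dataclasses import dataclass
from enum import Enum


class BackupState(Enum):
    NEW = "NEW"
    BUILDING = "BUILDING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Assumption: NEW and BUILDING both count as "running".
RUNNING_STATES = {BackupState.NEW, BackupState.BUILDING}


@dataclass
class Backup:
    instance_id: str
    state: BackupState = BackupState.NEW


def backup_running(backups, instance_id):
    """Return True if any backup for the instance is in a running state."""
    return any(
        b.instance_id == instance_id and b.state in RUNNING_STATES
        for b in backups
    )


def instance_status(backups, instance_id):
    """Mirror the check above: a running backup masks the real status."""
    if backup_running(backups, instance_id):
        return "BACKUP"
    return "ACTIVE"  # hypothetical stand-in for the instance's real status


# The guest never consumes the message, so the state never leaves NEW.
backups = [Backup(instance_id="inst-1")]
print(instance_status(backups, "inst-1"))  # prints "BACKUP"
```

Nothing in this model ever moves the backup out of NEW, so the instance keeps reporting BACKUP until something else changes the backup's state.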

Option 1:
- Immediately set the Backup State to be BUILDING on the BackupAgent -- which means that _only_ the API ever sets a backup to NEW
- If the Backup state is NEW and more than some configured time has passed since its created date, the Guest was not able to consume the message, so a periodic poll should set it to FAILED
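
Option 1 could look roughly like the sketch below. Everything here is hypothetical: the function names, the dict-based backup records, and the `NEW_TIMEOUT` setting are made up for illustration and are not Trove APIs.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical setting: how long a backup may stay in NEW before it is failed.
NEW_TIMEOUT = timedelta(minutes=5)


def api_create_backup(backup_id, now):
    """API side: the only place that ever sets a backup to NEW."""
    return {"id": backup_id, "state": "NEW", "created": now}


def guest_on_message(backup):
    """Guest side: flip to BUILDING before doing any real work."""
    if backup["state"] == "NEW":
        backup["state"] = "BUILDING"


def fail_unconsumed(backups, now, timeout=NEW_TIMEOUT):
    """Periodic poll: NEW for longer than `timeout` means the guest never
    consumed the message, so mark the backup FAILED."""
    for backup in backups:
        if backup["state"] == "NEW" and now - backup["created"] > timeout:
            backup["state"] = "FAILED"


start = datetime(2013, 11, 19, tzinfo=timezone.utc)
consumed = api_create_backup("b1", start)
lost = api_create_backup("b2", start)
guest_on_message(consumed)  # only b1's message reaches the guest

fail_unconsumed([consumed, lost], now=start + timedelta(minutes=6))
print(consumed["state"], lost["state"])  # prints "BUILDING FAILED"
```

Because `guest_on_message` only acts on backups that are still NEW, a message that arrives after the poll has already failed the backup leaves it FAILED in this sketch. Whether that is the desired behaviour is a design decision.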

Option 2:
- Implement a periodic task that finds all Backups in NEW or BUILDING state that have exceeded the backup duration window and marks them as failed.
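
A rough sketch of the sweep that Option 2 describes, assuming dict-based backup records and a hypothetical `BACKUP_DURATION_WINDOW` setting. How the sweep would be scheduled inside Trove is left open here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical setting: the longest a backup is expected to take.
BACKUP_DURATION_WINDOW = timedelta(hours=2)
IN_FLIGHT_STATES = ("NEW", "BUILDING")


def sweep_stale_backups(backups, now, window=BACKUP_DURATION_WINDOW):
    """Mark every in-flight backup older than `window` as FAILED.

    Returns the ids that were changed so the caller can log them.
    """
    stale = [
        b for b in backups
        if b["state"] in IN_FLIGHT_STATES and now - b["created"] > window
    ]
    for backup in stale:
        backup["state"] = "FAILED"
    return [b["id"] for b in stale]


now = datetime(2013, 11, 19, 12, 0, tzinfo=timezone.utc)
backups = [
    {"id": "b1", "state": "NEW", "created": now - timedelta(hours=3)},
    {"id": "b2", "state": "BUILDING", "created": now - timedelta(hours=3)},
    {"id": "b3", "state": "BUILDING", "created": now - timedelta(minutes=30)},
    {"id": "b4", "state": "COMPLETED", "created": now - timedelta(hours=5)},
]
print(sweep_stale_backups(backups, now))  # prints ['b1', 'b2']
```

Unlike Option 1, this sweep also catches backups that started but then hung in BUILDING, at the cost of needing a window large enough for the slowest legitimate backup.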

Vipul Sabhaya (vipuls) on 2013-11-19

Changed in trove:
importance:	Undecided → Critical

Denis M. (dmakogon) wrote on 2013-11-26:

Questions about Opt. #2:
- Who will set the duration window?
- Wouldn't running two periodic tasks load VM memory?

I think it would be better not to add another periodic task; Opt. #1 looks easier.

Craig Vyvial (cp16net) wrote on 2013-11-27:

Well, what if the agent got the message after the configured period?
Would the states ever change?

Nikhil Manchanda (slicknik) on 2013-12-03

Changed in trove:
assignee:	nobody → Nikhil Manchanda (slicknik)
milestone:	none → icehouse-2

Denis M. (dmakogon) wrote on 2013-12-05:

It could change state after a restart.

OpenStack Infra (hudson-openstack) wrote on 2014-01-08: Fix proposed to trove (master)

Fix proposed to branch: master
Review: https://review.openstack.org/65553

Changed in trove:
status:	New → In Progress

Michael Basnight (hubcap) on 2014-01-21

Changed in trove:
milestone:	icehouse-2 → next

Michael Basnight (hubcap) on 2014-02-04

Changed in trove:
importance:	Critical → Wishlist
importance:	Wishlist → Medium

Nikhil Manchanda (slicknik) on 2014-04-18

Changed in trove:
importance:	Medium → Low

Yang Youseok (ileixe) wrote on 2018-03-25:

Although this bug is quite old, it still seems to be present in recent versions.

I think it's reasonable not to treat the NEW state as a running state, because there is no way to revert a failed backup that the guest agent never received, for whatever reason. (In my case, upgrading from Liberty to Newton made oslo.context incompatible, so the backup message was not processed and the backup stayed stuck in NEW state.)

On the caller side, the NEW state seems to be used only for blocking 'delete backup while running'.
Since the backup has not actually started at all, I don't think that matters.
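
As a rough illustration of that idea (not the actual patch), the set of states counted as running could simply leave out NEW. The plain-string state names and both helper functions below are assumptions made for this sketch.

```python
# Assumption: plain-string state names; only BUILDING counts as running here.
RUNNING_STATES = frozenset({"BUILDING"})  # NEW is deliberately left out


def backup_running(states):
    """True if any of the instance's backups has actually started."""
    return any(state in RUNNING_STATES for state in states)


def can_delete(state):
    """Deletion is blocked only while the backup is really in progress."""
    return state not in RUNNING_STATES


print(backup_running(["NEW"]), can_delete("NEW"))  # prints "False True"
```

With this change a backup stuck in NEW no longer pins the instance in BACKUP status, and it can be deleted by the caller.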

OpenStack Infra (hudson-openstack) wrote on 2018-03-25:

Fix proposed to branch: master
Review: https://review.openstack.org/556176

Changed in trove:
assignee:	Nikhil Manchanda (slicknik) → Yang Youseok (ileixe)

OpenStack Infra (hudson-openstack) wrote on 2018-08-17: Change abandoned on trove (master)

Change abandoned by Yang Youseok (<email address hidden>) on branch: master
Review: https://review.openstack.org/556176
Reason: Invalid commit.
