restore-backup fails to delete model instances

Bug #1559715 reported by Curtis Hovey
This bug affects 1 person
Affects        Status   Importance  Assigned to  Milestone
juju-ci-tools  Triaged  High        Unassigned
juju-core      Triaged  High        Unassigned

Bug Description

As seen in
    http://reports.vapour.ws/releases/issue/570e785e749a5606ed5519ce
restore-backup can pass, but it is not very reliable. The model seems to have been lost. There are two overlapping issues.

1. Juju needs to make restore reliable

2. juju-ci-tools needs to verify that a restore provided a working model.

Instances are being left behind, stealing resources needed for testing. The instances claim to be machine-0. The machine is ubuntu/0. The model is not torn down.

At this time, a person needs to visit aws/us-east-1 each day and delete machines left running.

Curtis Hovey (sinzui)
tags: added: destroy-controller
Changed in juju-ci-tools:
status: New → Triaged
importance: Undecided → High
Cheryl Jennings (cherylj) wrote :

@sinzui - do you have any more details on when / which machines are left behind?

Controllers? Other machines?

Left over after a failed restore? Or a successful one?

Curtis Hovey (sinzui)
description: updated
summary: - restore-backup is unreliable
+ restore-backup fails to delete model instnces
Curtis Hovey (sinzui) wrote :

The assess_recovery.py test needs an update to fail juju when juju loses models/instances. The test does not provide enough information about the controllers or models, nor does it show the machines in the admin and hosted models. We also want the recovery test to do something to exercise juju after juju reports it is successful.
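
One way the "fail when machines are lost" check could look is sketched below. This is only an illustration, not code from assess_recovery.py: the helper name find_lost_machines is hypothetical, and it assumes the status captured before the backup and after the restore has already been parsed into dicts with a top-level 'machines' mapping keyed by machine id.

```python
def find_lost_machines(status_before, status_after):
    """Return ids of machines seen before the backup but missing after restore.

    Both arguments are assumed to be parsed status dicts with a top-level
    'machines' mapping keyed by machine id (an assumption for this sketch).
    """
    before = set(status_before.get('machines') or {})
    after = set(status_after.get('machines') or {})
    # Any id present before but absent after means the restore lost a machine.
    return sorted(before - after)
```

A test could then fail when the returned list is non-empty, rather than only checking that the restore command itself exited cleanly.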

summary: - restore-backup fails to delete model instnces
+ restore-backup fails to delete model instances
Curtis Hovey (sinzui)
description: updated
Curtis Hovey (sinzui) wrote :

From the duplicate

The assess_recovery.py script used by several tests can report success when it's actually failing late on.

<http://reports.vapour.ws/releases/3881/job/functional-backup-restore/attempt/4002>

2016-04-12 20:58:40 ERROR cmd supercommand.go:448 connecting with bootstrap config: unknown model: "7aef8efb-9fb8-4cb0-8751-a2e40e1a82fa" (not found)
2016-04-12 20:58:40 ERROR Command '('juju', '--show-log', 'show-status', '-m', 'functional-backup-restore', '--format', 'yaml')' returned non-zero exit status 1
...
Finished: SUCCESS

This is hiding breakage from the reports site.
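
A minimal sketch of how a late failure could be made to reach the job result follows. It is not the actual assess_recovery.py code. The function name verify_restore and the injectable run parameter are hypothetical; the command tuple is copied from the log above. The idea is simply that a non-zero exit from the status command becomes a non-zero exit of the script.

```python
import subprocess
import sys


def verify_restore(model, run=subprocess.check_output):
    """Return 0 if status can be read for the model after restore, else 1.

    `run` is injectable so the check can be exercised without juju;
    by default it raises CalledProcessError on a non-zero exit status.
    """
    command = ('juju', '--show-log', 'show-status', '-m', model,
               '--format', 'yaml')
    try:
        run(command)
    except subprocess.CalledProcessError as e:
        # Report the failure and propagate it instead of swallowing it.
        print('restore verification failed: {}'.format(e), file=sys.stderr)
        return 1
    return 0


if __name__ == '__main__':
    # The model name is taken from the command line; exit non-zero on failure
    # so the CI job is not marked SUCCESS.
    sys.exit(verify_restore(sys.argv[1]))
```

With this shape, the "unknown model" error shown in the log would end the job with a failing exit status instead of "Finished: SUCCESS".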
