Instance API task state can get stuck if locked

Bug #1025722 reported by Dan Smith
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Dan Smith

Bug Description

If an instance is locked by an admin and a user tries to make a state change, the instance can get a broken task_state in the API layer that prevents even an admin from taking action on it.

Steps to reproduce:
1. nova boot .... foo
2. nova lock foo (as admin)
3. nova stop foo (as user)
4. nova unlock foo (as admin)
5. nova stop foo (as either) -> fails due to task_state==stopping
6. nova start foo (as either) -> fails due to vm_state==active

The only way out of it is to delete the instance or hack the database.
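
The sequence above can be modeled with a small toy sketch. It is not nova's code: the `Instance` class, the `stop`/`start`/`attempt` helpers and the exception names are hypothetical, and it assumes that task_state is recorded before the lock is enforced, which is what the reported behaviour suggests.

```python
class InvalidState(Exception):
    """Request rejected because of the instance's current state."""


class InstanceLocked(Exception):
    """A non-admin tried to act on a locked instance."""


class Instance:
    def __init__(self):
        self.vm_state = "active"
        self.task_state = None
        self.locked = False


def stop(instance, is_admin):
    # Reject the request if another task appears to be in flight.
    if instance.task_state is not None:
        raise InvalidState(f"task_state=={instance.task_state}")
    # Assumption: task_state is set before the lock check runs.
    instance.task_state = "stopping"
    if instance.locked and not is_admin:
        # The failure path never clears task_state: this is the bug.
        raise InstanceLocked("instance is locked")
    instance.vm_state = "stopped"
    instance.task_state = None


def start(instance, is_admin):
    if instance.vm_state != "stopped":
        raise InvalidState(f"vm_state=={instance.vm_state}")
    instance.vm_state = "active"


def attempt(action, instance, is_admin):
    """Run an action and report 'ok' or the rejection reason."""
    try:
        action(instance, is_admin)
        return "ok"
    except (InvalidState, InstanceLocked) as exc:
        return str(exc)


def reproduce():
    foo = Instance()                       # 1. boot
    foo.locked = True                      # 2. lock (as admin)
    results = [attempt(stop, foo, False)]  # 3. stop (as user)
    foo.locked = False                     # 4. unlock (as admin)
    results.append(attempt(stop, foo, True))   # 5. stop (as admin)
    results.append(attempt(start, foo, True))  # 6. start (as admin)
    return results
```

In this model, steps 5 and 6 are rejected even for an admin, because the leftover task_state blocks the stop and the unchanged vm_state blocks the start.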

Dan Smith (danms)
Changed in nova:
assignee: nobody → Dan Smith (danms)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/9922

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/9932

Mark McLoughlin (markmc)
Changed in nova:
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/9932
Committed: http://github.com/openstack/nova/commit/d8d7100f8c10ecd388d1943bee9298a913a6990a
Submitter: Jenkins
Branch: master

commit d8d7100f8c10ecd388d1943bee9298a913a6990a
Author: Dan Smith <email address hidden>
Date: Tue Jul 17 13:15:13 2012 -0700

    Revert task_state on failed instance actions

    Right now, the task_state logic in compute/api.py can be broken, such
    that instances can get stuck in an uneditable state if an action is
    performed that fails. The task_state remains something like 'stopping'
    even though the action has not been completed or queued, and further
    requests that depend on task_state will fail (see check_instance_state()
    in compute/api.py).

    The only way out of it is to delete the instance or hack the database.

    This patch adds a reverts_task_state() decorator to compute/manager.py,
    which, upon operation failure, reverts the instance's task_state back
    to None.

    It also adds a test_state_revert() test to verify that all the actions
    marked for state reversion do the right thing. It also corrects several
    other tests that specifically expect the task_state to remain after
    an error has occurred.

    Fixes bug 1025722

    Change-Id: Id4358c508156c713cb953dfa0f01a6f598bc1e7d
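
For readers who want a feel for the approach described in the commit message, here is a minimal sketch of a decorator that resets task_state when the wrapped action raises. Only the name `reverts_task_state` comes from the commit message; the body, the dict-based instance and the `FakeManager` class are assumptions made for illustration and do not reproduce the merged nova code.

```python
import functools


def reverts_task_state(function):
    """Reset the instance's task_state to None if the action raises."""

    @functools.wraps(function)
    def decorated(self, instance, *args, **kwargs):
        try:
            return function(self, instance, *args, **kwargs)
        except Exception:
            # Hypothetical: a plain dict stands in for the instance record.
            instance["task_state"] = None
            # Re-raise so callers still see the original failure.
            raise

    return decorated


class FakeManager:
    """Hypothetical stand-in for a compute manager."""

    @reverts_task_state
    def stop_instance(self, instance, fail=False):
        if fail:
            raise RuntimeError("stop failed")
        instance["vm_state"] = "stopped"
        instance["task_state"] = None
```

With this in place, a failed `stop_instance` call leaves the instance with task_state None rather than 'stopping', so a later request is no longer rejected on task_state alone.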

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → folsom-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: folsom-3 → 2012.2