Instance snapshot stuck in Shutoff status

Bug #997867 reported by Louis Kang
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Yun Mao

Bug Description

There is a synchronization problem: occasionally the snapshot action leaves the snapshotted instance permanently in “Shutoff” Status, even though checking the hypervisor shows that the instance is actually “Running”.

It seems that the instance goes into the Shutoff state to perform the snapshot operation, which is normal. However, if the snapshot is performed at the wrong moment, the instance status in the DB goes out of sync with the instance status on the hypervisor.

Nova eventually corrects the Power State to “Running”, but the Status is never corrected and stays “Shutoff” indefinitely.

To reproduce the problem:

- Create a VM and find the compute node it is running on.
- Tail the nova-compute.log file and grep for ‘_sync_power_state’. You should see a line like:
  “2012-05-10 04:07:11 DEBUG nova.manager [-] Skipping ComputeManager._sync_power_states, 1 ticks left until next run from (pid=1528) periodic_tasks /opt/stack/nova/nova/manager.py:147”
- If you start a snapshot when you see “1 ticks left”, it will cause the error.
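
To make the timing in the steps above easier to catch, here is a minimal sketch of a helper that spots the “1 ticks left” message in a log line. The regular expression is based only on the log line quoted above, and the function names `ticks_left` and `is_snapshot_window` are hypothetical; other nova versions may word the message differently.

```python
import re

# Pattern derived from the single log line quoted in this report (an assumption,
# not a documented format).
_TICKS_RE = re.compile(
    r"Skipping ComputeManager\._sync_power_states, (\d+) ticks left"
)


def ticks_left(line):
    """Return the tick count from a 'Skipping ... ticks left' line, else None."""
    match = _TICKS_RE.search(line)
    return int(match.group(1)) if match else None


def is_snapshot_window(line):
    """True when the line reports exactly one tick left before the next sync."""
    return ticks_left(line) == 1
```

Feeding each line of nova-compute.log to `is_snapshot_window` (for example, from a loop reading the output of `tail -f`) would flag the moment at which the report says starting a snapshot triggers the problem.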

Looking at the code, _sync_power_states detects that the VM Status and Power State are Shutoff (because of the snapshot) and updates the DB, but it seems that after the snapshot completes there is no mechanism to update the Status.
If _sync_power_states runs again, it only updates the Power State from Shutoff to Running; the Status remains Shutoff.
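
The behavior described above can be illustrated with a toy model. This is not nova's code: `InstanceRecord`, `sync_power_state`, and `simulate` are hypothetical names, and the logic only mirrors what this report describes (a sync that can set Status to Shutoff but has no path to set it back).

```python
from dataclasses import dataclass

RUNNING = "Running"
SHUTOFF = "Shutoff"


@dataclass
class InstanceRecord:
    """Hypothetical stand-in for the DB row of an instance."""
    status: str = RUNNING       # shown as "Status"
    power_state: str = RUNNING  # shown as "Power State"


def sync_power_state(record, hypervisor_state):
    """Toy version of the periodic sync as described in the report."""
    if record.power_state != hypervisor_state:
        record.power_state = hypervisor_state
        if hypervisor_state == SHUTOFF:
            # The sync sees the instance stopped (here, because of the
            # snapshot) and records it as shut off.
            record.status = SHUTOFF
        # There is no branch that restores status once the hypervisor
        # reports Running again; this is the gap the report points at.
    return record


def simulate():
    record = InstanceRecord()
    sync_power_state(record, SHUTOFF)  # sync runs while the snapshot has the VM stopped
    sync_power_state(record, RUNNING)  # sync runs again after the snapshot finishes
    return record
```

In this model, `simulate()` ends with `power_state` back at Running while `status` is still Shutoff, which matches the symptom reported here.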

The only workaround is to Reboot the instance.

This is happening on Essex 2012-2-dev (LOCALBRANCH) with the KVM/QEMU hypervisor.

Changed in nova:
status: New → Triaged
importance: Undecided → Medium
tags: added: low-hanging-fruit
Louis Kang (louiskang)
tags: added: in-stable-essex
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/8254

Changed in nova:
assignee: nobody → Yun Mao (yunmao)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/8254
Committed: http://github.com/openstack/nova/commit/129b87e17d3333aeaa9e855a70dea51e6581ea63
Submitter: Jenkins
Branch: master

commit 129b87e17d3333aeaa9e855a70dea51e6581ea63
Author: Yun Mao <email address hidden>
Date: Tue Jun 5 14:55:34 2012 -0400

    vm state and task state management

    partially implements bp task-management
    fixes bug 997867

    also see http://wiki.openstack.org/VMState

    Refactored the following API/state:
    * rebuild
    * migrate
    * resize
    * start
    * stop
    * delete
    * soft delete
    * rework sync_power_state in compute/manager.

    fix broken tests, add transition diagram in dot

    Change-Id: I3c5a97508a6dad7175fba12828bd3fa6ef1e50ee

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → folsom-2
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: folsom-2 → 2012.2