Instance snapshot stuck in Shutoff status
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Medium | Yun Mao |
Bug Description
There is a sync problem: occasionally the snapshot action leaves the snapshotted instance permanently in “Shutoff” Status, even though checking the hypervisor shows that the instance is actually in “Running” Status.
It seems that the instance goes into the shutoff state to perform the snapshot operation, which is normal. However, if the snapshot is performed at the wrong moment, the instance status in the DB goes out of sync with the instance status on the hypervisor.
Nova will eventually correct the Power State to “Running”, but the Status is never corrected and stays “Shutoff” indefinitely.
To reproduce the problem:
- Create a VM, find the compute node it is running on.
- Tail the nova-compute.log file and grep for ‘_sync_power_state’
You should get “2012-05-10 04:07:11 DEBUG nova.manager [-] Skipping ComputeManager.
- If you start a snapshot when the log shows "1 tick left", the error occurs.
Looking at the code, _sync_power_states detects that the VM Status and Power State are Shutoff (because of the snapshot) and updates the DB, but it seems that after the snapshot completes there is no mechanism to update the Status.
If _sync_power_states runs again, it only updates the Power State from Shutoff to Running, but the Status remains Shutoff.
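The sequence described above can be sketched as a toy model. This is not nova's code: the `InstanceRecord` class, the `sync_power_state` function, and the `simulate_race` helper are all hypothetical names, and the logic only mirrors the behaviour as the report describes it (a sync that writes both fields when it sees Shutoff, but only the Power State when it sees Running).

```python
from dataclasses import dataclass

RUNNING = "Running"
SHUTOFF = "Shutoff"


@dataclass
class InstanceRecord:
    """Toy stand-in for the instance row in the DB (hypothetical, not nova's model)."""
    status: str = RUNNING       # the "Status" shown to the user
    power_state: str = RUNNING  # the "Power State" shown to the user


def sync_power_state(record: InstanceRecord, hypervisor_power_state: str) -> None:
    """Hypothetical periodic sync, modelled on the behaviour described in the report."""
    if hypervisor_power_state == SHUTOFF:
        # The instance looks powered off, so both fields are written to the DB.
        record.power_state = SHUTOFF
        record.status = SHUTOFF
    else:
        # The instance is running: only the power state is corrected,
        # nothing puts the Status back.
        record.power_state = hypervisor_power_state


def simulate_race() -> InstanceRecord:
    record = InstanceRecord()
    # The periodic sync fires while the snapshot has the instance shut off.
    sync_power_state(record, SHUTOFF)
    # The snapshot finishes, the hypervisor reports Running, and the sync runs again.
    sync_power_state(record, RUNNING)
    return record


if __name__ == "__main__":
    print(simulate_race())
```

In this model the second sync restores the Power State but leaves the Status untouched, which matches the symptom reported here.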
The only workaround is to Reboot the instance.
This is happening on Essex 2012-2-dev (LOCALBRANCH) with the KVM/QEMU hypervisor.
Changed in nova: | |
status: | New → Triaged |
importance: | Undecided → Medium |
tags: | added: low-hanging-fruit |
tags: | added: in-stable-essex |
Changed in nova: | |
milestone: | none → folsom-2 |
status: | Fix Committed → Fix Released |
Changed in nova: | |
milestone: | folsom-2 → 2012.2 |
Fix proposed to branch: master
Review: https://review.openstack.org/8254