Nova instance not boot after host restart but still show as Running
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Expired | Undecided | Unassigned |
Juno | Fix Released | Undecided | Unassigned |
Bug Description
The nova host lost power. After it was restarted, the previously running instance is still shown in the
"Running" state but has not actually been started:
root@allinone-
+------
| ID | Name | Status | Task State | Power State | Networks |
+------
| 13d9eead-
+------
root@allinone-
root 95513 90291 0 14:46 pts/0 00:00:00 grep --color=auto -i qemu
Please note the resume_
Changed in nova:
assignee: nobody → Alex Xu (xuhj)
Changed in nova:
status: New → Confirmed
I have a customer reporting a slightly different issue: resume_guests_state_on_host_boot=True is set, but it has no effect because the power_state from the driver (libvirt) is the same as the power_state in the database, so the instance is ignored in _init_instance() in the compute manager.
The recreate is:
- create an instance, wait for it to be ACTIVE
- reboot the host/hypervisor
- libvirt will shut down the guest, which emits the STOPPED lifecycle event from the libvirt driver to the compute manager
- the handle_lifecycle_event code in the compute manager will call _sync_instance_power_state, which will see that the vm_state
is ACTIVE but the vm_power_state (from the driver) is 4 (shutdown), so it will stop the instance via the stop API, which sets the
vm_state to stopped.
- once the host/hypervisor is back up, libvirt will automatically restart the guest VM, which triggers a lifecycle event from the
libvirt driver to the compute manager. Since the vm_state is STOPPED from before, nova assumes the user wants the instance to
be stopped and calls the stop API to shut down the instance again.
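The sequence above can be replayed with a small toy model. This is only a sketch of the behaviour as described in this comment, not nova's actual code: `ToyInstance`, `sync_power_state` and `replay_host_reboot` are hypothetical names, and the value 1 for a running guest is an assumption (only 4 for shutdown is quoted above).

```python
from dataclasses import dataclass

# Power states as seen from the driver. SHUTDOWN = 4 is quoted in the
# comment above; RUNNING = 1 is an assumption made for this sketch.
RUNNING = 1
SHUTDOWN = 4

# vm_state values tracked in the database (simplified).
ACTIVE = "active"
STOPPED = "stopped"


@dataclass
class ToyInstance:
    """Hypothetical stand-in for an instance record."""
    vm_state: str = ACTIVE
    power_state: int = RUNNING
    stop_calls: int = 0


def sync_power_state(instance, driver_power_state):
    """Toy version of the sync step triggered by a lifecycle event."""
    if instance.vm_state == ACTIVE and driver_power_state == SHUTDOWN:
        # Guest went down while the database says ACTIVE:
        # the stop API is called and vm_state becomes stopped.
        instance.stop_calls += 1
        instance.vm_state = STOPPED
    elif instance.vm_state == STOPPED and driver_power_state == RUNNING:
        # Guest came back while the database says STOPPED:
        # the stop API is called again, shutting the guest down.
        instance.stop_calls += 1
        driver_power_state = SHUTDOWN
    instance.power_state = driver_power_state
    return instance


def replay_host_reboot():
    """Replay the two lifecycle events from the recreate steps."""
    inst = ToyInstance()
    sync_power_state(inst, SHUTDOWN)  # libvirt shuts the guest down
    sync_power_state(inst, RUNNING)   # libvirt restarts it after boot
    return inst
```

In this model the instance ends up with vm_state stopped and the guest shut down, even though the user never asked for a stop.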
Even with resume_guests_state_on_host_boot=True, the libvirt driver's resume_state_on_host_boot method will ignore the operation if the guest VM is running.
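The two conditions under which the resume is skipped, as described in this comment, can be written as a single predicate. This is a rough sketch under stated assumptions, not the real nova or libvirt driver logic: `would_resume_on_boot` is a hypothetical helper, and the value 1 for a running guest is assumed.

```python
RUNNING = 1   # assumed value for a running guest
SHUTDOWN = 4  # value quoted in the comment above


def would_resume_on_boot(resume_enabled, db_power_state, driver_power_state):
    """Return True only if a resume would actually be attempted.

    Hypothetical helper modelling the checks described above.
    """
    if not resume_enabled:
        # resume_guests_state_on_host_boot is not set.
        return False
    if db_power_state == driver_power_state:
        # Database and driver agree, so the instance is ignored
        # during compute manager start-up.
        return False
    if driver_power_state == RUNNING:
        # The driver skips the resume when the guest is already running.
        return False
    return True
```

For the customer case, both power states are already shutdown after the second stop, so `would_resume_on_boot(True, SHUTDOWN, SHUTDOWN)` is False in this model and the option has no effect.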