Removing nova-compute unit with scheduled but stopped VM breaks hypervisor-list api call
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Nova Cloud Controller Charm | Invalid | Undecided | Unassigned |
OpenStack Nova Compute Charm | Invalid | Undecided | Unassigned |
Ubuntu Cloud Archive | Fix Released | Medium | Unassigned |
Mitaka | Triaged | Medium | Unassigned |
nova (Ubuntu) | Fix Released | Medium | Unassigned |
Xenial | Triaged | Medium | Unassigned |
Bug Description
After removing the nova-compute unit on node mycloud-cs-003, nova hypervisor-list stopped working and started to return the error below to the user and nova-api-
While this may be an upstream issue, I believe the charm should probably handle this edge case.
When querying the database, I found that the service entry and the compute_node entry for the host are both marked as deleted, but there was still a VM scheduled on node mycloud-cs-003. I ran nova delete <instanceid> on the instance scheduled on that node, which succeeded, but the "running_vms" total in the compute_nodes table did not decrease. I then updated that row to running_vms = 0, but I am still seeing the traceback below in nova-api-
2017-12-19 17:59:35.733 218705 DEBUG nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
--
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.044 218705 ERROR nova.api.
2017-12-19 17:59:36.046 218705 INFO nova.api.
<class 'nova.exception
Steps to recreate:
1. Deploy nova-cloud-
2. Deploy a VM to the nova-compute environment.
3. Stop the instance.
4. Run juju remove-unit <nova-compute/X> for the unit that the VM was scheduled on.
5. Run nova hypervisor-list; it should exhibit this error.
Please let me know if this does not work.
Notes: This environment was previously upgraded from either Icehouse or Liberty to Mitaka. (I am guessing Liberty, since the "deleted" columns in the services and compute_nodes tables hold ordered, incrementing numbers rather than just 0 or 1.)
Running the OpenStack 17.02 charms, I believe on a trusty/mitaka cloud.
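The database state described above (service and compute_node rows marked as deleted while an instance is still assigned to the host) could be checked with a query along the following lines. This is only a sketch: it uses an in-memory SQLite database with a heavily simplified, hypothetical version of the nova tables, the second host and the instance UUIDs are made up for illustration, and it assumes that any non-zero value in the "deleted" column means the row is soft-deleted.

```python
import sqlite3

# Simplified stand-in for the nova tables mentioned in this report.
# The real schema has many more columns; this layout is an assumption.
SCHEMA = """
CREATE TABLE services (id INTEGER PRIMARY KEY, host TEXT, deleted INTEGER);
CREATE TABLE compute_nodes (
    id INTEGER PRIMARY KEY, host TEXT, running_vms INTEGER, deleted INTEGER
);
CREATE TABLE instances (
    id INTEGER PRIMARY KEY, uuid TEXT, host TEXT, vm_state TEXT, deleted INTEGER
);
"""

# Instances that are not deleted but whose host has no live compute_node row.
ORPHAN_QUERY = """
SELECT i.uuid, i.host, i.vm_state
FROM instances AS i
WHERE i.deleted = 0
  AND NOT EXISTS (
      SELECT 1 FROM compute_nodes AS c
      WHERE c.host = i.host AND c.deleted = 0
  )
ORDER BY i.uuid
"""


def build_example_db():
    """Create an in-memory database that mimics the reported state."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Removed unit: service and compute_node rows are soft-deleted
    # (non-zero "deleted" value), but a stopped instance still points at it.
    conn.execute("INSERT INTO services VALUES (3, 'mycloud-cs-003', 3)")
    conn.execute("INSERT INTO compute_nodes VALUES (3, 'mycloud-cs-003', 1, 3)")
    conn.execute(
        "INSERT INTO instances VALUES (1, 'uuid-stopped', 'mycloud-cs-003', 'stopped', 0)"
    )
    # Hypothetical healthy host for contrast.
    conn.execute("INSERT INTO services VALUES (4, 'mycloud-cs-004', 0)")
    conn.execute("INSERT INTO compute_nodes VALUES (4, 'mycloud-cs-004', 1, 0)")
    conn.execute(
        "INSERT INTO instances VALUES (2, 'uuid-active', 'mycloud-cs-004', 'active', 0)"
    )
    return conn


def find_orphaned_instances(conn):
    """Return (uuid, host, vm_state) for instances on hosts with no live compute node."""
    return conn.execute(ORPHAN_QUERY).fetchall()


if __name__ == "__main__":
    for row in find_orphaned_instances(build_example_db()):
        print(row)
```

Against a real deployment, the same idea would have to be adapted to the actual nova schema and run read-only against the nova database; the sketch only illustrates the shape of the check.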
Changed in charm-nova-compute:
status: New → Invalid
Changed in charm-nova-cloud-controller:
status: New → Invalid
FWIW, the cloud mentioned was indeed upgraded from Icehouse to Mitaka, and is running trusty.