hypervisor statistics could be incorrect
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Low | Zhenyu Zheng |
Newton | Fix Committed | Low | Matt Riedemann |
Ocata | Fix Committed | Low | Matt Riedemann |
Ubuntu Cloud Archive | Fix Released | Undecided | Unassigned |
Mitaka | Fix Released | Low | Unassigned |
nova (Ubuntu) | Fix Released | Undecided | Unassigned |
Xenial | Fix Released | Low | Unassigned |
Bug Description
[Impact]
If you deploy a nova-compute service to a node, delete that service (via the API), then deploy a new nova-compute service to that same node (i.e. same hostname), the database will have two service records, one marked as deleted and the other not. So far so good, until you run 'openstack hypervisor stats show', at which point the API aggregates the resource counts from both services. This has been fixed and backported as far back as Newton, so the problem still exists on Mitaka. I assume the reason why the patch was not backported to Mitaka is that the code in nova.db.
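The double counting described above can be illustrated with a small sketch. It is not nova's code: the in-memory records, the `hypervisor_stats` function and its `include_deleted_services` parameter are hypothetical, and it assumes the statistics are built by matching compute nodes to service records on hostname. The host name, service ids and resource values are taken from the reproduction further down in this report.

```python
# Hypothetical in-memory stand-ins for the service and compute node records.
# The record layout is an assumption for illustration, not nova's schema.
services = [
    # Old service record, soft-deleted via the API.
    {"id": 8, "host": "SZX1000291919", "binary": "nova-compute", "deleted": True},
    # New service record created when nova-compute was started again.
    {"id": 10, "host": "SZX1000291919", "binary": "nova-compute", "deleted": False},
]
compute_nodes = [
    {"host": "SZX1000291919", "vcpus": 8, "memory_mb": 7960, "local_gb": 35},
]


def hypervisor_stats(include_deleted_services):
    """Sum node resources once per matching nova-compute service record."""
    stats = {"count": 0, "vcpus": 0, "memory_mb": 0, "local_gb": 0}
    for node in compute_nodes:
        for svc in services:
            if svc["host"] != node["host"] or svc["binary"] != "nova-compute":
                continue
            # Skipping deleted services is what removes the double count.
            if svc["deleted"] and not include_deleted_services:
                continue
            stats["count"] += 1
            for key in ("vcpus", "memory_mb", "local_gb"):
                stats[key] += node[key]
    return stats


buggy = hypervisor_stats(include_deleted_services=True)
fixed = hypervisor_stats(include_deleted_services=False)
print("deleted services included:", buggy)
print("deleted services excluded:", fixed)
```

With the deleted record included, the single node is counted twice (count 2, vcpus 16), which matches the incorrect output reported below. Excluding it gives count 1 and vcpus 8.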
[Test Case]
* Deploy Mitaka with bundle http://
* Do 'openstack hypervisor stats show' and verify that count is 3
* Do 'juju remove-unit nova-compute/2' to delete a compute service but not its physical host
* Do 'openstack compute service delete <id>' to delete the compute service we just removed (choosing the correct id)
* Do 'openstack hypervisor stats show' and verify that count is 2
* Do 'juju add-unit nova-compute --to <machine id of deleted unit>'
* Do 'openstack hypervisor stats show' and verify that count is 3 (not 4 as it would be before fix)
[Regression Potential]
None anticipated other than for clients that were interpreting invalid counts as correct.
[Other Info]
=======
Hypervisor statistics could be incorrect:
When we kill a nova-compute service, delete the service from the nova DB, and then
start the nova-compute service again, the result of Hypervisor/
is incorrect.
How to reproduce:
Step1. Check the correct statistics before we do anything:
root@SZX1000291
+------
| Property | Value |
+------
| count | 1 |
| current_workload | 0 |
| disk_available_
| free_disk_gb | 34 |
| free_ram_mb | 6936 |
| local_gb | 35 |
| local_gb_used | 1 |
| memory_mb | 7960 |
| memory_mb_used | 1024 |
| running_vms | 1 |
| vcpus | 8 |
| vcpus_used | 1 |
+------
Step2. Kill the compute service:
root@SZX1000291
root 120419 120411 0 11:06 pts/27 00:00:00 sg libvirtd /usr/local/
root 120420 120419 0 11:06 pts/27 00:00:07 /usr/bin/python /usr/local/
root@SZX1000291
root@SZX1000291
root@SZX1000291
+----+-
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+----+-
| 4 | nova-conductor | SZX1000291919 | internal | enabled | up | 2017-05-
| 6 | nova-scheduler | SZX1000291919 | internal | enabled | up | 2017-05-
| 7 | nova-consoleauth | SZX1000291919 | internal | enabled | up | 2017-05-
| 8 | nova-compute | SZX1000291919 | nova | enabled | down | 2017-05-
| 9 | nova-cert | SZX1000291919 | internal | enabled | down | 2017-05-
+----+-
Step3. Delete the service from DB:
root@SZX1000291
root@SZX1000291
+----+-
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+----+-
| 4 | nova-conductor | SZX1000291919 | internal | enabled | up | 2017-05-
| 6 | nova-scheduler | SZX1000291919 | internal | enabled | up | 2017-05-
| 7 | nova-consoleauth | SZX1000291919 | internal | enabled | up | 2017-05-
| 9 | nova-cert | SZX1000291919 | internal | enabled | down | 2017-05-
+----+-
Step4. Start the compute service again:
root@SZX1000291
+----+-
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+----+-
| 4 | nova-conductor | SZX1000291919 | internal | enabled | up | 2017-05-
| 6 | nova-scheduler | SZX1000291919 | internal | enabled | up | 2017-05-
| 7 | nova-consoleauth | SZX1000291919 | internal | enabled | up | 2017-05-
| 9 | nova-cert | SZX1000291919 | internal | enabled | down | 2017-05-
| 10 | nova-compute | SZX1000291919 | nova | enabled | up | 2017-05-
+----+-
Step5. Check the hypervisor statistics again; the result is incorrect:
root@SZX1000291
+------
| Property | Value |
+------
| count | 2 |
| current_workload | 0 |
| disk_available_
| free_disk_gb | 68 |
| free_ram_mb | 13872 |
| local_gb | 70 |
| local_gb_used | 2 |
| memory_mb | 15920 |
| memory_mb_used | 2048 |
| running_vms | 2 |
| vcpus | 16 |
| vcpus_used | 2 |
+------
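As a quick sanity check on the two outputs above, the following sketch compares the Step1 and Step5 values and confirms that every listed field is exactly doubled, which is consistent with one hypervisor being counted twice. The dictionaries simply transcribe the tables in this report; the truncated `disk_available_` row is left out because its name and value are not recoverable here.

```python
# Values transcribed from the Step1 (before) and Step5 (after) tables.
# The truncated "disk_available_" row is intentionally omitted.
before = {
    "count": 1, "current_workload": 0, "free_disk_gb": 34,
    "free_ram_mb": 6936, "local_gb": 35, "local_gb_used": 1,
    "memory_mb": 7960, "memory_mb_used": 1024, "running_vms": 1,
    "vcpus": 8, "vcpus_used": 1,
}
after = {
    "count": 2, "current_workload": 0, "free_disk_gb": 68,
    "free_ram_mb": 13872, "local_gb": 70, "local_gb_used": 2,
    "memory_mb": 15920, "memory_mb_used": 2048, "running_vms": 2,
    "vcpus": 16, "vcpus_used": 2,
}

# A field is "doubled" when the Step5 value is twice the Step1 value.
doubled = {key: after[key] == 2 * before[key] for key in before}
all_doubled = all(doubled.values())
print("every field doubled:", all_doubled)
```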
Changed in nova: | |
assignee: | nobody → Zhenyu Zheng (zhengzhenyu) |
description: | updated |
description: | updated |
description: | updated |
Changed in nova: | |
importance: | Undecided → Low |
Changed in nova: | |
assignee: | Zhenyu Zheng (zhengzhenyu) → Matt Riedemann (mriedem) |
Changed in nova: | |
assignee: | Matt Riedemann (mriedem) → Zhenyu Zheng (zhengzhenyu) |
tags: | added: sts-sponsor |
Changed in nova (Ubuntu Xenial): | |
status: | New → Triaged |
importance: | Undecided → Low |
Fix proposed to branch: master
Review: https://review.openstack.org/467220