Resource tracker causes update of compute_nodes table every minute

Bug #1658629 reported by Yoshihiko Atsumi
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The resource tracker in nova-compute checks the resource usage every minute.
If the latest resource usage differs from the previous one,
nova-compute requests that nova-scheduler update the compute_nodes table in the DB.
https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L712-L728
_update() calls _resource_change(). And in _resource_change(), resource usages are compared by obj_equal_prims().

In obj_equal_prims(), the resource usage lists are created.
https://github.com/openstack/nova/blob/master/nova/objects/base.py#L328-L360
I found that these lists contain "updated_at" among their keys.

prim_1 = {'nova_object.version': u'1.16',
'nova_object.name': 'ComputeNode',
'nova_object.data': {'pci_device_pools': {'nova_object.version': '1.1',
'nova_object.name': 'PciDevicePoolList',
'nova_object.data': {'objects': []}, 'nova_object.namespace': 'nova'},
'updated_at': '2017-01-20T01:07:43Z', ★here
(omission)

prim_2 = {'nova_object.version': '1.16',
'nova_object.name': 'ComputeNode',
'nova_object.data': {'pci_device_pools': {'nova_object.version': '1.1',
'nova_object.name': 'PciDevicePoolList',
'nova_object.data': {'objects': []}, 'nova_object.namespace': 'nova'},
'updated_at': '2017-01-20T01:06:42Z', ★here
(omission)

This difference in the "updated_at" values makes the resource usage from the two most recent checks compare as unequal,
which causes an update of the compute_nodes table every minute.
I noticed this problem in Mitaka. According to the code, I think it also happens in master.
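
To make the effect concrete, here is a minimal sketch of the comparison problem using plain dictionaries. It is not the nova code: `resource_changed` and the two usage fields are hypothetical stand-ins chosen for illustration, and only the "updated_at" timestamps are taken from the primitives shown above.

```python
import copy


def resource_changed(old_prim, new_prim):
    """Hypothetical stand-in for the change check: any differing key counts."""
    return old_prim != new_prim


# Illustrative usage fields; only 'updated_at' mirrors the dumps above.
previous = {
    "vcpus_used": 2,
    "memory_mb_used": 4096,
    "updated_at": "2017-01-20T01:06:42Z",
}

# One minute later: the usage is identical, but the timestamp has moved on.
latest = copy.deepcopy(previous)
latest["updated_at"] = "2017-01-20T01:07:43Z"

# The naive comparison reports a change, so a DB update would be triggered.
needs_db_update = resource_changed(previous, latest)

# Comparing everything except the timestamp shows that nothing really changed.
usage_is_same = all(
    previous[key] == latest[key] for key in previous if key != "updated_at"
)
```

In this sketch `needs_db_update` is true even though `usage_is_same` is also true, which is the mismatch described in this report.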

Chris Dent (cdent) wrote :

I can confirm that this does indeed appear to be the case in master (as of 20170123): The compute_node in a single node devstack is updated every 61 seconds.

tags: added: resource-tracker scheduler
Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
Chris Dent (cdent) wrote :

I can also confirm that changing the prims check to:

  if not obj_base.obj_equal_prims(compute_node, old_compute, ['updated_at']):

will cause the conditional to fail as desired. However, this does not change the fact that the compute node table still gets written every sixty seconds. Something else is still doing that, so fixing this could remove one extraneous update but leave at least one more to fix.
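
As a rough illustration of what passing an ignore list to the comparison achieves, the sketch below strips the ignored keys from nested primitives before comparing them. It works on plain dictionaries and is an assumption-laden approximation, not the body of `obj_equal_prims()`; `strip_keys`, `equal_prims_ignoring`, and the `vcpus_used` value are hypothetical names and data used only for this example.

```python
def strip_keys(prim, ignore):
    """Return a copy of a nested primitive with the ignored keys removed."""
    if isinstance(prim, dict):
        return {
            key: strip_keys(value, ignore)
            for key, value in prim.items()
            if key not in ignore
        }
    if isinstance(prim, list):
        return [strip_keys(item, ignore) for item in prim]
    return prim


def equal_prims_ignoring(prim_1, prim_2, ignore=()):
    """Compare two primitives while skipping the keys listed in `ignore`."""
    ignore = set(ignore)
    return strip_keys(prim_1, ignore) == strip_keys(prim_2, ignore)


# Hypothetical primitives shaped loosely like the dumps in the description.
old_compute = {
    "nova_object.name": "ComputeNode",
    "nova_object.data": {
        "vcpus_used": 2,
        "updated_at": "2017-01-20T01:06:42Z",
    },
}
compute_node = {
    "nova_object.name": "ComputeNode",
    "nova_object.data": {
        "vcpus_used": 2,
        "updated_at": "2017-01-20T01:07:43Z",
    },
}

# Without the ignore list the timestamps make the primitives unequal.
same_strict = equal_prims_ignoring(compute_node, old_compute)

# Ignoring 'updated_at' leaves only the real usage fields in the comparison.
same_ignoring = equal_prims_ignoring(compute_node, old_compute, ["updated_at"])

# A genuine usage change is still detected when 'updated_at' is ignored.
busier_node = {
    "nova_object.name": "ComputeNode",
    "nova_object.data": {
        "vcpus_used": 3,
        "updated_at": "2017-01-20T01:08:44Z",
    },
}
real_change_seen = not equal_prims_ignoring(
    busier_node, old_compute, ["updated_at"]
)
```

The copies returned by `strip_keys` leave the inputs untouched, so the same primitives can be reused after the comparison.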

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/424305

Changed in nova:
assignee: nobody → Chris Dent (cdent)
status: Confirmed → In Progress
Changed in nova:
assignee: Chris Dent (cdent) → Dan Smith (danms)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/424305
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b2a4fcf84633413c322afb4f2c8066358b943d6b
Submitter: Jenkins
Branch: master

commit b2a4fcf84633413c322afb4f2c8066358b943d6b
Author: Chris Dent <email address hidden>
Date: Mon Jan 23 19:34:44 2017 +0000

    Avoid redundant call to update_resource_stats from RT

    When the resource_tracker calls _update, when the new compute node
    and the old compute node only differ by updated_at,
    update_resource_stats was still being called. Manual testing shows
    that updated_at is generally different because there's something
    else that is also leading to a compute_node.save() approximately
    every sixty seconds. So while this fix removes one redundant
    save, it doesn't get all of them, thus the "partial" below.

    A unit test is added to exercise different updated_at values.

    Change-Id: If688ae4d92ecdea83479f37de8856b668b8bc7a6
    Partial-Bug: #1658629

Maciej Szankin (mszankin) wrote :

The merged code indicates that it was a partial fix - any update on what is left?

Chris Dent (cdent) wrote :

There's some discussion of the other issues in this email thread: http://lists.openstack.org/pipermail/openstack-dev/2017-January/110953.html

It might make sense, however, to call this bug closed, as fixing the rest is a bigger change and not really a bug fix. The code is behaving as intended right now (since the merge of the fix above); it's just that the intent is not quite right.

Sean Dague (sdague) wrote :

Automatically discovered version mitaka in description. If this is incorrect, please update the description to include 'nova version: ...'

tags: added: openstack-version.mitaka
Sean Dague (sdague) wrote :

There are no currently open reviews on this bug, changing the status back to the previous state and unassigning. If there are active reviews related to this bug, please include links in comments.

Changed in nova:
status: In Progress → Confirmed
assignee: Dan Smith (danms) → nobody