State computation incorrect

Bug #1499065 reported by Tim Hinrichs
This bug affects 1 person
Affects    Status         Importance   Assigned to   Milestone
congress   Fix Released   Critical     Rui Chen
Liberty    Fix Released   Critical     Rui Chen

Bug Description

There seems to be a problem with how we compute the change in state after pulling the current state from external services and before sending the delta to the policy engine. The trace below shows one server with status PAUSED; on the next poll it has status ACTIVE, yet an empty delta is sent to the policy engine.

***** Poll 1: Notice alpha: PAUSED, so raw data is giving PAUSED. Also notice that the data set that gets published has PAUSED, as we would expect. *****

2015-09-23 20:06:08.065 INFO congress.dse.deepsix [-] nova:: polling
2015-09-23 20:06:08.283 INFO congress.datasources.nova_driver [-] nova:: Server: alpha: PAUSED
...
2015-09-23 20:06:08.533 DEBUG congress.dse.deepsix [-] nova:: publishing to dataindex servers with data set([('fc8a769c-f173-44db-8660-e4885899808e', 'alpha', '762628cc2f28361d7ce8a46b774857fb2212520b941f2aed61ca8d65', 'PAUSED', '366f97a203354c6883dd5fb26822e49a', 'acbca360725a4745b22074c0ac5a60e6', 'c30a094a-fb43-44c8-8728-fd8b6a0e8da5', '1')]) from (pid=3713) log_debug /opt/stack/congress/congress/dse/deepsix.py:549
2015-09-23 20:06:08.534 DEBUG congress.dse.deepsix [-] nova:: pushing dataindex servers to subscribers {'engine': {'type': 'push', 'correlationId': '82f67632-6124-463a-b814-6219e94c21f6'}} and requesters {} from (pid=3713) log_debug /opt/stack/congress/congress/dse/deepsix.py:549
2015-09-23 20:06:08.536 DEBUG congress.dse.deepsix [-] nova:: to_add: set([]) from (pid=3713) log_debug /opt/stack/congress/congress/dse/deepsix.py:549
2015-09-23 20:06:08.536 DEBUG congress.dse.deepsix [-] nova:: to_del: set([]) from (pid=3713) log_debug /opt/stack/congress/congress/dse/deepsix.py:549
2015-09-23 20:06:08.539 INFO congress.dse.deepsix [-] nova:: finished polling
...

***** Poll 2: Notice alpha: ACTIVE, so the raw data is saying ACTIVE, but the data set that gets published still has PAUSED, where it should have ACTIVE. *****

2015-09-23 20:06:18.547 INFO congress.dse.deepsix [-] nova:: polling
2015-09-23 20:06:18.739 INFO congress.datasources.nova_driver [-] nova:: Server: alpha: ACTIVE
...
2015-09-23 20:06:19.002 DEBUG congress.dse.deepsix [-] nova:: publishing to dataindex servers with data set([('fc8a769c-f173-44db-8660-e4885899808e', 'alpha', '762628cc2f28361d7ce8a46b774857fb2212520b941f2aed61ca8d65', 'PAUSED', '366f97a203354c6883dd5fb26822e49a', 'acbca360725a4745b22074c0ac5a60e6', 'c30a094a-fb43-44c8-8728-fd8b6a0e8da5', '1')]) from (pid=3713) log_debug /opt/stack/congress/congress/dse/deepsix.py:549
2015-09-23 20:06:19.002 DEBUG congress.dse.deepsix [-] nova:: pushing dataindex servers to subscribers {'engine': {'type': 'push', 'correlationId': '82f67632-6124-463a-b814-6219e94c21f6'}} and requesters {} from (pid=3713) log_debug /opt/stack/congress/congress/dse/deepsix.py:549
2015-09-23 20:06:19.002 DEBUG congress.dse.deepsix [-] nova:: to_add: set([]) from (pid=3713) log_debug /opt/stack/congress/congress/dse/deepsix.py:549
2015-09-23 20:06:19.003 DEBUG congress.dse.deepsix [-] nova:: to_del: set([]) from (pid=3713) log_debug /opt/stack/congress/congress/dse/deepsix.py:549
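
For comparison, here is a minimal sketch of the delta one would expect at Poll 2, assuming the delta is a plain set difference between the previously published rows and the newly computed ones. The function name `compute_delta` and the shortened three-field tuples are illustrative assumptions, not the actual deepsix code or the real row layout.

```python
def compute_delta(old_rows, new_rows):
    """Return (to_add, to_del) as set differences between two polls."""
    to_add = new_rows - old_rows
    to_del = old_rows - new_rows
    return to_add, to_del


# Abbreviated, hypothetical rows: (id, name, status)
poll1 = {("server-1", "alpha", "PAUSED")}
poll2 = {("server-1", "alpha", "ACTIVE")}

to_add, to_del = compute_delta(poll1, poll2)
print("to_add:", to_add)
print("to_del:", to_del)
```

With correct tuples, both to_add and to_del would be non-empty at Poll 2. In the trace above both are empty because the published data set never changed from PAUSED.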

Revision history for this message
Tim Hinrichs (thinrichs) wrote :

The problem is that the nova datasource driver takes the raw data and computes the wrong tuples. Since the structure of the tuples looks okay, I don't think it's a problem with the algorithm that converts JSON-like objects into tuples. I'd say there's a bug in the update_state_on_changed code in datasource_utils.

I tried to look through that code, but I don't quite understand what's happening.

description: updated
Revision history for this message
Tim Hinrichs (thinrichs) wrote :

Assigning this to Rui Chen as he wrote that code.

Changed in congress:
assignee: nobody → Rui Chen (kiwik-chenrui)
status: New → Triaged
Revision history for this message
Tim Hinrichs (thinrichs) wrote :

Rui Chen might be able to find the problem quickly.

Changed in congress:
assignee: Rui Chen (kiwik-chenrui) → nobody
Revision history for this message
Rui Chen (kiwik-chenrui) wrote :

I will try to fix this bug in the next one or two days.

Changed in congress:
assignee: nobody → Rui Chen (kiwik-chenrui)
Revision history for this message
Rui Chen (kiwik-chenrui) wrote :

This is an interesting bug; it took me some time to find the root cause.

In @update_state_on_changed we use == to check whether raw_data equals the cached data in the nova datasource, but in novaclient the server object's __eq__() method is overridden to return True if the two objects have the same id, even if their status differs (for example, "ACTIVE" versus "PAUSED"). So the change in server status isn't pushed to the PE.

The implementation of __eq__() is odd and doesn't match the semantics of object equality. Objects that have different values should not compare equal.
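
To make the failure mode concrete, here is a minimal reproduction. The `Resource` class below is a simplified, hypothetical stand-in written for illustration; it is not the oslo-incubator or novaclient code, and it only mimics the id-based equality described above.

```python
class Resource:
    """Hypothetical stand-in for a client resource object."""

    def __init__(self, info):
        self._info = dict(info)
        self.id = info.get("id")
        self.status = info.get("status")

    def __eq__(self, other):
        if not isinstance(other, Resource):
            return NotImplemented
        # Equality is decided by id alone; all other attributes are ignored.
        return self.id == other.id


cached = [Resource({"id": "server-1", "name": "alpha", "status": "PAUSED"})]
polled = [Resource({"id": "server-1", "name": "alpha", "status": "ACTIVE"})]

# List comparison falls back to element-wise ==, so the status change is
# invisible and the new data looks identical to the cache.
looks_unchanged = (cached == polled)
print(looks_unchanged)
```

Any change check built on == over such objects will treat the second poll as a no-op, which matches the empty to_add and to_del sets in the trace.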

https://github.com/openstack/oslo-incubator/blob/master/openstack/common/apiclient/base.py#L522-L523

Because novaclient and cinderclient use oslo-incubator.apiclient.base.Resource as their superclass, we should fix it in oslo-incubator. I have filed a bug in the oslo-incubator project to track the issue.

https://bugs.launchpad.net/oslo-incubator/+bug/1499369

The code we need to fix is in oslo-incubator. Once that patch is merged, we still need to sync it to novaclient and cinderclient manually, so completing the fix across three or more separate projects will take some time. In the meantime I will push a temporary patch on the Congress side so that the issue does not remain in the stable/liberty branch.
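
One possible shape for such a workaround, shown only as a sketch and not as the patch that was actually merged: compare the objects' attribute dictionaries instead of relying on their __eq__(). The names `FakeServer` and `has_changed` are hypothetical and exist only for this example.

```python
class FakeServer:
    """Hypothetical object whose == looks only at id, like the buggy Resource."""

    def __init__(self, **attrs):
        self.__dict__.update(attrs)

    def __eq__(self, other):
        return isinstance(other, FakeServer) and self.id == other.id


def has_changed(cached, polled):
    """Return True if any attribute differs, bypassing the objects' __eq__."""
    if cached is None:
        # Nothing cached yet, so the first poll always counts as a change.
        return True
    old = [vars(obj) for obj in cached]
    new = [vars(obj) for obj in polled]
    return old != new


paused = [FakeServer(id="server-1", name="alpha", status="PAUSED")]
active = [FakeServer(id="server-1", name="alpha", status="ACTIVE")]

print(paused == active)             # id-only comparison
print(has_changed(paused, active))  # attribute-level comparison
```

Comparing plain dicts keeps the check independent of how any client library defines equality, at the cost of assuming every relevant field lives in the instance's attribute dictionary.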

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to congress (master)

Fix proposed to branch: master
Review: https://review.openstack.org/227342

Changed in congress:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to congress (master)

Reviewed: https://review.openstack.org/227342
Committed: https://git.openstack.org/cgit/openstack/congress/commit/?id=897bcfca781835e73736d9e232a647a817aa8759
Submitter: Jenkins
Branch: master

commit 897bcfca781835e73736d9e232a647a817aa8759
Author: Rui Chen <email address hidden>
Date: Thu Sep 24 22:22:58 2015 +0800

    Fix state computation incorrect

    The Resource of oslo-incubator is used as the
    super class of client object in novaclient and
    cinderclient, but the implemented of __eq__ method
    don't match the semantics of object equal. Check
    equal between two objects by using obj1 == obj2,
    that will always be True if the objects have the
    same id, even if they have the different other
    attribute value. The issue had been traced by
    oslo-incubator bug/1499369, fix it in Congress side
    as temporary workaround in order to avoid the issue
    exist in stable/liberty branch.

    Change-Id: I4304c32432dc1f813377c60704f817b4f3020da2
    Closes-Bug: #1499065

Changed in congress:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to congress (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/227402

Eric K (ekcs)
Changed in congress:
status: Fix Committed → Fix Released
Tim Hinrichs (thinrichs)
Changed in congress:
status: Fix Released → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to congress (stable/liberty)

Reviewed: https://review.openstack.org/227402
Committed: https://git.openstack.org/cgit/openstack/congress/commit/?id=3ff3e846c102cb99c8fb3d0259566b307e6f8fdb
Submitter: Jenkins
Branch: stable/liberty

commit 3ff3e846c102cb99c8fb3d0259566b307e6f8fdb
Author: Rui Chen <email address hidden>
Date: Thu Sep 24 22:22:58 2015 +0800

    Fix state computation incorrect

    The Resource of oslo-incubator is used as the
    super class of client object in novaclient and
    cinderclient, but the implemented of __eq__ method
    don't match the semantics of object equal. Check
    equal between two objects by using obj1 == obj2,
    that will always be True if the objects have the
    same id, even if they have the different other
    attribute value. The issue had been traced by
    oslo-incubator bug/1499369, fix it in Congress side
    as temporary workaround in order to avoid the issue
    exist in stable/liberty branch.

    Change-Id: I4304c32432dc1f813377c60704f817b4f3020da2
    Closes-Bug: #1499065
    (cherry picked from commit 897bcfca781835e73736d9e232a647a817aa8759)

tags: added: in-stable-liberty
Tim Hinrichs (thinrichs)
description: updated
no longer affects: congress/liberty
Tim Hinrichs (thinrichs)
Changed in congress:
milestone: none → liberty-rc2
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/congress 2.0.0

This issue was fixed in the openstack/congress 2.0.0 release.
