pxc grow hangs if shrink was called before

Bug #1544708 reported by Craig Vyvial
This bug affects 1 person
Affects: OpenStack DBaaS (Trove)
Status: Fix Released
Importance: High
Assigned to: Morgan Jones

Bug Description

If you call cluster-shrink before calling cluster-grow, the grow operation raises an error and leaves the cluster stuck in the GROWING_CLUSTER state.

The issue is that when looking up the existing instances in the cluster, we do not ignore the deleted instances, so loading an instance removed by the earlier shrink raises NotFound.

2016-02-06 16:52:45.976 DEBUG oslo_messaging._drivers.amqpdriver [-] received message msg_id: None reply to None from (pid=22260) __call__ /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:197
2016-02-06 16:52:45.992 DEBUG trove.instance.models [-] Instance 571a0fa1-fd0b-4af6-845e-7da3795e1eaf service status is new. from (pid=22260) load_instance /opt/stack/trove/trove/instance/models.py:488
2016-02-06 16:52:46.001 INFO trove.taskmanager.models [-] Creating instance 571a0fa1-fd0b-4af6-845e-7da3795e1eaf.
2016-02-06 16:52:46.146 DEBUG oslo_messaging._drivers.amqpdriver [-] received message msg_id: None reply to None from (pid=22260) __call__ /usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:197
2016-02-06 16:52:46.162 DEBUG trove.common.strategies.cluster.strategy [-] Loading class trove.common.strategies.cluster.experimental.pxc.taskmanager.PXCTaskManagerStrategy from (pid=22260) load_taskmanager_strategy /opt/stack/trove/trove/common/strategies/cluster/strategy.py:37
2016-02-06 16:52:46.175 DEBUG trove.common.strategies.cluster.experimental.pxc.taskmanager [-] Begin pxc grow_cluster for id: 7f4238ce-6147-484e-85c2-568b9c724db6. from (pid=22260) grow_cluster /opt/stack/trove/trove/common/strategies/cluster/experimental/pxc/taskmanager.py:155
2016-02-06 16:52:46.186 ERROR trove.common.strategies.cluster.experimental.pxc.taskmanager [-] Error growing cluster 7f4238ce-6147-484e-85c2-568b9c724db6.
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager Traceback (most recent call last):
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager File "/opt/stack/trove/trove/common/strategies/cluster/experimental/pxc/taskmanager.py", line 225, in grow_cluster
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager _grow_cluster()
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager File "/opt/stack/trove/trove/common/strategies/cluster/experimental/pxc/taskmanager.py", line 162, in _grow_cluster
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager if db_inst.id not in new_instance_ids]
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager File "/opt/stack/trove/trove/instance/models.py", line 647, in load
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager return load_instance(cls, context, id, needs_server=True)
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager File "/opt/stack/trove/trove/instance/models.py", line 466, in load_instance
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager db_info = get_db_info(context, id, include_deleted=include_deleted)
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager File "/opt/stack/trove/trove/instance/models.py", line 449, in get_db_info
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager raise exception.NotFound(uuid=id)
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager NotFound: Resource 580097ca-ec56-4fb5-aaaa-644fff79349b cannot be found.
2016-02-06 16:52:46.186 TRACE trove.common.strategies.cluster.experimental.pxc.taskmanager
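
The failure mode in the traceback can be reproduced in isolation. The following is a minimal, self-contained sketch and not the Trove code: DBInstance, load_instance, existing_instances, and the skip_deleted flag are hypothetical stand-ins, and the assumption is that shrink leaves a soft-deleted record behind which the grow path then tries to load.

```python
from dataclasses import dataclass


class NotFound(Exception):
    """Stand-in for the NotFound error seen in the traceback."""


@dataclass
class DBInstance:
    # Hypothetical, simplified record of a cluster member.
    id: str
    cluster_id: str
    deleted: bool = False


def load_instance(records, instance_id):
    """Return a live instance; raise NotFound for missing or deleted ones."""
    for rec in records:
        if rec.id == instance_id and not rec.deleted:
            return rec
    raise NotFound("Resource %s cannot be found." % instance_id)


def existing_instances(records, cluster_id, new_instance_ids, skip_deleted):
    """Load every cluster member that is not part of the current grow request."""
    members = [
        rec for rec in records
        if rec.cluster_id == cluster_id
        and (not skip_deleted or not rec.deleted)
    ]
    return [
        load_instance(records, rec.id)
        for rec in members
        if rec.id not in new_instance_ids
    ]


records = [
    DBInstance("node-1", "cluster-a"),
    DBInstance("node-2", "cluster-a", deleted=True),  # removed by an earlier shrink
    DBInstance("node-3", "cluster-a"),                # added by the current grow
]

# Without the filter, the deleted record is loaded and the grow step fails.
try:
    existing_instances(records, "cluster-a", {"node-3"}, skip_deleted=False)
    buggy_outcome = "ok"
except NotFound as exc:
    buggy_outcome = str(exc)

# With the filter, only the surviving pre-existing member is returned.
fixed = existing_instances(records, "cluster-a", {"node-3"}, skip_deleted=True)
fixed_ids = [rec.id for rec in fixed]
```

In this sketch the unfiltered lookup fails on the shrunk node, while the filtered lookup returns only the live member that predates the grow request. If the exception is not handled in a way that resets the cluster task, the cluster would remain in its in-progress state, which matches the hang described above.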

OpenStack Infra (hudson-openstack) wrote : Fix proposed to trove (master)

Fix proposed to branch: master
Review: https://review.openstack.org/279252

Changed in trove:
assignee: nobody → Craig Vyvial (cp16net)
status: New → In Progress
Craig Vyvial (cp16net)
Changed in trove:
importance: Undecided → High
milestone: none → mitaka-3
OpenStack Infra (hudson-openstack) wrote : Change abandoned on trove (master)

Change abandoned by Craig Vyvial on branch: master
Review: https://review.openstack.org/279252
Reason: Morgan will include this change in his patch that refactors this code for reuse with the MariaDB clustering.

Craig Vyvial (cp16net) wrote :

Morgan will fix this in his MariaDB clustering patch.

Amrith Kumar (amrith)
Changed in trove:
milestone: mitaka-3 → mitaka-rc1
Amrith Kumar (amrith) wrote :

Morgan, I've assigned this to you per Craig's comment of 02/12. Please confirm whether this was fixed.

Changed in trove:
assignee: Craig Vyvial (cp16net) → Morgan Jones (6-morgan)
Amrith Kumar (amrith)
Changed in trove:
status: In Progress → Fix Committed
Amrith Kumar (amrith)
Changed in trove:
milestone: mitaka-rc1 → newton-1
Morgan Jones (6-morgan)
Changed in trove:
status: Fix Committed → Fix Released