Fuel for OpenStack

[Upgrade]After the upgrade of Fuel master 8 > 9.1 old 8.0 cluster doesn't scale

Bug #1606823 reported by Sergey Novikov on 2016-07-27

Affects		Status	Importance	Assigned to	Milestone
	Fuel for OpenStack	Fix Committed	High	Dmitry Guryanov	Fuel for OpenStack 10.0
	Mitaka	Fix Released	High	Dmitry Guryanov	Fuel for OpenStack 9.1

Bug Description

Detailed bug description:

After the Fuel master node is upgraded from 8.0 to 9.1 with an existing 8.0 cluster, scaling that cluster fails with the following error in the Puppet log:

Could not run: Could not find file /etc/puppet/modules/osnailyfacter/modular/hiera/hiera.pp

The root cause is that the "rsync_core_puppet" task is not executed, so the manifests are not delivered to the node being deployed.

It was found that granular_deploy modifies the graph when some nodes are affected by the deployment (e.g. when scaling a cluster):

    2016-08-16 09:00:31.242 DEBUG [7f0057829880] (manager) There are nodes to deploy: node-9.test.domain.local
    2016-08-16 09:00:31.285 DEBUG [7f0057829880] (manager) There are nodes affected by deployment: node-7.test.domain.local node-4.test.domain.local node-5.test.domain.local node-8.test.domain.local node-2.test.domain.local node-6.test.domain.local node-3.test.domain.local node-1.test.domain.local

and this modification of the graph marks all non-reexecutable tasks as `skipped`:

    ...
    2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_graph) Task rsync_core_puppet will be skipped.
    2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_graph) Task clear_nodes_info will be skipped.
    2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_graph) Task generate_keys will be skipped.
    2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_graph) Task copy_keys will be skipped.
    2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_graph) Task generate_haproxy_keys will be skipped.
    2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_graph) Task copy_haproxy_keys will be skipped.
    2016-08-16 14:59:03.298 DEBUG [7f0057829880] (orchestrator_graph) Task sync_time will be skipped.
    ...

After that, the graph is tailored to the affected nodes only and is incorrect for the nodes that have to be deployed.
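
The effect described above can be illustrated with a minimal sketch. It assumes a toy model of the graph, not nailgun's real classes: `Task`, `Graph`, `only_reexecutable`, `runnable` and the task name `some_reexecutable_task` are all hypothetical, and the `reexecutable` flags are chosen for illustration only.

```python
# Hypothetical, simplified model of a deployment graph (not nailgun code).
class Task:
    def __init__(self, name, reexecutable):
        self.name = name
        self.reexecutable = reexecutable
        self.skipped = False


class Graph:
    def __init__(self, tasks):
        self.tasks = tasks

    def only_reexecutable(self):
        # Mutates the graph in place: every task that is not re-executable
        # is flagged as skipped for whoever uses this graph afterwards.
        for task in self.tasks:
            if not task.reexecutable:
                task.skipped = True

    def runnable(self):
        return [t.name for t in self.tasks if not t.skipped]


graph = Graph([
    Task("rsync_core_puppet", reexecutable=False),
    Task("some_reexecutable_task", reexecutable=True),
])

# Filtering for the affected (already deployed) nodes ...
graph.only_reexecutable()

# ... also changes what the new node gets, because it is the same object.
tasks_for_new_node = graph.runnable()
print(tasks_for_new_node)
```

In this toy model the new node no longer receives `rsync_core_puppet`, which matches the symptom in the logs above.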

Steps to reproduce:
1. Deploy a Fuel 8.0 master node
2. Create a cluster with 3 controller nodes and 3 compute+ceph-osd nodes
3. Upgrade Fuel master node from 8.0 to 9.1 (http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-install-guide/upgrade/upgrade-fuel.html)
4. Add a node with the "compute" role to the created cluster
5. Deploy changes

Expected result:
Deployment succeeds

Actual result:
Deployment fails because the rsync_core_puppet task does not run on the newly added node.
This happens because the pre_deployment and post_deployment stages do not contain any tasks. Here is an excerpt of the debug output from task.py for granular_deploy:

'post_deployment': [],
'pre_deployment': []

It is also important that the deployment succeeds if we provision the node first and then deploy it.

Workaround:
The code can be modified to keep two graphs, but this causes the whole deployment graph to run on the affected nodes as well:

    diff --git a/nailgun/nailgun/task/task.py b/nailgun/nailgun/task/task.py
    index b5f3b6a..e985652 100644
    --- a/nailgun/nailgun/task/task.py
    +++ b/nailgun/nailgun/task/task.py
    @@ -330,9 +330,10 @@ class DeploymentTask(BaseDeploymentTask):
             cls._save_deployment_info(transaction, serialized_cluster)

             if affected_nodes:
    -            graph.reexecutable_tasks(events)
    +            reexec_graph = graph.copy()
    +            reexec_graph.reexecutable_tasks(events)
                 serialized_cluster.extend(deployment_serializers.serialize(
    -                graph, transaction.cluster, affected_nodes
    +                reexec_graph, transaction.cluster, affected_nodes
                 ))
                 ))
                 nodes = nodes + affected_nodes
             pre_deployment = stages.pre_deployment_serialize(
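
The idea behind this workaround (filter a copy, keep the original graph intact) can be sketched in isolation. This is only an illustration under assumptions: the `Graph` class below is a hypothetical stand-in that stores tasks in a plain dict, and its `copy()` uses `copy.deepcopy`, which may differ from how the real graph object copies itself.

```python
import copy


class Graph:
    """Hypothetical stand-in for the deployment graph (not nailgun code)."""

    def __init__(self, tasks):
        # tasks: task name -> {"reexecutable": bool, "skipped": bool}
        self.tasks = tasks

    def copy(self):
        # Deep copy so that flags set on the copy do not leak back.
        return Graph(copy.deepcopy(self.tasks))

    def reexecutable_tasks(self):
        # In-place filter: flag everything that is not re-executable.
        for attrs in self.tasks.values():
            if not attrs["reexecutable"]:
                attrs["skipped"] = True

    def runnable(self):
        return sorted(n for n, a in self.tasks.items() if not a["skipped"])


graph = Graph({
    "rsync_core_puppet": {"reexecutable": False, "skipped": False},
    "some_reexecutable_task": {"reexecutable": True, "skipped": False},
})

# Filter a copy for the affected nodes; the original stays complete.
reexec_graph = graph.copy()
reexec_graph.reexecutable_tasks()

for_affected_nodes = reexec_graph.runnable()
for_new_nodes = graph.runnable()
```

Here `for_new_nodes` still contains `rsync_core_puppet`, while `for_affected_nodes` contains only the task assumed to be re-executable.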

Octane's versions:
8.0.0-1.mos1192
9.0.0-1.mos1208

ISO's versions:
8.0 - http://paste.openstack.org/show/538992/
9.1 - http://paste.openstack.org/show/542518/

Sergey Novikov (snovikov) on 2016-07-27

Changed in fuel:
importance:	Undecided → High

Vladimir Kuklin (vkuklin) on 2016-07-27

description:	updated

Ilya Kharin (akscram) on 2016-07-27

Changed in fuel:
status:	New → Confirmed

Alexander Kislitsky (akislitsky) on 2016-07-27

Changed in fuel:
assignee:	nobody → Fuel Octane (fuel-octane-team)

Sergey Shevorakov (sshevorakov) on 2016-08-10

tags:	added: blocker-for-qa

Ilya Kharin (akscram) on 2016-08-12

Changed in fuel:
assignee:	Fuel Octane (fuel-octane-team) → Ilya Kharin (akscram)

Ilya Kharin (akscram) wrote on 2016-08-19:

It seems that a regression was introduced in 9.0 in the granular_deploy method, which now improperly handles the deployment graph when there are affected nodes.

description:	updated
Changed in fuel:
assignee:	Ilya Kharin (akscram) → Fuel Sustaining (fuel-sustaining-team)

Maksim Malchuk (mmalchuk) on 2016-08-19

tags:	added: area-python

Georgy Kibardin (gkibardin) on 2016-08-19

Changed in fuel:
assignee:	Fuel Sustaining (fuel-sustaining-team) → Dmitry Guryanov (dguryanov)

Dmitry Pyzhov (dpyzhov) on 2016-08-19

Changed in fuel:
milestone:	9.1 → 10.0

OpenStack Infra (hudson-openstack) wrote on 2016-08-30: Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/362673

Changed in fuel:
status:	Confirmed → In Progress

OpenStack Infra (hudson-openstack) wrote on 2016-09-07: Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/362673
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=1cdc01bb687b2e7d50ef30221f8db239ebdac63a
Submitter: Jenkins
Branch: master

commit 1cdc01bb687b2e7d50ef30221f8db239ebdac63a
Author: Dmitry Guryanov <email address hidden>
Date: Mon Sep 5 17:53:03 2016 +0300

Fix granular deployment on operational cluster

Tasks on new nodes shouldn't be skipped in reexecute
filter.

Change-Id: I09148b81bd157e1884785b12e2438614f13e700b
Closes-Bug: 1606823
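
The behaviour named in this commit message (tasks on new nodes are not skipped by the re-execute filter) might look roughly like the sketch below. It is not the merged change: `tasks_to_run`, the task dict layout and the node uids are hypothetical and chosen only to show the rule.

```python
def tasks_to_run(tasks, node_uids, new_uids):
    """Return {uid: [task ids]} for a deployment run (hypothetical helper).

    New nodes get every task; already deployed (affected) nodes get
    only the tasks flagged as re-executable.
    """
    plan = {}
    for uid in node_uids:
        if uid in new_uids:
            plan[uid] = [t["id"] for t in tasks]
        else:
            plan[uid] = [t["id"] for t in tasks if t["reexecutable"]]
    return plan


tasks = [
    {"id": "rsync_core_puppet", "reexecutable": False},
    {"id": "some_reexecutable_task", "reexecutable": True},
]

# Node "9" is being added; node "1" is already deployed but affected.
plan = tasks_to_run(tasks, node_uids=["1", "9"], new_uids={"9"})
```

With these inputs, node "9" keeps `rsync_core_puppet` in its task list and node "1" does not.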

Changed in fuel:
status:	In Progress → Fix Committed

OpenStack Infra (hudson-openstack) wrote on 2016-09-07: Fix proposed to fuel-web (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/366795

OpenStack Infra (hudson-openstack) wrote on 2016-09-08: Fix merged to fuel-web (stable/mitaka)

Reviewed: https://review.openstack.org/366795
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=c3f3d40b22ecb30acf76a8a87fb489f97115752f
Submitter: Jenkins
Branch: stable/mitaka

commit c3f3d40b22ecb30acf76a8a87fb489f97115752f
Author: Dmitry Guryanov <email address hidden>
Date: Wed Sep 7 17:51:37 2016 +0300

Fix granular deployment on operational cluster

Tasks on new nodes shouldn't be skipped in reexecute
filter.

Backported from 1cdc01bb687b2e7d50ef30221f8db239ebdac63a
Closes-Bug: 1606823

Change-Id: Iafbb486219ad007d424979b51c9a2db4f713127f

Vladimir Khlyunev (vkhlyunev) wrote on 2016-09-09:

https://product-ci.infra.mirantis.net/view/upgrades/job/9.x.upgrades.ubuntu.upgrade_smoke_tests/23/testReport/(root)/upgrade_smoke_scale/
looks fine, fix released

Vladimir Khlyunev (vkhlyunev) wrote on 2016-09-13:

After https://bugs.launchpad.net/fuel/+bug/1622579 was resolved, the issue came back with new symptoms (snapshot 255):

2016-09-13 14:42:05.839 ERROR [7fe3ce45a880] (manager) Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nailgun/task/manager.py", line 61, in _call_silently
    to_return = method(task, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/nailgun/task/task.py", line 268, in message
    dry_run=dry_run, **kwargs
  File "/usr/lib/python2.7/site-packages/nailgun/task/task.py", line 146, in call_deployment_method
    args = getattr(cls, method)(transaction, **kwargs)
  File "/usr/lib/python2.7/site-packages/nailgun/task/task.py", line 382, in granular_deploy
    cls._extend_tasks_list(pre_deployment, pre_deployment_affected)
  File "/usr/lib/python2.7/site-packages/nailgun/task/task.py", line 320, in _extend_tasks_list
    t['uids'].extend(src_dict[t['id']]['uids'])
AttributeError: 'set' object has no attribute 'extend'
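
The last frame of this traceback can be reproduced outside nailgun with a small sketch. The function below is a hypothetical reconstruction modelled on the line shown in the traceback, not the real `_extend_tasks_list`, and the task dicts are invented for the example.

```python
def extend_tasks_list(dst_tasks, src_tasks):
    # Hypothetical helper: merge uids of tasks with the same id.
    src_by_id = {t["id"]: t for t in src_tasks}
    for t in dst_tasks:
        if t["id"] in src_by_id:
            # Works only when t["uids"] is a list; a set has no extend().
            t["uids"].extend(src_by_id[t["id"]]["uids"])


# One task carries its uids as a set, as a resolver might return them.
dst = [{"id": "rsync_core_puppet", "uids": {"9"}}]
src = [{"id": "rsync_core_puppet", "uids": ["1", "2"]}]

try:
    extend_tasks_list(dst, src)
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```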

OpenStack Infra (hudson-openstack) wrote on 2016-09-13: Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/369528

OpenStack Infra (hudson-openstack) wrote on 2016-09-13: Fix proposed to fuel-web (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/369535

OpenStack Infra (hudson-openstack) wrote on 2016-09-14: Fix merged to fuel-web (stable/mitaka)

#10

Reviewed: https://review.openstack.org/369535
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=5801a22173378541dfc4e4bb85873571a042539a
Submitter: Jenkins
Branch: stable/mitaka

commit 5801a22173378541dfc4e4bb85873571a042539a
Author: Dmitry Guryanov <email address hidden>
Date: Tue Sep 13 18:24:27 2016 +0300

convert uids to list before passing to make_*_task

RoleResolver returns set of uids, but functions, which
make tasks want list.

Change-Id: I959a61a53ff55da400423ac871a2b61366c75f9a
Closes-Bug: #1606823
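
A minimal sketch of the conversion this commit message describes, under stated assumptions: `resolve_uids` and `make_task` are hypothetical stand-ins for the resolver and the `make_*_task` helpers, and `sorted()` is used here instead of a plain `list()` only to make the order deterministic in the example.

```python
def resolve_uids():
    # Stand-in for a resolver that returns node uids as a set.
    return {"3", "1", "2"}


def make_task(task_id, uids):
    # Stand-in for a task factory that expects uids to be a list.
    if not isinstance(uids, list):
        raise TypeError("uids must be a list")
    return {"id": task_id, "uids": uids}


# Convert the set to a list before handing it to the factory.
task = make_task("rsync_core_puppet", sorted(resolve_uids()))

# Later list operations such as extend() now work as expected.
task["uids"].extend(["9"])
```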

Vladimir Khlyunev (vkhlyunev) wrote on 2016-09-15:

#11

https://product-ci.infra.mirantis.net/view/upgrades/job/9.x.upgrades.ubuntu.upgrade_smoke_tests/29/console it scales! snapshot 265

OpenStack Infra (hudson-openstack) wrote on 2016-10-14: Fix included in openstack/fuel-web 10.0.0rc1

#12

This issue was fixed in the openstack/fuel-web 10.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote on 2016-11-08: Fix merged to fuel-web (master)

#13

Reviewed: https://review.openstack.org/369528
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=68c0d42ae03f137589d38c94b5e7274b8eec8525
Submitter: Jenkins
Branch: master

commit 68c0d42ae03f137589d38c94b5e7274b8eec8525
Author: Dmitry Guryanov <email address hidden>
Date: Tue Sep 13 18:24:27 2016 +0300

convert uids to list before passing to make_*_task

RoleResolver returns set of uids, but functions, which
make tasks want list.

Change-Id: I959a61a53ff55da400423ac871a2b61366c75f9a
Closes-Bug: #1606823

OpenStack Infra (hudson-openstack) wrote on 2016-12-07: Fix included in openstack/fuel-web 10.0.0

#14

This issue was fixed in the openstack/fuel-web 10.0.0 release.

OpenStack Infra (hudson-openstack) wrote on 2017-01-25: Fix proposed to fuel-web (stable/newton)

#16

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/425010

OpenStack Infra (hudson-openstack) wrote on 2017-02-09: Fix merged to fuel-web (stable/newton)

#17

Reviewed: https://review.openstack.org/425010
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=0b5b853317bc759d6dec338e72f2242ba76584be
Submitter: Jenkins
Branch: stable/newton

commit 0b5b853317bc759d6dec338e72f2242ba76584be
Author: Dmitry Guryanov <email address hidden>
Date: Tue Sep 13 18:24:27 2016 +0300

convert uids to list before passing to make_*_task

RoleResolver returns set of uids, but functions, which
make tasks want list.

    Change-Id: I959a61a53ff55da400423ac871a2b61366c75f9a
    Closes-Bug: #1606823
    (cherry picked from commit 68c0d42ae03f137589d38c94b5e7274b8eec8525)

tags:	added: in-stable-newton

OpenStack Infra (hudson-openstack) wrote on 2017-02-27: Fix included in openstack/fuel-web 11.0.0.0rc1

#18

This issue was fixed in the openstack/fuel-web 11.0.0.0rc1 release candidate.
