[Upgrade] After upgrading the Fuel master from 8.0 to 9.1, the old 8.0 cluster doesn't scale
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Fix Committed | High | Dmitry Guryanov |
Mitaka | Fix Released | High | Dmitry Guryanov |
Bug Description
Detailed bug description:
After upgrading the Fuel master node from 8.0 to 9.1 with an existing 8.0 cluster, further scaling of this cluster fails with the following error in the Puppet log:
Could not run: Could not find file /etc/puppet/
This happens because of
The root cause is that the task "rsync_core_puppet" is not executed, so the manifests are not delivered to the node being deployed.
It was found that granular_deploy modifies the graph when some nodes are affected by the deployment (e.g. when scaling a cluster):
2016-08-16 09:00:31.242 DEBUG [7f0057829880] (manager) There are nodes to deploy: node-9.
2016-08-16 09:00:31.285 DEBUG [7f0057829880] (manager) There are nodes affected by deployment: node-7.
2.test.
and this modification of the graph marks all non-reexecutable tasks as `skipped`:
...
2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_
2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_
2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_
2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_
2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_
2016-08-16 14:59:03.297 DEBUG [7f0057829880] (orchestrator_
2016-08-16 14:59:03.298 DEBUG [7f0057829880] (orchestrator_
...
After that, the graph is tailored to handle only the affected nodes and is therefore incorrect for the nodes that have to be deployed.
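The failure mode described above can be illustrated with a toy model. This is only a sketch of the mechanism, not nailgun's code: the `Task` and `Graph` classes, the `reexecutable` flag, the `only_reexecutable` method, the `plan` function and the task name `some_reexecutable_task` are all hypothetical names chosen for illustration. It assumes that `rsync_core_puppet` is non-reexecutable, as the skipped tasks in the log suggest.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    reexecutable: bool = False  # hypothetical flag: may run again on deployed nodes
    skipped: bool = False


@dataclass
class Graph:
    tasks: list = field(default_factory=list)

    def only_reexecutable(self):
        """Mark every non-reexecutable task as skipped, in place."""
        for task in self.tasks:
            if not task.reexecutable:
                task.skipped = True

    def runnable(self):
        """Names of the tasks that would still be executed."""
        return [t.name for t in self.tasks if not t.skipped]


def plan(graph, nodes_to_deploy, affected_nodes):
    """Toy planner: one graph object is shared by both node groups."""
    if affected_nodes:
        # Intended only for the affected nodes, but it mutates the shared graph.
        graph.only_reexecutable()
    return {node: graph.runnable() for node in nodes_to_deploy + affected_nodes}


graph = Graph([
    Task("rsync_core_puppet"),
    Task("some_reexecutable_task", reexecutable=True),
])
result = plan(graph, ["node-9"], ["node-7"])

# The new node loses rsync_core_puppet because the shared graph was mutated.
print(result["node-9"])
```

In this model the new node `node-9` ends up with the same reduced task list as the already deployed `node-7`, which matches the symptom of the manifests never being synced to the new node.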
Steps to reproduce:
1. Deploy 8.0 Fuel node
2. Create a cluster with 3 controller + 3 compute+ceph-osd nodes
3. Upgrade Fuel master node from 8.0 to 9.1 (http://
4. Add a node with the role "compute" to the created cluster
5. Deploy changes
Expected result:
Deployment succeeds
Actual result:
Deployment fails as rsync_core_puppet task does not run on the newly added node.
This happens due to the fact that pre_|post_
'post_
'pre_
It is also worth noting that the deployment succeeds if we provision the node first and deploy it afterwards.
Workaround:
The code can be modified to keep two graphs, but this causes the whole deployment graph to run on the affected nodes too:
diff --git a/nailgun/
index b5f3b6a..e985652 100644
--- a/nailgun/
+++ b/nailgun/
@@ -330,9 +330,10 @@ class DeploymentTask(
if affected_nodes:
- graph.reexecuta
+ reexec_graph = graph.copy()
+ reexec_
- graph, transaction.
+ reexec_graph, transaction.
))
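The idea behind the diff above, keeping the original graph intact and restricting only a copy, can be sketched in isolation. This is a simplified stand-in under stated assumptions: the dict-based graph, the `reexecutable` and `skipped` keys, the helper functions and the task name `some_reexecutable_task` are hypothetical, and `copy.deepcopy` is used here in place of whatever `graph.copy()` does in nailgun. The sketch does not model the side effect mentioned above, where the whole graph also runs on the affected nodes.

```python
import copy

# Toy graph: task name -> attributes. Names and flags are illustrative only.
full_graph = {
    "rsync_core_puppet": {"reexecutable": False, "skipped": False},
    "some_reexecutable_task": {"reexecutable": True, "skipped": False},
}


def skip_non_reexecutable(graph):
    """Mark non-reexecutable tasks as skipped in the given graph only."""
    for attrs in graph.values():
        if not attrs["reexecutable"]:
            attrs["skipped"] = True


def runnable(graph):
    """Sorted names of the tasks that are not skipped."""
    return sorted(name for name, attrs in graph.items() if not attrs["skipped"])


# Restrict a deep copy, so the full graph stays usable for new nodes.
reexec_graph = copy.deepcopy(full_graph)
skip_non_reexecutable(reexec_graph)

new_node_tasks = runnable(full_graph)         # tasks for nodes being deployed
affected_node_tasks = runnable(reexec_graph)  # tasks for already deployed nodes
print(new_node_tasks, affected_node_tasks)
```

A deep copy matters in this model because the task attributes are nested dicts: a shallow copy would still share them, and the `skipped` flags would leak back into the full graph.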
Octane's versions:
8.0.0-1.mos1192
9.0.0-1.mos1208
ISO's versions:
8.0 - http://
9.1 - http://
Changed in fuel: | |
importance: | Undecided → High |
description: | updated |
Changed in fuel: | |
status: | New → Confirmed |
Changed in fuel: | |
assignee: | nobody → Fuel Octane (fuel-octane-team) |
tags: | added: blocker-for-qa |
Changed in fuel: | |
assignee: | Fuel Octane (fuel-octane-team) → Ilya Kharin (akscram) |
tags: | added: area-python |
Changed in fuel: | |
assignee: | Fuel Sustaining (fuel-sustaining-team) → Dmitry Guryanov (dguryanov) |
Changed in fuel: | |
milestone: | 9.1 → 10.0 |
It seems that a regression was introduced in 9.0 in the granular_deployment method, which now improperly handles the deployment graph when there are affected nodes.