[SWARM][8.0] Provision fail in "Stop reset cluster" scenario on CentOS Bootstrap
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Fix Released | High | Anton Chevychalov | |
Bug Description
Steps to reproduce:
1. Choose CentOS bootstrap on master node
2. Bootstrap slaves
3. Verify bootstrap on slaves
4. Create cluster in HA mode with 1 controller
5. Add 1 node with controller role
6. Add 1 node with compute role
7. Verify network
8. Deploy cluster
9. Stop deployment
10. Verify bootstrap on slaves
11. Add 1 node with cinder role
12. Re-deploy cluster
13. Reset cluster
14. Verify bootstrap on slaves
15. Re-deploy cluster
or use deploy_
Test fails at line: https:/
Expected results: cluster deployment is successful
Actual result: Task 'deploy' has incorrect status. error != ready
Traceback: http://
Reproducibility: 6 times in a row on ci: https:/
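The "Actual result" message has the shape of a status assertion on the deploy task. A minimal Python sketch of such a check follows. The function name `assert_task_status` and the dictionary layout of the task are assumptions made for illustration, not the actual test framework code.

```python
def assert_task_status(task, expected="ready"):
    """Raise AssertionError if the task did not reach the expected status.

    `task` is a hypothetical dict with "name" and "status" keys.
    """
    actual = task["status"]
    if actual != expected:
        raise AssertionError(
            "Task '{0}' has incorrect status. {1} != {2}".format(
                task["name"], actual, expected
            )
        )
```

With a task in the `error` state, this raises an error with the same wording as the one reported above.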
Changed in fuel: | |
assignee: | nobody → MOS Maintenance (mos-maintenance) |
Changed in fuel: | |
milestone: | none → 8.0-updates |
Changed in fuel: | |
importance: | Undecided → High |
status: | New → Confirmed |
milestone: | 8.0-updates → 8.0-mu-3 |
Changed in fuel: | |
assignee: | MOS Maintenance (mos-maintenance) → Anton Chevychalov (achevychalov) |
Changed in fuel: | |
status: | Confirmed → In Progress |
Changed in fuel: | |
status: | In Progress → Fix Committed |
tags: | added: on-verification |
Bug confirmed on test environment.
Reasons: git.openstack.org/cgit/openstack/fuel-nailgun-agent/tree/agent?h=stable/8.0#n121) =bootstrap. That is used by nailgun-agent as an indicator of the "provisioned" state (http://git.openstack.org/cgit/openstack/fuel-nailgun-agent/tree/agent?h=stable/8.0#n846 http://git.openstack.org/cgit/openstack/fuel-nailgun-agent/tree/agent?h=stable/8.0#n892).
1. The message is shown because the nodes do not answer over RPC (RabbitMQ). (The UI shows information about the first affected node only.)
2. There is no answer because mcollective is not running on the affected nodes.
3. Mcollective is not up because nailgun-agent did not issue the command "service mcollective restart" (http://
5. The start is blocked by the hostname!
It is not clear at the moment why the node ends up in that state.
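The chain of reasons in this comment can be restated as a small decision model. This is only a Python sketch of the reasoning, not the nailgun-agent code (which is Ruby). It assumes that the agent compares the node hostname with the literal `bootstrap` and treats any other hostname as "provisioned". The names `looks_provisioned` and `explain_node` are hypothetical.

```python
# Assumption: a node in the bootstrap image is expected to have this hostname.
BOOTSTRAP_HOSTNAME = "bootstrap"


def looks_provisioned(hostname):
    """Model of the agent's check: any other hostname counts as provisioned."""
    return hostname != BOOTSTRAP_HOSTNAME


def explain_node(hostname):
    """Return the failure chain for a node, or an empty list if none applies."""
    if not looks_provisioned(hostname):
        # The agent would restart mcollective, so the node can answer over RPC.
        return []
    return [
        "agent treats the node as provisioned (hostname {0!r})".format(hostname),
        "agent does not issue 'service mcollective restart'",
        "mcollective is not running",
        "no answer over RPC, so the deploy task ends in error",
    ]
```

For example, `explain_node("bootstrap")` returns an empty list, while any other hostname returns the four-step chain. On a real node the hostname would have to be read from the system itself; that part is left out here.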