Deployment was stuck as one node was stuck on reboot
Bug #1438933 reported by Sergii Golovatiuk
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Fuel for OpenStack | Fix Committed | High | Łukasz Oleś | |
Bug Description
During a large deployment, one node got stuck on reboot for about 20 minutes:
root@node-16:~# uptime -s
2015-03-31 16:58:09
However, astute.log shows:
2015-03-31T18:27:32 debug: [535] 135c09a6-
2015-03-31T18:27:33 debug: [535] Retry #1 to run mcollective agent on nodes: '16'
which means the reboot was issued somewhere around 16:25-26.
We should tolerate slow reboots here, the way we already do for provisioning.
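One way to picture the kind of tolerance suggested here is a polling loop with a deadline: instead of failing as soon as a node is unreachable, keep checking until every node is back or a generous timeout expires. The sketch below is only an illustration and is not Astute code. `wait_for_reboot`, `has_rebooted`, and the default timeout and interval values are all hypothetical names and numbers chosen for the example.

```python
import time
from typing import Callable, Iterable, Set


def wait_for_reboot(
    nodes: Iterable[str],
    has_rebooted: Callable[[str], bool],  # hypothetical per-node check
    timeout: float = 1800.0,              # assumed value: 30 minutes
    interval: float = 10.0,               # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Set[str]:
    """Poll until every node has rebooted or the timeout expires.

    Returns the set of nodes that did not come back in time
    (empty if all of them did).
    """
    pending = set(nodes)
    deadline = clock() + timeout
    while pending:
        # Keep only the nodes that have not come back yet.
        pending = {n for n in pending if not has_rebooted(n)}
        if not pending or clock() >= deadline:
            break
        sleep(interval)
    return pending
```

A caller would treat a non-empty result as the list of nodes to report as failed, so a node that takes 20 minutes to reboot is tolerated as long as the timeout is longer than that.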
Changed in fuel:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Łukasz Oleś (loles)
milestone: none → 6.1
Changed in fuel:
status: Triaged → Won't Fix
status: Won't Fix → In Progress
tags: added: module-astute
Changed in fuel:
assignee: Łukasz Oleś (loles) → Evgeniy L (rustyrobot)
Changed in fuel:
assignee: Evgeniy L (rustyrobot) → Łukasz Oleś (loles)
Deployment fails if a pre_deployment_ action fails on any node. It doesn't fail during the pre_deploy action or during deploy. I will prepare a fix.