50 nodes bootstrap slow (tftp server problems)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Fix Committed | Medium | Vladimir Sharshov |
5.0.x | Fix Committed | Medium | Vladimir Sharshov |
Bug Description
Steps To Reproduce:
1. Create environment with 50 servers.
2. Deploy OpenStack.
3. Delete this environment.
Expected Result:
All 50 nodes return to the Fuel master node as slave nodes.
Observed Result:
We lose several servers and see only 33 or 48 of them. If we reboot the missing servers, they are added to Fuel again.
It looks like we have a tftp performance problem: the servers cannot all bootstrap from one Fuel master node in parallel.
Note:
This issue does not reproduce in small environments, or if we remove servers from the environment in several steps (for example, 10 servers per step).
In production environments we have more than 50 servers, so this issue will be critical for administrators who run large clouds.
How we can fix it:
1. We can remove nodes one by one with a timeout (1 second per server); as a result, the tftp service should work fine.
2. We can improve performance of tftp service.
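The first option above could look roughly like the following Python sketch. It is only an illustration of the idea, not Fuel or Astute code: `remove_nodes_staggered` and the `remove_node` callback are hypothetical names, and the 1-second default simply mirrors the timeout proposed in option 1.

```python
import time
from typing import Callable, Iterable, List


def remove_nodes_staggered(
    node_ids: Iterable[str],
    remove_node: Callable[[str], None],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Remove nodes one at a time, pausing between removals.

    `remove_node` is a hypothetical callback that triggers removal (and
    therefore a reboot into bootstrap) of a single node. Spacing the calls
    out is meant to keep all nodes from hitting the tftp server at once.
    """
    removed: List[str] = []
    for index, node_id in enumerate(node_ids):
        if index > 0:
            # Pause before every removal except the first one.
            sleep(delay_seconds)
        remove_node(node_id)
        removed.append(node_id)
    return removed
```

With 50 nodes and a 1-second delay, this adds 49 seconds to the removal, in exchange for spreading the bootstrap requests over time.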
Changed in fuel:
assignee: nobody → Vladimir Sharshov (vsharshov)
Changed in fuel:
status: Confirmed → In Progress
My suggestion is to remove 5-10 nodes per cycle, without sleeps: https://github.com/stackforge/fuel-astute/blob/master/lib/astute/config.rb#L74
We also need to make this parameter configurable, as was done for node deployment.
https:/
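The batching suggestion in this comment can be sketched in Python as follows. This is an assumption-laden illustration rather than the Astute implementation: `DEFAULT_MAX_NODES_PER_REMOVE_CALL`, `split_into_batches`, `remove_nodes_in_batches`, and the `remove_batch` callback are all hypothetical names, and the default of 10 is taken from the upper end of the 5-10 range mentioned above.

```python
from typing import Callable, List, Sequence

# Hypothetical default; in practice this would come from configuration.
DEFAULT_MAX_NODES_PER_REMOVE_CALL = 10


def split_into_batches(node_ids: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split node IDs into consecutive batches of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        list(node_ids[start:start + batch_size])
        for start in range(0, len(node_ids), batch_size)
    ]


def remove_nodes_in_batches(
    node_ids: Sequence[str],
    remove_batch: Callable[[List[str]], None],
    batch_size: int = DEFAULT_MAX_NODES_PER_REMOVE_CALL,
) -> int:
    """Remove nodes batch by batch, with no sleep between batches.

    `remove_batch` is a hypothetical callback assumed to block until the
    whole batch has been removed, so at most `batch_size` nodes reboot
    into bootstrap at the same time. Returns the number of batches.
    """
    batches = split_into_batches(node_ids, batch_size)
    for batch in batches:
        remove_batch(batch)
    return len(batches)
```

For the 50-node environment from the report, a batch size of 10 gives 5 removal cycles; making `batch_size` a parameter is what lets operators tune it for their own tftp capacity.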