Fuel for OpenStack

50 nodes bootstrap slow (tftp server problems)

Bug #1330938 reported by Timur Nurlygayanov on 2014-06-17

This bug affects 1 person

Affects             Status         Importance  Assigned to        Milestone
Fuel for OpenStack  Fix Committed  Medium      Vladimir Sharshov  Fuel for OpenStack 5.1
5.0.x               Fix Committed  Medium      Vladimir Sharshov  Fuel for OpenStack 5.0.1

Bug Description

Steps To Reproduce:
1. Create environment with 50 servers.
2. Deploy the OpenStack.
3. Delete this environment.

Expected Result:
All 50 nodes return to the Fuel master node as slave nodes.

Observed Result:
We lose several servers and can see only 33 or 48 of the 50. If we reboot the missing servers, they are added to Fuel again.
It looks like a tftp performance problem: the servers cannot all bootstrap from a single Fuel master node in parallel.

Note:
This issue does not reproduce with small environments, or if we remove servers from the environment in several steps (for example, 10 servers per step).

In our production environment we have more than 50 servers, so this issue will be critical for administrators who run large clouds.

How we can fix it:
1. We can remove nodes one by one with a timeout (1 second per server), so that the tftp service can keep up.
2. We can improve the performance of the tftp service.

Mike Scherbakov (mihgen) on 2014-06-18

Changed in fuel:
assignee:	nobody → Vladimir Sharshov (vsharshov)

Evgeniy L (rustyrobot) wrote on 2014-06-18:

My suggestion is to remove 5-10 nodes per cycle, without sleeps.
We also need to make this parameter configurable, as was done for node deployment:
https://github.com/stackforge/fuel-astute/blob/master/lib/astute/config.rb#L74

Vladimir Sharshov (vsharshov) wrote on 2014-06-30:

If we remove nodes without a sleep, we will get the same problem: the remove operation takes very little time, so Cobbler will be loaded just as heavily as before.

I suggest also adding a second parameter, remove_interval: the number of seconds to wait between remove operations, with a default of 10 seconds. The fifth group of ten would then start 40 seconds after the first, which can solve this problem.
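
To make the timing concrete, here is a minimal Ruby sketch of the schedule this suggestion implies. The helper name removal_offsets and the group_size parameter are hypothetical (the comment names only remove_interval), and the sketch assumes each group is started one interval after the previous one.

```ruby
# Hypothetical helper, not part of fuel-astute: returns the start offset
# (in seconds) of each removal group, assuming groups start `interval`
# seconds apart and hold at most `group_size` nodes.
def removal_offsets(node_count, group_size: 10, interval: 10)
  group_count = (node_count + group_size - 1) / group_size # ceiling division
  (0...group_count).map { |index| index * interval }
end

p removal_offsets(50) # => [0, 10, 20, 30, 40]
```

With 50 nodes and the suggested defaults there are five groups, and the last one starts 40 seconds after the first.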

Vladimir Sharshov (vsharshov) on 2014-06-30

Changed in fuel:
status:	Confirmed → In Progress

Vladimir Sharshov (vsharshov) wrote on 2014-07-01:

https://review.openstack.org/#/c/103518/

Timur Nurlygayanov, can you help with a test?

Timur Nurlygayanov (tnurlygayanov) wrote on 2014-07-01:

We will test it when we use Fuel 5.0 with the MOX project.

OpenStack Infra (hudson-openstack) wrote on 2014-07-01: Fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/103518
Committed: https://git.openstack.org/cgit/stackforge/fuel-astute/commit/?id=2b3a8592cc71c6883d82f5bc4820641fba9292a2
Submitter: Jenkins
Branch: master

commit 2b3a8592cc71c6883d82f5bc4820641fba9292a2
Author: Vladimir Sharshov <email address hidden>
Date: Mon Jun 30 13:56:42 2014 +0400

Avoid high load for Cobbler TFTP when delete many nodes.

    Delete operation does not have limit for nodes, but it
    erase disks and reboot nodes after which they try to boot
    using network. Cobbler installed at master node provide
    ability to boot using network and have limited resources.

    To prevent high load for Cobbler this changes add two things:
    * split all nodes to groups (default - 10) which process
    in series;
    * wait some time (default - 10 sec) between such groups.

Both of this parameters can be changed in config.

Closes-Bug: #1330938

Change-Id: I9c3af6e8ab3c7c610e31baa6e58ec86aae20708d

Changed in fuel:
status:	In Progress → Fix Committed
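
The approach described in the commit message above (split nodes into groups, process the groups in series, wait between them) can be sketched roughly as follows. This is an illustration only, not the merged fuel-astute code: the method name remove_in_groups, its keyword arguments, and the injectable sleeper are all assumptions made for the example; only the defaults of 10 nodes and 10 seconds come from the commit message.

```ruby
# Hypothetical sketch: hand nodes to the caller's block in fixed-size
# groups, pausing between groups so the boot server is not hit by all
# rebooting nodes at once. `sleeper` is injectable so the wait can be
# stubbed out in tests.
def remove_in_groups(nodes, group_size: 10, interval: 10, sleeper: Kernel.method(:sleep))
  groups = nodes.each_slice(group_size).to_a
  groups.each_with_index do |group, index|
    yield group
    # No need to wait after the final group.
    sleeper.call(interval) if index < groups.size - 1
  end
  groups.size
end

# Usage with a stand-in for the real removal call; interval 0 keeps the demo instant.
remove_in_groups((1..50).to_a, interval: 0) do |group|
  puts "removing nodes #{group.first}..#{group.last}"
end
```

Both values are keyword arguments here to mirror the commit's note that they can be changed in config; how the real configuration is wired is not shown in this thread.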

OpenStack Infra (hudson-openstack) wrote on 2014-07-08: Fix proposed to fuel-astute (stable/5.0)

Fix proposed to branch: stable/5.0
Review: https://review.openstack.org/105459

OpenStack Infra (hudson-openstack) wrote on 2014-07-08: Fix merged to fuel-astute (stable/5.0)

Reviewed: https://review.openstack.org/105459
Committed: https://git.openstack.org/cgit/stackforge/fuel-astute/commit/?id=741b4fa9b964d73d9ab8f2fbcf5bb02836c98da1
Submitter: Jenkins
Branch: stable/5.0

commit 741b4fa9b964d73d9ab8f2fbcf5bb02836c98da1
Author: Vladimir Sharshov <email address hidden>
Date: Mon Jun 30 13:56:42 2014 +0400

Avoid high load for Cobbler TFTP when delete many nodes.

    To prevent high load for Cobbler this changes add two things:
    * split all nodes to groups (default - 10) which process
    in series;
    * wait some time (default - 10 sec) between such groups.

Both of this parameters can be changed in config.

Closes-Bug: #1330938
Closes-Bug: #1339024

Backport from 5.1

Change-Id: I9c3af6e8ab3c7c610e31baa6e58ec86aae20708d

OpenStack Infra (hudson-openstack) wrote on 2015-04-28: Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/178119

OpenStack Infra (hudson-openstack) wrote on 2015-07-28: Change abandoned on fuel-library (master)

Change abandoned by Igor Shishkin (<email address hidden>) on branch: master
Review: https://review.openstack.org/178119
Reason: This review is > 4 weeks without comment and currently blocked by a core reviewer with a -2. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and contacting the reviewer with the -2 on this review to ensure you address their concerns.

OpenStack Infra (hudson-openstack) wrote on 2015-09-29:

Change abandoned by Fuel DevOps Robot (<email address hidden>) on branch: master
Review: https://review.openstack.org/178119
Reason: This review is > 4 weeks without comment and currently blocked by a core reviewer with a -2. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and contacting the reviewer with the -2 on this review to ensure you address their concerns.
