LACP bond is in down state after deployment

Bug #1566974 reported by Leontiy Istomin
This bug affects 2 people
Affects             Status        Importance  Assigned to        Milestone
Fuel for OpenStack  Fix Released  High        Aleksandr Didenko
8.0.x               Fix Released  High        Alexey Stupnikov
Mitaka              Fix Released  High        Aleksandr Didenko

Bug Description

Detailed bug description:
Deployment failed while verifying the Ceph deployment step: http://paste.openstack.org/show/493192/
ceph osd tree output: http://paste.openstack.org/show/493193/
bond0 was found in the down state, although its bonded interfaces were up: http://paste.openstack.org/show/493195/
Running "ip link show bond0" didn't help.
Resetting the state of the bonded interfaces did: http://paste.openstack.org/show/493197/
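
For anyone triaging the same symptom (bond down, slaves up), here is a minimal Python sketch of the check. It is not part of the original report. It assumes the standard Linux sysfs layout (/sys/class/net/<iface>/operstate and /sys/class/net/<bond>/bonding/slaves); the function name and the sysfs_root parameter are illustrative only.

```python
from pathlib import Path


def bond_down_with_slaves_up(bond="bond0", sysfs_root="/sys/class/net"):
    """Return True if the bond is not 'up' while at least one slave is 'up'."""
    root = Path(sysfs_root)

    def operstate(iface):
        # operstate holds a single word such as "up", "down" or "unknown"
        return (root / iface / "operstate").read_text().strip()

    # bonding/slaves is a space-separated list of slave interface names
    slaves = (root / bond / "bonding" / "slaves").read_text().split()
    bond_up = operstate(bond) == "up"
    return (not bond_up) and any(operstate(s) == "up" for s in slaves)
```

The sysfs_root parameter exists only so the check can be pointed at a copy of the tree instead of a live node.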

Steps to reproduce:
The environment with id=1 was reset and deployed again.

Expected results:
The environment is deployed successfully.

Actual result:
Deployment failed.

Reproducibility:
Didn't try to reproduce

Workaround:
It seems possible to reset the state of the bonded interfaces and then click "Deploy changes".
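
The workaround could be scripted roughly as below. This is only a sketch: the exact commands used for the reset are in the paste linked in the description and are not reproduced here, so cycling each slave with "ip link set dev <iface> down" followed by "up" is an assumption, and the interface names are placeholders.

```python
import subprocess


def reset_commands(slaves):
    """Build 'ip link set' commands that take each slave down and back up."""
    cmds = []
    for iface in slaves:
        cmds.append(["ip", "link", "set", "dev", iface, "down"])
        cmds.append(["ip", "link", "set", "dev", iface, "up"])
    return cmds


def reset_slaves(slaves, dry_run=True):
    """Print the commands, or run them (needs root) when dry_run is False."""
    for cmd in reset_commands(slaves):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling reset_slaves(["eth0", "eth1"]) with the default dry_run=True only prints the commands, which makes it easy to review them before running anything on a node.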

Impact:
Deployment with bonded interfaces

Description of the environment:
 Operating system: Ubuntu
 Versions of components: MOS8.0
 Reference architecture: 3 controllers, 7 computes+ceph
 Network model: vxlan segmentation
 Related projects installed: ironic and elasticsearch plugins are installed, but not used in the environment
Additional information:
nailgun-agent and fuel-agent were changed in this environment because of the following bugs:
https://bugs.launchpad.net/fuel/+bug/1543233
https://bugs.launchpad.net/fuel/+bug/1543221
[root@fuel ~]# cat /etc/fuel/8.0/version.yaml
VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  api: "1.0"
  build_number: "570"
  build_id: "570"
  fuel-nailgun_sha: "558ca91a854cf29e395940c232911ffb851899c1"
  python-fuelclient_sha: "4f234669cfe88a9406f4e438b1e1f74f1ef484a5"
  fuel-agent_sha: "658be72c4b42d3e1436b86ac4567ab914bfb451b"
  fuel-nailgun-agent_sha: "b2bb466fd5bd92da614cdbd819d6999c510ebfb1"
  astute_sha: "b81577a5b7857c4be8748492bae1dec2fa89b446"
  fuel-library_sha: "c2a335b5b725f1b994f78d4c78723d29fa44685a"
  fuel-ostf_sha: "3bc76a63a9e7d195ff34eadc29552f4235fa6c52"
  fuel-mirror_sha: "fb45b80d7bee5899d931f926e5c9512e2b442749"
  fuelmenu_sha: "78ffc73065a9674b707c081d128cb7eea611474f"
  shotgun_sha: "63645dea384a37dde5c01d4f8905566978e5d906"
  network-checker_sha: "a43cf96cd9532f10794dce736350bf5bed350e9d"
  fuel-upgrade_sha: "616a7490ec7199f69759e97e42f9b97dfc87e85b"
  fuelmain_sha: "d605bcbabf315382d56d0ce8143458be67c53434"

Logs from the affected nodes (63 and 144) and from the Fuel node are here:
http://mos-scale-share.mirantis.com/node-144_etc.tar.gz
http://mos-scale-share.mirantis.com/node-144_logs.tar.gz
http://mos-scale-share.mirantis.com/node-63_etc.tar.gz
http://mos-scale-share.mirantis.com/node-63_logs.tar.gz
http://mos-scale-share.mirantis.com/fuel_logs.tar.gz

tags: added: area-library feature-advanced-networking
Changed in fuel:
milestone: none → 9.0
assignee: nobody → Fuel Library Team (fuel-library)
importance: Undecided → High
status: New → Confirmed
tags: added: team-network
Leontiy Istomin (listomin) wrote :

Reproduced. The environment is described in this bug: https://bugs.launchpad.net/fuel/+bug/1569234
Nodes that have bond0 in the down state: http://paste.openstack.org/show/493730/

Leontiy Istomin (listomin) wrote :

Output of "cat /proc/net/bonding/bond0" from the affected node-41: http://paste.openstack.org/show/493753/
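
To compare the bond-level and per-slave MII status in that kind of output, a small parser along these lines may help. It is a sketch that relies only on the "Slave Interface:" and "MII Status:" lines of the usual /proc/net/bonding format. The SAMPLE text is made up for illustration and is not the content of the paste.

```python
def parse_bonding_status(text):
    """Return (bond_mii_status, {slave: mii_status}) from /proc/net/bonding text."""
    bond_status = None
    slaves = {}
    current = None  # None while still in the bond-level section
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank lines and lines without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "MII Status":
            if current is None:
                bond_status = value
            else:
                slaves[current] = value
    return bond_status, slaves


# Illustrative input only, not taken from the affected nodes.
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: down
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 200

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: up
"""

bond, slaves = parse_bonding_status(SAMPLE)
if bond != "up" and "up" in slaves.values():
    print("bond is down although some slaves are up:", sorted(slaves))
```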

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/304728

Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Aleksandr Didenko (adidenko)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/304728
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=28d225069f7a038b9068d1237e062a2f8b2bb453
Submitter: Jenkins
Branch: master

commit 28d225069f7a038b9068d1237e062a2f8b2bb453
Author: Aleksandr Didenko <email address hidden>
Date: Tue Apr 12 17:59:40 2016 +0200

    Increase default updelay and downdelay for bonds

    Default 200ms value may be a bit too low for some network/switches
    configurations. Increasing it to 3 sec for 'updelay' and to 1 sec
    for 'downdelay' to avoid intermittent issues with bonds (especially
    with LACP).

    Doc-Impact

    Change-Id: Ia614448f1bef1e7c4ccdfa2a8ea77c0b259b4474
    Closes-bug: #1566974
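
To put the new defaults in context: the Linux bonding driver takes updelay and downdelay in milliseconds and, according to the kernel bonding documentation, rounds them down to a multiple of miimon. The sketch below only illustrates that arithmetic. The miimon value of 100 ms is an assumption made for the example and is not stated in the commit.

```python
def effective_delay_ms(delay_ms, miimon_ms):
    """Round a bonding delay down to the nearest multiple of miimon."""
    if miimon_ms <= 0:
        raise ValueError("miimon must be positive for updelay/downdelay to apply")
    return (delay_ms // miimon_ms) * miimon_ms


MIIMON_MS = 100  # assumed polling interval, not taken from the commit
OLD = {"updelay": 200, "downdelay": 200}    # previous 200 ms default
NEW = {"updelay": 3000, "downdelay": 1000}  # 3 s and 1 s from the commit

for name in ("updelay", "downdelay"):
    polls = effective_delay_ms(NEW[name], MIIMON_MS) // MIIMON_MS
    print(f"{name}: {OLD[name]} ms -> {NEW[name]} ms ({polls} polls)")
```

Under that assumed miimon, a slave link has to stay up for 30 consecutive polls before the bond uses it, instead of 2 with the old default.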

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/305812

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/mitaka)

Reviewed: https://review.openstack.org/305812
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=35a1878a02e5f8c08e2b52dd8f2b7fbc2296e9ee
Submitter: Jenkins
Branch: stable/mitaka

commit 35a1878a02e5f8c08e2b52dd8f2b7fbc2296e9ee
Author: Aleksandr Didenko <email address hidden>
Date: Tue Apr 12 17:59:40 2016 +0200

    Increase default updelay and downdelay for bonds

    Default 200ms value may be a bit too low for some network/switches
    configurations. Increasing it to 3 sec for 'updelay' and to 1 sec
    for 'downdelay' to avoid intermittent issues with bonds (especially
    with LACP).

    Doc-Impact

    Change-Id: Ia614448f1bef1e7c4ccdfa2a8ea77c0b259b4474
    Closes-bug: #1566974
    (cherry picked from commit 28d225069f7a038b9068d1237e062a2f8b2bb453)

Dmitry Pyzhov (dpyzhov)
no longer affects: fuel/newton
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/8.0)

Fix proposed to branch: stable/8.0
Review: https://review.openstack.org/319341

Alexey Stupnikov (astupnikov) wrote :

There is a complication here: this patch can't be tested in a virtual environment. However, Leontiy Istomin confirmed that the updelay and downdelay changes fix the issue completely on Mitaka, so it is possible to backport the patch to MOS8. Testing will be done in the scale lab in June, when the MOS8 scale tests are scheduled.

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/8.0)

Reviewed: https://review.openstack.org/319341
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=ae4add6843475a5f3b0274410327fa55817e0859
Submitter: Jenkins
Branch: stable/8.0

commit ae4add6843475a5f3b0274410327fa55817e0859
Author: Aleksandr Didenko <email address hidden>
Date: Tue Apr 12 17:59:40 2016 +0200

    Increase default updelay and downdelay for bonds

    Default 200ms value may be a bit too low for some network/switches
    configurations. Increasing it to 3 sec for 'updelay' and to 1 sec
    for 'downdelay' to avoid intermittent issues with bonds (especially
    with LACP).

    Doc-Impact

    Change-Id: Ia614448f1bef1e7c4ccdfa2a8ea77c0b259b4474
    Closes-bug: #1566974
    (cherry picked from commit 28d225069f7a038b9068d1237e062a2f8b2bb453)

tags: added: on-verification
Kyrylo Galanov (kgalanov) wrote :

Verified on ISO #465 RC1

Changed in fuel:
status: Fix Committed → Fix Released