Network verification fails with balance round-robin bonds
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Fix Committed | High | Artem Roma |
Mitaka | Fix Released | High | Vladimir Kuklin |
Bug Description
Network verification fails if the 'public' network is assigned to a network bond in 'balance-rr' mode:
Verification failed.
Repo availability verification using public network failed on following nodes Untitled (be:be).
Following repos are not available - http://
. Check your public network settings and availability of the repositories from public network. Please examine nailgun and astute logs for additional details.
When several network interfaces are bonded, the network checker uses only one of the bond's slaves while testing connectivity to repositories:
D, [2016-02-
That works for bonds in 'active-backup' mode, but in round-robin mode the ports on the network switch are aggregated too, so incoming traffic is balanced between the interfaces. That's why most packets are lost if you try to use only one slave NIC on the target node (4 interfaces are bonded):
root@bootstrap:~# ip a add 10.109.1.2/24 dev enp0s5
root@bootstrap:~# ip link set enp0s5 up
root@bootstrap:~# ping 10.109.1.1
PING 10.109.1.1 (10.109.1.1) 56(84) bytes of data.
64 bytes from 10.109.1.1: icmp_seq=5 ttl=64 time=0.166 ms
64 bytes from 10.109.1.1: icmp_seq=9 ttl=64 time=0.186 ms
^C
--- 10.109.1.1 ping statistics ---
12 packets transmitted, 2 received, 83% packet loss, time 11072ms
rtt min/avg/max/mdev = 0.166/0.
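The loss seen in the ping output above is roughly what one would expect when listening on a single slave. A minimal sketch of the arithmetic, assuming the switch spreads incoming frames perfectly evenly across all aggregated ports (an idealization; the function names are illustrative and not part of the network checker):

```python
def expected_loss_single_slave(num_slaves: int) -> float:
    """Fraction of incoming packets that miss one given slave.

    Assumes the switch distributes frames evenly over num_slaves ports.
    """
    if num_slaves < 1:
        raise ValueError("a bond needs at least one slave")
    return 1.0 - 1.0 / num_slaves


def observed_loss(transmitted: int, received: int) -> float:
    """Packet loss fraction as reported in ping statistics."""
    if transmitted <= 0:
        raise ValueError("transmitted must be positive")
    return (transmitted - received) / transmitted


# 4 bonded interfaces, only one of them configured with an address.
print(f"expected loss: {expected_loss_single_slave(4):.0%}")
# Numbers taken from the ping statistics in this report.
print(f"observed loss: {observed_loss(12, 2):.0%}")
```

With 4 slaves the idealized estimate is 75% loss, and the 12-packet sample above shows 83%, which is consistent for such a small sample.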
I think we have to disable network checks for balance-rr bonds like we previously did for LACP (see bug #1376908).
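The proposed workaround could look roughly like the sketch below. It is only an illustration: the dictionary keys (`name`, `type`, `mode`), the function name, and the set of skipped modes are assumptions made here and do not reflect nailgun's actual data model or the committed fix. `802.3ad` is the Linux bonding mode name for LACP.

```python
# Assumption: bond modes in which the switch aggregates its ports too,
# so traffic cannot be tested reliably through a single slave NIC.
SWITCH_AGGREGATED_MODES = frozenset({"balance-rr", "802.3ad"})


def split_by_checkability(interfaces):
    """Split interfaces into (checkable, skipped) lists of names.

    Each interface is a dict with the hypothetical keys 'name', 'type'
    and, for bonds, 'mode'. This is not nailgun's real schema.
    """
    checkable, skipped = [], []
    for iface in interfaces:
        is_bond = iface.get("type") == "bond"
        if is_bond and iface.get("mode") in SWITCH_AGGREGATED_MODES:
            skipped.append(iface["name"])
        else:
            checkable.append(iface["name"])
    return checkable, skipped


node_interfaces = [
    {"name": "eth0", "type": "ether"},
    {"name": "bond0", "type": "bond", "mode": "active-backup"},
    {"name": "bond1", "type": "bond", "mode": "balance-rr"},
]
checkable, skipped = split_by_checkability(node_interfaces)
print("verify:", checkable)
print("skip:", skipped)
```

Skipping the check trades coverage for fewer false failures: bonds in the skipped modes are simply not verified, so misconfiguration on them would go unnoticed until deployment.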
Steps to reproduce:
1. Prepare ports on network switch (aggregate appropriate links, enable round-robin mode)
2. Create new environment, choose Neutron + VXLAN
3. Add 3 controller and 2 compute+ceph nodes
4. Configure network bonds for all nodes using 'balance-rr' mode, assign public network to them
5. Verify networks
Expected result: verification passes
Actual result: verification fails, repositories aren't accessible via the public network
Changed in fuel:
status: New → Confirmed
tags: added: area-python team-network
tags: added: keep-in-9.0
Changed in fuel:
assignee: Fuel Python Team (fuel-python) → Artem Roma (aroma-x)
tags: added: move-to-mu
Changed in fuel:
status: Fix Committed → Confirmed
milestone: 9.0 → 9.0-updates
Changed in fuel:
milestone: 9.1 → 9.2
Changed in fuel:
milestone: 9.2 → 10.1
status: Confirmed → Fix Committed
Round-robin bonds are an important configuration for our users. Changing priority to High because, without a working network checker, the only way to check the feature's health is to run a manual deployment.