[EVPN VXLAN] Multi Homing: Traffic from BMS to VM dropped at compute for Invalid NH
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Juniper Openstack | Status tracked in Trunk | | | |
R4.1 | Fix Committed | Critical | Divakar Dharanalakota | |
Trunk | Fix Committed | Critical | Divakar Dharanalakota | |
Bug Description
Trying this with a private agent binary for L2 ECMP.
Description:
When the BMS MAC has a composite nexthop, ICMP traffic from the BMS to the VM is dropped at the respective compute node due to Invalid NH. This happens only while the composite nexthop is programmed. Eventually the agent programs the VTEP IP of the QFX where the MAC is locally learned, and traffic resumes.
Steps to reproduce:
Step 1: Initially traffic is fine. Only the VTEP from which the traffic is arriving is programmed in the agent.
root@5b11s15:~# tcpdump -ni ens2f1 udp port 4789
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens2f1, link-type EN10MB (Ethernet), capture size 262144 bytes
13:25:19.465489 IP 172.16.2.1.52689 > 172.16.
IP 1.1.1.6 > 1.1.1.10: ICMP echo request, id 3535, seq 9, length 64
13:25:19.465708 IP 172.16.
IP 1.1.1.10 > 1.1.1.6: ICMP echo reply, id 3535, seq 9, length 64
13:25:20.465482 IP 172.16.2.1.52689 > 172.16.
IP 1.1.1.6 > 1.1.1.10: ICMP echo request, id 3535, seq 10, length 64
root@5b11s15:~# rt --dump 2 --family bridge
Flags: L=Label Valid, Df=DHCP flood, Mm=Mac Moved, L2c=L2 Evpn Control Word, N=New Entry, Ec=EvpnControlP
vRouter bridge table 0/2
Index DestMac Flags Label/VNID Nexthop Stats
31264 0:0:5e:0:1:0 Df - 3 15942
40992 90:e2:ba:c4:2e:6c LDf 4 29 75
54820 2:72:59:8:e6:1 - 31 13314950
112924 ff:ff:ff:ff:ff:ff LDf 4 37 1075
170732 2:62:15:24:9c:d6 LDf 4 19 0
229640 90:e2:ba:a7:30:ad Df - 3 0
root@5b11s15:~# nh --get 29
Id:29 Type:Tunnel Fmly: AF_INET Rid:0 Ref_cnt:3 Vrf:0
Oif:0 Len:14 Data:84 b5 9c c8 00 00 90 e2 ba a7 30 ad 08 00
root@5b11-qfx2# run show ethernet-switching table
MAC flags (S - static MAC, D - dynamic MAC, L - locally learned, P - Persistent static
SE - statistics enabled, NM - non configured MAC, R - remote PE MAC, O - ovsdb MAC)
Ethernet switching table : 3 entries, 3 learned
Routing instance : default-switch
Vlan MAC MAC Logical Active
name address flags interface source
contrail_
contrail_
contrail_
Ethernet switching table : 3 entries, 3 learned
Routing instance : default-switch
Vlan MAC MAC Logical Active
name address flags interface source
contrail_
contrail_
contrail_
Step 2: Disable the active interface on QFX2, so that the BMS MAC is learned only on 172.16.3.1. The agent programs the nexthop accordingly and traffic continues.
{master:0}[edit]
root@5b11-qfx2# set interfaces xe-0/0/46 disable
root@5b11s15:~# rt --dump 2 --family bridge
Flags: L=Label Valid, Df=DHCP flood, Mm=Mac Moved, L2c=L2 Evpn Control Word, N=New Entry, Ec=EvpnControlP
vRouter bridge table 0/2
Index DestMac Flags Label/VNID Nexthop Stats
31264 0:0:5e:0:1:0 Df - 3 15966
40992 90:e2:ba:c4:2e:6c LDf 4 18 76
54820 2:72:59:8:e6:1 - 31 13315203
112924 ff:ff:ff:ff:ff:ff LDf 4 34 1075
170732 2:62:15:24:9c:d6 LDf 4 19 0
229640 90:e2:ba:a7:30:ad Df - 3 0
root@5b11s15:~# nh --get 18
Id:18 Type:Tunnel Fmly: AF_INET Rid:0 Ref_cnt:3 Vrf:0
Oif:0 Len:14 Data:84 b5 9c c8 00 00 90 e2 ba a7 30 ad 08 00
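To make the nexthop change between Step 1 and Step 2 easier to spot, the two bridge-table dumps can be compared programmatically. The following is a minimal sketch, not part of the agent or vRouter tooling: the function names `parse_bridge_table` and `changed_nexthops` are hypothetical, and the parser assumes the column layout seen in the `rt --dump 2 --family bridge` output in this report (rows with an empty Flags column have five fields instead of six).

```python
def parse_bridge_table(text):
    """Map DestMac -> (label, nexthop) from bridge-table dump text."""
    entries = {}
    for line in text.splitlines():
        tokens = line.split()
        # Data rows start with a numeric index; skip legend and header lines.
        if len(tokens) < 5 or not tokens[0].isdigit():
            continue
        mac = tokens[1]
        # Count from the right so rows with an empty Flags column still parse.
        label, nexthop = tokens[-3], int(tokens[-2])
        entries[mac] = (label, nexthop)
    return entries


def changed_nexthops(before, after):
    """Return {mac: (old, new)} for MACs whose label or nexthop changed."""
    changes = {}
    for mac, new in after.items():
        old = before.get(mac)
        if old is not None and old != new:
            changes[mac] = (old, new)
    return changes


# Rows copied from the Step 1 and Step 2 dumps in this report.
STEP1 = """\
Index DestMac Flags Label/VNID Nexthop Stats
31264 0:0:5e:0:1:0 Df - 3 15942
40992 90:e2:ba:c4:2e:6c LDf 4 29 75
54820 2:72:59:8:e6:1 - 31 13314950
112924 ff:ff:ff:ff:ff:ff LDf 4 37 1075
"""
STEP2 = """\
Index DestMac Flags Label/VNID Nexthop Stats
31264 0:0:5e:0:1:0 Df - 3 15966
40992 90:e2:ba:c4:2e:6c LDf 4 18 76
54820 2:72:59:8:e6:1 - 31 13315203
112924 ff:ff:ff:ff:ff:ff LDf 4 34 1075
"""

changes = changed_nexthops(parse_bridge_table(STEP1), parse_bridge_table(STEP2))
for mac, (old, new) in sorted(changes.items()):
    print(f"{mac}: label/nh {old} -> {new}")
```

With the rows above, the sketch reports 90:e2:ba:c4:2e:6c moving from nexthop 29 to 18 and the broadcast entry moving from 37 to 34.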
Step 3: Now enable the interface on QFX2. For some time, the agent programs both QFXs for the BMS MAC. As long as the BMS MAC points to a composite nexthop, traffic is dropped at the compute node for Invalid NH. Eventually the agent again programs only the active VTEP QFX, and traffic resumes.
root@5b11s15:~# rt --dump 2 --family bridge
Flags: L=Label Valid, Df=DHCP flood, Mm=Mac Moved, L2c=L2 Evpn Control Word, N=New Entry, Ec=EvpnControlP
vRouter bridge table 0/2
Index DestMac Flags Label/VNID Nexthop Stats
31264 0:0:5e:0:1:0 Df - 3 16001
40992 90:e2:ba:c4:2e:6c LDf -1 33 428
54820 2:72:59:8:e6:1 - 31 13315557
112924 ff:ff:ff:ff:ff:ff LDf 4 36 1075
170732 2:62:15:24:9c:d6 LDf 4 19 0
229640 90:e2:ba:a7:30:ad Df - 3 0
root@5b11s15:~# nh --get 33
Id:33 Type:Composite Fmly: AF_INET Rid:0 Ref_cnt:2 Vrf:2
Valid Hash Key Parameters: Proto,SrcIP,
Sub NH(label): 29(4) 18(4)
Id:29 Type:Tunnel Fmly: AF_INET Rid:0 Ref_cnt:3 Vrf:0
Oif:0 Len:14 Data:84 b5 9c c8 00 00 90 e2 ba a7 30 ad 08 00
Id:18 Type:Tunnel Fmly: AF_INET Rid:0 Ref_cnt:3 Vrf:0
Oif:0 Len:14 Data:84 b5 9c c8 00 00 90 e2 ba a7 30 ad 08 00
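The problematic window in Step 3 is the period during which the BMS MAC points at a Composite nexthop. As a rough aid for spotting that state in captured output, the sketch below extracts the nexthop type and the sub-nexthop list from `nh --get` text. The function name `parse_nh` is hypothetical, and the regular expressions assume only the `Id:`/`Type:` header and `Sub NH(label):` line formats shown in this report.

```python
import re


def parse_nh(text):
    """Return (nh_id, nh_type, [(sub_nh_id, label), ...]) for the first nexthop."""
    header = re.search(r"Id:(\d+)\s+Type:(\w+)", text)
    if header is None:
        raise ValueError("no nexthop header found")
    nh_id, nh_type = int(header.group(1)), header.group(2)

    subs = []
    sub_line = re.search(r"Sub NH\(label\):(.*)", text)
    if sub_line:
        # Each member is printed as "<nh id>(<label>)", e.g. "29(4)".
        subs = [(int(i), int(lbl))
                for i, lbl in re.findall(r"(\d+)\((-?\d+)\)", sub_line.group(1))]
    return nh_id, nh_type, subs


# Lines copied from the `nh --get 33` output in this report.
SAMPLE = """\
Id:33 Type:Composite Fmly: AF_INET Rid:0 Ref_cnt:2 Vrf:2
Valid Hash Key Parameters: Proto,SrcIP,
Sub NH(label): 29(4) 18(4)
"""

nh_id, nh_type, subs = parse_nh(SAMPLE)
if nh_type == "Composite":
    print(f"nh {nh_id} is composite with members {subs}")
```

For a plain Tunnel nexthop such as `nh --get 29`, the same function returns an empty member list.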
root@5b11s15:~# dropstats | grep -v " 0$"
IF Drop 3
Flow Action Drop 11184705
Flow Queue Limit Exceeded 36
Discards 13
Cloned Original 50
Invalid NH 1409
Invalid Mcast Source 2
Invalid Source 187
No L2 Route 3
root@5b11s15:~# dropstats | grep -v " 0$"
IF Drop 3
Flow Action Drop 11184705
Flow Queue Limit Exceeded 36
Discards 13
Cloned Original 50
Invalid NH 1421
Invalid Mcast Source 2
Invalid Source 192
No L2 Route 3
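The two `dropstats` snapshots above differ only in a couple of counters, which is easy to miss by eye. The following sketch computes the per-counter delta between two snapshots. The helper names `parse_dropstats` and `delta` are hypothetical, and the parser assumes each line is a counter name followed by an integer, as in the filtered output shown here.

```python
def parse_dropstats(text):
    """Map counter name -> value from `dropstats` output lines."""
    counters = {}
    for line in text.splitlines():
        # The value is the last whitespace-separated field on the line.
        name, _, value = line.strip().rpartition(" ")
        if name and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def delta(before, after):
    """Return only the counters that changed, with their increase."""
    return {name: value - before.get(name, 0)
            for name, value in after.items()
            if value != before.get(name, 0)}


# Values copied from the two snapshots in this report.
FIRST = """\
IF Drop 3
Flow Action Drop 11184705
Flow Queue Limit Exceeded 36
Discards 13
Cloned Original 50
Invalid NH 1409
Invalid Mcast Source 2
Invalid Source 187
No L2 Route 3
"""
SECOND = """\
IF Drop 3
Flow Action Drop 11184705
Flow Queue Limit Exceeded 36
Discards 13
Cloned Original 50
Invalid NH 1421
Invalid Mcast Source 2
Invalid Source 192
No L2 Route 3
"""

diff = delta(parse_dropstats(FIRST), parse_dropstats(SECOND))
print(diff)
```

Between the two snapshots, Invalid NH increased by 12 and Invalid Source by 5; all other counters are unchanged.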
Changed in juniperopenstack:
assignee: Manish Singh (manishs) → Divakar Dharanalakota (ddivakar)
Gcore taken in the problematic state has been copied to /auto/cores/1724681