[1.10-30] Traffic Drop seen in a transparent service-chain case when one of the Service VMs is deleted
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Fix Committed | Critical | Naveen N |
R1.1 | Fix Committed | Critical | Naveen N |
Bug Description
1]. Setup:
========
nodea26 - cfgm
10.204.216.140 & 10.204.216.141 - ctrl
nodeg16 & nodeg26 - compute
2]. Created a service-chain between left-vn(
3]. Started a ping from left-vm(10.10.10.5) to right-vm(
4]. Deleted Service VM 'trans-si-1_3' in Service Instance 'trans-si-1'.
5]. Ping between left-vm and right-vm fails.
The following flow entries are seen on nodeg26, which hosts the left-vm:
root@nodeg26:~# flow -l
Flow table
Index Source:Port Destination:Port Proto(V)
-------
53776<=>286204 10.10.10.5:1522 20.20.20.2:0 1 (6)
(K(nh):101, Action:F, S(nh):35, Statistics:
196276<=>253160 20.20.20.2:1522 10.10.10.5:0 1 (5)
(K(nh):74, Action:F, S(nh):44, Statistics:
253160<=>196276 10.10.10.5:1522 20.20.20.2:0 1 (1->5)
(K(nh):10, Action:F, E:1, S(nh):10, Statistics:
286204<=>53776 20.20.20.2:1522 10.10.10.5:0 1 (6)
(K(nh):101, Action:F, S(nh):35, Statistics:0/0)
6]. Dropstats show an increase in Invalid Source :
Checksum errors 0
No Fmd 0
Ivalid VNID 0
Fragment errors 0
Invalid Source 3500
root@nodeg26:~# dropstats
.
.
Checksum errors 0
No Fmd 0
Ivalid VNID 0
Fragment errors 0
Invalid Source 3502
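The counter comparison above can be automated. The following is a minimal sketch, assuming the dropstats output is a list of `name value` lines as pasted in this report; the helper names `parse_dropstats` and `counter_deltas` are hypothetical and not part of any vrouter tooling.

```python
import re

# Two dropstats snapshots, copied from the output in this report.
BEFORE = """\
Checksum errors 0
No Fmd 0
Ivalid VNID 0
Fragment errors 0
Invalid Source 3500
"""

AFTER = """\
Checksum errors 0
No Fmd 0
Ivalid VNID 0
Fragment errors 0
Invalid Source 3502
"""


def parse_dropstats(text):
    """Parse 'counter name <number>' lines into a dict (hypothetical helper)."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"^\s*(.+?)\s+(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def counter_deltas(before, after):
    """Return only the counters whose value changed between snapshots."""
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value != before.get(name, 0)
    }


deltas = counter_deltas(parse_dropstats(BEFORE), parse_dropstats(AFTER))
print(deltas)
```

Only counters that moved between the two snapshots are reported, which makes a slowly incrementing drop reason such as Invalid Source easier to spot while the ping is running.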
Getting the route to right-vm(
root@nodeg26:~# rt --dump 5 | grep 20.20.20.2
20.20.20.2/32 32 - 44
Getting the nh for 44 shows that it is a Composite (ECMP) next hop:
root@nodeg26:~# nh --get 44
Id:044 Type:Composite Fmly: AF_INET Flags:Valid, Policy, Ecmp, Rid:0 Ref_cnt:3
Sub NH(label): 35(56) 74(44)
Since the flow shows an ECMP index of 1 (E:1), the member in use is the second sub-nh of the composite, so the nh for 74 needs to be checked:
root@nodeg26:~# nh --get 74
Id:074 Type:Encap Fmly: AF_INET Flags:Valid, Rid:0 Ref_cnt:4
The Oif points to the left-interface of trans-si-1_2
root@nodeg26:~# vif --get 13
vif0/13 OS: tap04d31895-82
Vrf:2 Flags:SL3L2D MTU:9160 Ref:7
RX packets:1807 bytes:184314 errors:0
TX packets:3884 bytes:395808 errors:0
VRF table(vlan:vrf):
1:5,
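The lookup walked through above (composite nh 44, ECMP index 1, member nh 74) can be sketched as follows. This assumes the ECMP index is a zero-based position in the `Sub NH(label)` list, which is consistent with this report (index 1 maps to 74) but is not confirmed here against the vrouter source. The function names are hypothetical.

```python
import re


def parse_sub_nexthops(line):
    """Return [(nh_id, label), ...] from a 'Sub NH(label): ...' line."""
    # Drop the 'Sub NH(label)' prefix, then collect every 'id(label)' pair.
    _, _, members = line.partition(":")
    return [(int(nh), int(label))
            for nh, label in re.findall(r"(\d+)\((\d+)\)", members)]


def select_ecmp_member(members, ecmp_index):
    """Pick the composite member for a flow's ECMP index (assumed zero-based)."""
    return members[ecmp_index]


# Line copied from the 'nh --get 44' output in this report.
sub_nh_line = "Sub NH(label): 35(56) 74(44)"
members = parse_sub_nexthops(sub_nh_line)

# The flow entry on nodeg26 shows E:1.
nh_id, label = select_ecmp_member(members, 1)
print(nh_id, label)
```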
Packets are seen exiting the right-interface of trans-si-1_2 as well :
root@nodeg26:~# tcpdump -eni tapa3d7a4b4-e5
tcpdump: WARNING: tapa3d7a4b4-e5: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tapa3d7a4b4-e5, link-type EN10MB (Ethernet), capture size 65535 bytes
15:17:30.014330 02:00:00:00:00:01 > 02:00:00:00:00:02, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 10.10.10.5 > 20.20.20.2: ICMP echo request, id 1522, seq 2299, length 64
15:17:31.022456 02:00:00:00:00:01 > 02:00:00:00:00:02, ethertype 802.1Q (0x8100), length 102: vlan 1, p 0, ethertype IPv4, 10.10.10.5 > 20.20.20.2: ICMP echo request, id 1522, seq 2300, length 64
nh 35, which appears as S(nh) in the flow entries, turns out to be a Tunnel next hop, which is incorrect:
root@nodeg26:~# nh --get 35
Id:035 Type:Tunnel Fmly: AF_INET Flags:Valid, MPLSoUDP, Rid:0 Ref_cnt:22
Oif:0 Len:14 Flags Valid, MPLSoUDP, Data:00 25 90 c4 76 bd 00 25 90 c5 59 45 08 00
Vrf:0 Sip:22.22.22.26 Dip:22.22.22.16
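To find which flow entries reference the tunnel nh 35, the `flow -l` listing from this report can be parsed. This is a minimal sketch that assumes the two-line-per-flow layout shown above; the field handling is inferred from that output only, and `parse_flows` is a hypothetical helper.

```python
import re

# Flow entries copied from the 'flow -l' output in this report
# (the Statistics fields are truncated in the original).
FLOW_OUTPUT = """\
53776<=>286204 10.10.10.5:1522 20.20.20.2:0 1 (6)
(K(nh):101, Action:F, S(nh):35, Statistics:
196276<=>253160 20.20.20.2:1522 10.10.10.5:0 1 (5)
(K(nh):74, Action:F, S(nh):44, Statistics:
253160<=>196276 10.10.10.5:1522 20.20.20.2:0 1 (1->5)
(K(nh):10, Action:F, E:1, S(nh):10, Statistics:
286204<=>53776 20.20.20.2:1522 10.10.10.5:0 1 (6)
(K(nh):101, Action:F, S(nh):35, Statistics:0/0)
"""

HEADER = re.compile(r"^(\d+)<=>(\d+)\s+(\S+)\s+(\S+)")
FIELD = re.compile(r"(K\(nh\)|S\(nh\)|E):(\d+)")


def parse_flows(text):
    """Parse header/detail line pairs into a list of flow dicts."""
    flows = []
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            flows.append({
                "index": int(header.group(1)),
                "reverse": int(header.group(2)),
                "src": header.group(3),
                "dst": header.group(4),
            })
        elif line.startswith("(") and flows:
            # Detail line: attach K(nh), S(nh) and E to the last flow seen.
            for key, value in FIELD.findall(line):
                flows[-1][key] = int(value)
    return flows


flows = parse_flows(FLOW_OUTPUT)
tunnel_src = [f["index"] for f in flows if f.get("S(nh)") == 35]
with_ecmp = [f["index"] for f in flows if "E" in f]
print("flows with S(nh) 35:", tunnel_src)
print("flows with an ECMP index set:", with_ecmp)
```

On the output above, only one of the four entries carries an ECMP index, which matches the later observation that the ECMP index is not set in the reverse flow.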
Naveen and Praveen are aware of the issue.
I have kept the gcore at:
summary: |
- [1.10-30] Ping fails in a transparent service-chain case when one of the Service VMs is deleted
+ [1.10-30] Traffic Drop seen in a transparent service-chain case when one of the Service VMs is deleted
information type: | Proprietary → Public |
tags: | added: releasenote |
Changed in juniperopenstack: | |
status: | New → Fix Committed |
Observation
==========
1]. Instead of deleting the SVMs, I shut down (suspended) them. This changed the setup from ECMP to non-ECMP.
2]. Powered the SVMs back on, and ECMP kicked in again.
3]. Saw traffic loss because the ECMP Index is not set in the reverse flow.
Praveen and Naveen are aware of the issue.