[OVN][HWOL] traffic problems when sriov and non-sriov ports are bound on the same hypervisor
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | New | Medium | Rodolfo Alonso |
Bug Description
Environment:
OpenStack Yoga
Mellanox ConnectX-6 cards
Open vSwitch 2.17
OVN 22.09
ML2/OVN driver
I have two instances on one hypervisor, both in the same VLAN-type network (tagged):
VM1 uses a Mellanox ASAP2 SR-IOV port with the "switchdev" binding profile (10.1.112.89)
VM2 uses a normal (non-offloaded) port (10.1.112.15)
In that VLAN we have an external router (10.1.112.254)
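For reference, a hardware-offload port like VM1's is typically created along these lines (the network and port names below are illustrative, not the exact ones used here):

$ openstack port create --network vlan112 --vnic-type direct \
    --binding-profile '{"capabilities": ["switchdev"]}' vm1-port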
When both VMs are up and I ping the external router, I get replies only for the first one or two packets, and then nothing (the same happens with TCP traffic).
What is interesting: if I send ICMP packets from VM1 to the gateway:
1. I can see the ICMP echo request and reply packets on the external OVS port (bond0):
# tcpdump -nei bond0 vlan 112 and icmp
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 262144 bytes
07:23:14.722184 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype 802.1Q (0x8100), length 102: vlan 112, p 0, ethertype IPv4, 10.1.112.89 > 10.1.112.254: ICMP echo request, id 2, seq 1, length 64
07:23:14.722395 1c:34:da:b0:97:68 > fa:16:3e:34:dd:93, ethertype 802.1Q (0x8100), length 102: vlan 112, p 0, ethertype IPv4, 10.1.112.254 > 10.1.112.89: ICMP echo reply, id 2, seq 1, length 64
07:23:15.723068 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype 802.1Q (0x8100), length 102: vlan 112, p 0, ethertype IPv4, 10.1.112.89 > 10.1.112.254: ICMP echo request, id 2, seq 2, length 64
(and then it stops)
2. I can see the ICMP echo requests on the VM2 port (but no replies):
# tcpdump -nei tap53e35d44-27 icmp
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap53e35d44-27, link-type EN10MB (Ethernet), capture size 262144 bytes
07:18:10.991163 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 1, length 64
07:18:11.992577 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 2, length 64
07:18:12.993063 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 3, length 64
07:18:14.018573 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 4, length 64
07:18:15.043013 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 5, length 64
07:18:16.066584 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 6, length 64
07:18:17.090599 fa:16:3e:34:dd:93 > 00:00:5e:00:01:12, ethertype IPv4 (0x0800), length 98: 10.1.112.89 > 10.1.112.254: ICMP echo request, id 1, seq 7, length 64
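(One check that may help with triage, assuming the default system datapath: dumping the datapath flows on the hypervisor shows which flows were actually offloaded to hardware and which stayed in software.)

# ovs-appctl dpctl/dump-flows type=offloaded
# ovs-appctl dpctl/dump-flows type=ovs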
Is this a Neutron bug, or rather an OVN/Open vSwitch bug?
tags: added: ovn sriov-pci-pt
summary:
- [OVN][SRIOV] traffic problems when sriov and non-sriov ports are bound on the same hypervisor
+ [OVN][HWOL] traffic problems when sriov and non-sriov ports are bound on the same hypervisor
Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
importance: Undecided → Medium
Hello Michal:
If I'm not wrong, what you are using is ML2/OVN with HW offload, right? I'm asking in order to make this distinction clear. For the sake of clarity, it is better to remove the SRIOV tag from the title and add HWOL (just to avoid confusing this with ML2/SRIOV, which can also be used with ML2/OVN).
I have some questions:
* Are you using FIPs?
* Did you try pinging another IP on the external network?
* In your deployment, do you have [1]?
* Related to the last point, what is the router configuration (attached networks, type, etc.)?
* Do you have HA? How many controllers do you have? I'm assuming the GW is on one of these controllers.
* Did you do a full trace of the ICMP packets? I mean, tracking the packet from the VM, through the compute node interface, the switch, the controller HW interface, the controller GW port, etc. (see the trace sketch below).
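For the trace, ovs-appctl ofproto/trace can be run on each node. A minimal sketch, assuming an integration bridge named br-int and reusing the MACs/IPs from your capture (the in_port value is a placeholder for the VM1 representor port):

# ovs-appctl ofproto/trace br-int \
    in_port=<vm1-representor>,dl_src=fa:16:3e:34:dd:93,dl_dst=00:00:5e:00:01:12,icmp,nw_src=10.1.112.89,nw_dst=10.1.112.254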
Regards.
[1] https://review.opendev.org/q/I25e5ee2cf8daee52221a640faa7ac09679742707