VM with tap flow lost connection if tap flow port is on provider network

Bug #2034445 reported by Felipe Figueroa Vergara
This bug affects 4 people
Affects: tap-as-a-service
Status: In Progress
Importance: Undecided
Assigned to: Lajos Katona
Milestone: (none)

Bug Description

I have two VMs set up on different compute hosts, each with a port connected to a provider network. When attempting to configure port mirroring for a VM port on the provider network, the VM loses its connection, and the tap service is unable to mirror traffic. Upon analyzing the flow entries created for the tap flow and service, I identified the problematic flow that disrupts the VM's connection:

table=0, n_packets=0, n_bytes=0, priority=20,dl_dst=fa:16:3e:7f:7d:d1 actions=NORMAL,mod_vlan_vid:3902,output:"patch-int-tap"

I observed that manually deleting this flow from br-int resolves the issue and restores connectivity to the VM. However, the incoming mirrored traffic is then lost: on the tap service, as expected, I see only the outgoing traffic from the mirrored VM. This is because the problematic flow is what mirrors incoming traffic, by setting the VLAN tag on all traffic addressed to the mirrored VM and sending it to the tap bridge.

If I recreate the problematic flow but change the actions to actions=output:6,mod_vlan_vid:3900,output:patch-int-tap, where port 6 is the port attached to the mirrored VM on br-int, the VM recovers connectivity and the mirroring works as expected.

It appears that using the "NORMAL" action is causing conflicts when the mirrored port is connected to a provider network.
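
The manual workaround described above amounts to two ovs-ofctl invocations. The following is a minimal sketch in Python that only builds the argument lists and does not run them. The helper name `workaround_commands` is hypothetical, and the MAC address, OpenFlow port 6, and VLAN ID 3900 are simply the values quoted in this report; they will differ per deployment.

```python
from typing import List, Tuple


def workaround_commands(
    mac: str,
    vm_ofport: int,
    vlan_id: int,
    bridge: str = "br-int",
    patch_port: str = "patch-int-tap",
) -> Tuple[List[str], List[str]]:
    """Build (delete, add) ovs-ofctl argument lists for the manual workaround.

    Hypothetical helper: it only formats the commands, it does not execute them.
    """
    # Non-strict del-flows removes every table 0 flow that matches this dl_dst,
    # so check the existing flows first if more than one such flow may exist.
    delete_cmd = ["ovs-ofctl", "del-flows", bridge, f"table=0,dl_dst={mac}"]

    # Replace NORMAL with an explicit output to the VM's port, then tag the
    # copy and send it to the tap bridge, as described in the report.
    actions = f"output:{vm_ofport},mod_vlan_vid:{vlan_id},output:{patch_port}"
    flow = f"table=0,priority=20,dl_dst={mac},actions={actions}"
    add_cmd = ["ovs-ofctl", "add-flow", bridge, flow]
    return delete_cmd, add_cmd


delete_cmd, add_cmd = workaround_commands("fa:16:3e:7f:7d:d1", 6, 3900)
print(" ".join(delete_cmd))
print(" ".join(add_cmd))
```

Each list could be passed to `subprocess.run` on the compute host, but that step is left out here on purpose, since it changes live flows.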

Tags: ovs
Lajos Katona (lajos-katona) wrote :

Thanks for reporting, I will check this issue.

tags: added: ovs
Changed in tap-as-a-service:
assignee: nobody → Lajos Katona (lajos-katona)
Lajos Katona (lajos-katona) wrote :

I was able to reproduce this issue.
By provider net, do you mean:

openstack network create my_provider_net --provider-network-type flat --provider-physical-network physnet1 ?

Removing the NORMAL action from the flow also seems to work; thanks for proposing it.

Changed in tap-as-a-service:
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tap-as-a-service (master)
Changed in tap-as-a-service:
status: Confirmed → In Progress
Lajos Katona (lajos-katona) wrote :

As I wrote in my comment on the patch, your idea of changing the NORMAL action to output:tap-xy works with provider networks. On the other hand, it breaks the other use case, where we have non-provider networks (such as VXLAN or similar), so I have to dig a little more. If you have an idea for that as well, don't hesitate :-)

Felipe Figueroa Vergara (felipeafv) wrote :

Thanks for taking the time to address this issue.
I have now identified the problem related to non-provider networks. I will carefully examine the code to devise a method for performing a check on the network type before creating the flow, to determine whether the port belongs to a provider or non-provider network.

Felipe Figueroa Vergara (felipeafv) wrote :

Investigating a solution to the problem, I believe I have reached something concrete. In the `create_tap_flow` function ( https://github.com/openstack/tap-as-a-service/blob/master/neutron_taas/services/taas/drivers/linux/ovs_taas.py#L324 ), the `port` object is used, which contains the `network_id` attribute. As we need to determine whether we are in a provider network or not in order to create one flow or another, we can use this `network_id` to query the Neutron DB and perform the validation.
In the Neutron DB, I see the networksegments(id, network_id, network_type, physical_network, ...) table, and by using the `physical_network` field, we can determine whether the network is of provider or non-provider type.
The `network_type` field will also be useful because I have noticed that when the provider network is of type VLAN, a `strip_vlan` action is required as the first step in the flow for it to work correctly.
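
The selection logic proposed here can be sketched as a small function. This is a hedged illustration rather than the code of the actual patch: the function name `tap_flow_actions` and its parameters are hypothetical, it assumes that a non-empty `physical_network` marks a provider network, and it assumes that `strip_vlan` goes first only for VLAN provider networks, as described above. Looking up the segment in the Neutron DB is left out.

```python
from typing import List, Optional


def tap_flow_actions(
    physical_network: Optional[str],
    network_type: str,
    vm_ofport: int,
    taas_vlan_id: int,
    patch_port: str = "patch-int-tap",
) -> str:
    """Return the OpenFlow actions string for the ingress tap flow.

    Hypothetical helper; the segment fields are assumed to come from the
    networksegments row that belongs to the port's network_id.
    """
    actions: List[str] = []
    if physical_network:
        # Provider network: deliver to the VM port explicitly instead of NORMAL.
        if network_type == "vlan":
            # Per the report, VLAN provider networks need strip_vlan first.
            actions.append("strip_vlan")
        actions.append(f"output:{vm_ofport}")
    else:
        # Non-provider network (for example VXLAN): keep the NORMAL action.
        actions.append("NORMAL")
    # In both cases, tag the mirrored copy and send it to the tap bridge.
    actions.append(f"mod_vlan_vid:{taas_vlan_id}")
    actions.append(f"output:{patch_port}")
    return ",".join(actions)


print(tap_flow_actions("physnet1", "flat", 6, 3900))
print(tap_flow_actions("physnet1", "vlan", 6, 3900))
print(tap_flow_actions(None, "vxlan", 6, 3900))
```

Keeping the decision in one pure function makes it possible to check all three cases (flat, VLAN, and non-provider) without a running OVS bridge.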

If I'm not leaving anything out, with these changes, we can enable TAAS for both provider and non-provider networks. I hope you can review and validate what I have mentioned.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tap-as-a-service (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tap-as-a-service/+/896515

Felipe Figueroa Vergara (felipeafv) wrote :

Following the last comment, we decided to propose a patch with the changes needed to enable mirroring on external networks without breaking the internal network setup. You can check the proposed changes here: https://review.opendev.org/c/openstack/tap-as-a-service/+/896515.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tap-as-a-service (master)

Change abandoned by "Lajos Katona <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tap-as-a-service/+/894539
Reason: https://review.opendev.org/c/openstack/tap-as-a-service/+/896515 is a better approach

Felipe Figueroa Vergara (felipeafv) wrote :

The changes in https://review.opendev.org/c/openstack/tap-as-a-service/+/896515 were tested successfully with TaaS mirroring on external networks of type VLAN and flat, and also on internal networks of type VXLAN.

Lajos Katona (lajos-katona) wrote :

Thanks, I will also test and review it at the beginning of next week.
