Restarting OVS with DVR creates a network loop
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
neutron | In Progress | Undecided | Jakub Libosvar | |
Bug Description
Restarting the OVS agent with DVR enabled creates a network loop between the external network and a tunneling network for a very short period of time. This causes serious problems when two agents are restarted at the same time.
Steps to reproduce:
1) Have ml2/ovs with DVR enabled
2) Have a VM with a FIP on compute node A
3) Have a gw port for snat traffic on network node B
4) ping the FIP with the -i 0.1 option to send an ICMP echo request every 0.1 seconds
5) restart OVS agents on both compute node A and network node B at the same time
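Because step 4 sends one request every 0.1 seconds, gaps in the icmp_seq numbers of the ping output give a rough measure of any interruption triggered by step 5. The following is a minimal sketch, assuming iputils-style ping output that contains `icmp_seq=N`; the function name `longest_gap` and the sample lines (including the address) are made up for illustration and are not taken from this report.

```python
import re

# Matches the sequence number in iputils-style ping reply lines (assumed format).
SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def longest_gap(lines, interval=0.1):
    """Return (missing_replies, approx_seconds) for the longest run of lost replies.

    `interval` is the ping send interval in seconds (0.1 in step 4).
    Sequence-number wraparound is not handled in this sketch.
    """
    seqs = [int(m.group(1)) for line in lines if (m := SEQ_RE.search(line))]
    longest = 0
    for prev, cur in zip(seqs, seqs[1:]):
        # Consecutive replies differ by 1; anything more is lost replies.
        longest = max(longest, cur - prev - 1)
    return longest, longest * interval


# Hypothetical sample output; 203.0.113.10 is a documentation address.
sample = [
    "64 bytes from 203.0.113.10: icmp_seq=1 ttl=63 time=0.61 ms",
    "64 bytes from 203.0.113.10: icmp_seq=2 ttl=63 time=0.58 ms",
    "64 bytes from 203.0.113.10: icmp_seq=2403 ttl=63 time=0.60 ms",
]
missing, seconds = longest_gap(sample)
print(f"{missing} replies lost, roughly {seconds:.0f} s")
```

The estimate is only as precise as the send interval, and it counts lost replies without saying where on the path they were dropped.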
Now the replies to the FIP traffic get dropped on compute node A for about 3-5 minutes, because the loop causes the local OVS on compute node A to learn that the GW port MAC is on the tunneling interface. All reply traffic uses that MAC in its destination field, so the NORMAL OVS action no longer floods such traffic; instead, as per its FDB entry, it forwards the traffic to the patch port between br-int and br-tun, where it is dropped until the FDB entry expires.
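The FDB behavior described above can be illustrated with a toy MAC-learning switch. This is a simplified model under stated assumptions, not how OVS is implemented: the class, port names, and MAC addresses are hypothetical, and the 300-second aging time is an assumption (it is believed to match the OVS `mac-aging-time` default, which would be consistent with the observed 3-5 minute outage).

```python
from dataclasses import dataclass, field

AGING_TIME = 300  # seconds; assumed aging time for FDB entries

GW_MAC = "fa:16:3e:00:00:01"  # hypothetical MAC of the SNAT gateway port
VM_MAC = "fa:16:3e:00:00:02"  # hypothetical MAC of the VM behind the FIP


@dataclass
class LearningSwitch:
    """Toy model of MAC learning on br-int (not the real OVS logic)."""
    aging_time: float = AGING_TIME
    fdb: dict = field(default_factory=dict)  # mac -> (port, last_seen)

    def receive(self, src_mac, dst_mac, in_port, now):
        """Learn src_mac on in_port, then return where dst_mac is sent."""
        self.fdb[src_mac] = (in_port, now)
        entry = self.fdb.get(dst_mac)
        if entry is None or now - entry[1] > self.aging_time:
            # Unknown or expired destination: flood to all ports.
            self.fdb.pop(dst_mac, None)
            return "flood"
        return entry[0]


br_int = LearningSwitch()

# Before the restart the gateway MAC is unknown, so replies are flooded.
before = br_int.receive(VM_MAC, GW_MAC, "vm-port", now=-1)

# During the brief loop, a frame sourced from the gateway MAC arrives on the
# patch port towards br-tun, and the switch learns the MAC there.
br_int.receive(GW_MAC, "ff:ff:ff:ff:ff:ff", "patch-tun", now=0)

# Replies are now sent only to the patch port (where they are dropped)...
during = br_int.receive(VM_MAC, GW_MAC, "vm-port", now=1)

# ...until the stale entry ages out and flooding resumes.
after = br_int.receive(VM_MAC, GW_MAC, "vm-port", now=301)

print(before, during, after)
```

In this model a single looped frame is enough to redirect all replies for the full aging time, because nothing legitimately sourced from the gateway MAC arrives on another port to relearn the entry.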
Changed in neutron:
status: New → In Progress
Related / closing patch: https://review.opendev.org/c/openstack/neutron/+/889752