Openvswitch Agent - Connexion openvswitch DB Broken
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | Medium | Slawek Kaplonski |
neutron (Ubuntu) | Incomplete | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Unassigned |
Bug Description
(For SRU template, please see bug 1869808, as the SRU info there applies to this bug also)
Hi all,
We have deployed several OpenStack platforms at my company.
We used Kolla Ansible to deploy our platforms.
Here is the configuration that we applied:
kolla_base_distro: "centos"
kolla_install_type: "binary"
openstack_version: "stein"
Neutron architecture:
L3 HA enabled
DVR enabled
SNAT enabled
multiple vlan provider: True
Note: Our platforms are multi-region.
Recently, we upgraded a master region from Rocky to Stein using the Kolla Ansible upgrade procedure.
Since the upgrade, the openvswitch agent sometimes loses its connection to ovsdb.
We found this error in neutron-
And we found these errors in ovsdb-server.log:
2020-02-
2020-02-
2020-02-
2020-02-
When we experience this issue, traffic matching the "NORMAL" flows inside br-ex no longer gets out.
Example of stuck flows:
(neutron-
cookie=
cookie=
cookie=
Workaround to solve this issue (a hedged command sketch follows below):
- stop containers: openvswitch_db openvswitch_
- start containers: openvswitch_db openvswitch_
- start neutron_l3_agent neutron_
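For illustration, a hedged sketch of that workaround as shell commands on a Kolla host; the exact container names (and the exact order of operations) are assumptions based on a typical Kolla CentOS deployment, not taken verbatim from this report:
    # stop the Neutron agents first so they do not fight the restart (names assumed)
    docker stop neutron_openvswitch_agent neutron_l3_agent
    # restart the Open vSwitch containers
    docker stop openvswitch_vswitchd openvswitch_db
    docker start openvswitch_db openvswitch_vswitchd
    # bring the Neutron agents back so flows are rebuilt with a fresh cookie
    docker start neutron_openvswitch_agent neutron_l3_agent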
Note: we kept the OVS connection timeout options at their defaults (see the sketch after this list):
- of_connect_timeout: 300
- of_request_timeout: 300
- of_inactivity_
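For reference, a minimal sketch of where these options usually live, assuming a stock openvswitch_agent.ini layout (section name and values shown here are illustrative defaults, not this deployment's actual file):
    [ovs]
    # OpenFlow connection/request timeouts used by the native OpenFlow driver (seconds)
    of_connect_timeout = 300
    of_request_timeout = 300
    # inactivity probe for the agent's OpenFlow connection (seconds)
    of_inactivity_probe = 10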
Thank you in advance for your help.
summary: | Openvswitch Agent - connexion openvswitch DB Broken → Openvswitch Agent - Connexion openvswitch DB Broken |
James Denton (james-denton) wrote : | #1 |
Hi,
When this issue occurs, what does the entire flow table for br-ex look like? I am most curious about the first flow of table 0.
Thanks,
James
Acoss69 (acoss69) wrote : | #2 |
Hi,
I have attached a screenshot which displays the entire flow table for br-ex during the issue (for security reasons, I have hidden the VLAN IDs).
Thank you for your help.
Acoss69 (acoss69) wrote : | #3 |
- 2020-02-26 15_37_07-Safeguard Desktop Player.png Edit (242.3 KiB, image/png)
Hi,
Sorry, I forgot to attach the screenshot.
Thanks.
James Denton (james-denton) wrote : | #4 |
It is that first flow, the 'drop' flow, that is responsible for the issue. I am having a similar issue in a Stein environment and am trying to reproduce in a lab. That drop flow is implemented shortly after the disconnect observed in the logs.
Do you see a log message like this in the neutron-
> "Physical bridge br-ex was just re-created"
Shortly after that message, the drop flow is implemented and the cookies are changed for a subset of the flows. A restart of the agent only temporarily addresses the issue.
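For anyone checking whether they are hitting the same condition, a hedged sketch (the log path and service name are assumptions and vary per distro/deployment):
    # look for the re-creation message near the ovsdb disconnect
    grep "was just re-created" /var/log/neutron/neutron-openvswitch-agent.log
    # restarting the agent rebuilds the flows, but as noted above this is only temporary
    sudo systemctl restart neutron-openvswitch-agent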
I am not positive, but I think this patch may have something to do with it:
To address this temporarily, I made this change to the ovs agent code:
--- ovs_neutron_
+++ ovs_neutron_
@@ -2223,7 +2223,7 @@
# Check if any physical bridge wasn't recreated recently
- added_bridges)
+ idl_monitor.
sync |= bridges_recreated
# Notify the plugin of tunnel IP
if self.enable_
So far, it's working as expected. Still trying to find the root cause.
Acoss69 (acoss69) wrote : | #5 |
In fact, we also found "Physical bridge br-ex was just re-created" in the openvswitch agent log (just after the broken pipe error with the openvswitch db).
Regarding the patch indicated in your message, it is already applied in our configuration.
We suppose that when the "broken pipe" message appears, a table 0 flow with a drop action is created and br-ex is re-created.
But we don't understand why the table 0 flow which drops packets stays active until we restart openvswitch_
Thanks in advance.
Acoss69 (acoss69) wrote : | #6 |
Just to know, can you tell me if you have reproduced this issue in your lab with the Kolla "centos7/binary" release?
Note: on our platform, we currently use the openvswitch 2.11.0-4 release.
Thanks in advance.
YAMAMOTO Takashi (yamamoto) wrote : | #7 |
confirmed by James Denton
tags: | added: ovs |
Changed in neutron: | |
importance: | Undecided → Medium |
status: | New → Confirmed |
James Denton (james-denton) wrote : | #8 |
I experienced this issue in OpenStack-Ansible Stein (Originally 19.0.2 and upgraded to 19.0.10) release on Ubuntu 18.04 LTS. So far I have been unable to reproduce in a virtual-based lab.
Can you humor me and share the make/model of your NIC connected to br-ex?
James Denton (james-denton) wrote : | #9 |
Also meant to mention OVS 2.11.0. Thanks.
Acoss69 (acoss69) wrote : | #10 |
Our compute node is an HP BL Gen10 blade server.
The NIC connected to br-ex is: FLB Adapter 1: HP FlexFabric 20Gb 2-port 630FLB Adapter.
We noticed that we are not experiencing this issue on the network node, which is a bare-metal node (NIC model: HPE Ethernet 10Gb 2-port 562FLR-SFP+ Adpt).
Thanks in advance.
James Denton (james-denton) wrote : | #11 |
So, I have not seen this issue in production since implementing that small patch in https:/
However, I can sorta simulate what happens if/when the connection to :6640 is lost, which we did experience in production and Acoss69 referenced in the opening comment. This may help with developing a patch to the OVS agent that could help recover from this condition.
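As a rough way to reproduce the disconnect itself (an illustrative approach, not necessarily how it happened in production), the manager connection on tcp/6640 can be blocked for a while:
    # temporarily drop traffic to the OVSDB manager port, wait, then remove the rule
    sudo iptables -I INPUT -p tcp --dport 6640 -j DROP
    sleep 60
    sudo iptables -D INPUT -p tcp --dport 6640 -j DROP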
What we see is this: a normal set of flows on the provider bridge (br-ex or br-vlan, in this example):
Every 1.0s: ovs-ofctl dump-flows br-vlan compute1: Tue Mar 3 07:42:42 2020
NXST_FLOW reply (xid=0x4):
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
When we see "tcp:127.
...
2020-03-03 07:33:50.061 3705 INFO neutron.
2020-03-03 07:33:50.065 3705 INFO neutron.
2020-03-03 07:33:50.153 3705 INFO neutron.
2020-03-03 07:33:50.271 3705 INFO neutron.
Acoss69 (acoss69) wrote : | #12 |
Hi,
We suffered from the bug described above again this weekend.
We noticed that the connection breaks when our system is overloaded.
In fact, we had accumulated another problem on our compute node: 35000 mount points.
@James Denton, can you tell us if you have integrated your patch "https:/
Thanks in advance.
Regards,
James Denton (james-denton) wrote : | #13 |
Hi -
I think my patch is/was just a bandaid for whatever is really happening.
We have seen similar issues in other environments lately, and further research led me to this bug:
https:/
Your logs indicated "no response to inactivity probe after 5 seconds", which you bumped to 10. Ours are already at 10, which may not be sufficient under heavy load. We are considering bumping to 30 seconds to see if this addresses the issue.
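For reference, a hedged sketch of how such a probe bump is typically applied on the OVS side (values are in milliseconds and the record UUIDs are placeholders; whether the Manager or Controller record is the relevant one depends on which connection is timing out):
    # list the current records and their inactivity_probe values
    sudo ovs-vsctl list Manager
    sudo ovs-vsctl list Controller
    # raise the probe to 30 seconds (30000 ms) on the relevant record
    sudo ovs-vsctl set Manager <manager-uuid> inactivity_probe=30000
    sudo ovs-vsctl set Controller <controller-uuid> inactivity_probe=30000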
James Denton (james-denton) wrote : | #14 |
Update:
Modifying those inactivity probe timers did not appear to make a difference. In a few different Stein environments we are seeing similar behavior. If the connection to OVS is lost, like so:
2020-04-02 14:50:21.372 3526056 WARNING ovsdbapp.
2020-04-02 14:50:41.987 3526056 ERROR OfctlService [-] unknown dpid 112915483458894
2020-04-02 14:50:42.050 3526056 WARNING ovsdbapp.
Then the agent will implement a drop flow and a few other flows (with new cookies) similar to this:
https:/
Here are some agent log messages around the time of the issue:
Slawek Kaplonski (slaweq) wrote : | #15 |
I looked into this issue today. I wasn't able to reproduce this issue on my local devstack setup.
The problem which I see (and you already mentioned) is that some rules with the old cookie id stay on the br-ex bridge. So maybe a solution/workaround would be to always clean rules with the old cookie when the bridge was "re-created".
Normally this wouldn't be needed, as a newly created bridge will not have any OF rules, but maybe forcing cleanup of such leftovers would help neutron-ovs-agent recover from this kind of problem with the connection to ovsdb.
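As a manual illustration of that idea (the cookie value below is a placeholder, not from this environment), stale flows can be removed with a masked cookie match while the current-cookie flows are left alone:
    # identify the stale cookie, then delete only the flows carrying it
    sudo ovs-ofctl dump-flows br-ex
    sudo ovs-ofctl del-flows br-ex "cookie=0xdeadbeef/-1"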
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master) | #16 |
Fix proposed to branch: master
Review: https:/
Changed in neutron: | |
assignee: | nobody → Slawek Kaplonski (slaweq) |
status: | Confirmed → In Progress |
James Denton (james-denton) wrote : | #17 |
Thanks for looking into this.
There is definitely the issue of stale flows, but more important IMO is the DROP flow implemented in table 0 when the agent sees the bridge was "re-created":
NXST_FLOW reply (xid=0x4):
cookie=
That ends up dropping all outbound traffic from br-int, and only gets replaced on a restart of the agent.
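To spot this condition quickly on a suspect node, something like the following can be used (the bridge name depends on the deployment); a NORMAL action in table 0 is the healthy state, while a bare actions=drop entry with an increasing n_packets counter matches the broken state described here:
    sudo ovs-ofctl dump-flows br-ex table=0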
James Denton (james-denton) wrote : | #18 |
Hi Slawek,
I've tested the patch. Here's a before and after:
BEFORE:
root@aio1:
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
Simulate disconnect with a restart of openvswitch-switch:
root@aio1:
Check flows. Notice the new cookies and the DROP flow in table 0:
root@aio1:
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
Restart the agent:
root@aio1:
Flows are rebuilt with a new cookie and the flow...
Slawek Kaplonski (slaweq) wrote : | #19 |
I'm not sure if this DROP rule is really the culprit of the issue.
It comes from: https:/
I checked on my local devstack env and I also have such a rule:
cookie=
but it doesn't cause any problems.
In your logs from the last comment, I also see that the number of packets matched for that drop rule is 0.
Also, even when I restart the openvswitch-switch process, I can't reproduce this issue with flows on my env. But I'm doing it on the master branch. Tomorrow I will try on stable/stein.
Can you also tell me if you have e.g. DVR enabled in this setup, or anything else which may help me to reproduce it?
James Denton (james-denton) wrote : | #20 |
Yes, this is OVS+DVR w/ openstack-ansible (fairly recent master). Neutron 16.0.0.0b2.dev61.
I simulated failure again by restarting openvswitch-switch. Here is an ongoing ping from the qdhcp namespace for a network mapped to VLAN 6 out br-ex (172.23.208.1 being the gateway, a physical firewall):
root@aio1:~# ip netns exec qdhcp-04a47906-
PING 172.23.208.1 (172.23.208.1) 56(84) bytes of data.
64 bytes from 172.23.208.1: icmp_seq=1 ttl=255 time=1.49 ms
64 bytes from 172.23.208.1: icmp_seq=2 ttl=255 time=0.576 ms
64 bytes from 172.23.208.1: icmp_seq=3 ttl=255 time=0.511 ms
64 bytes from 172.23.208.1: icmp_seq=4 ttl=255 time=0.495 ms
64 bytes from 172.23.208.1: icmp_seq=5 ttl=255 time=0.608 ms
64 bytes from 172.23.208.1: icmp_seq=6 ttl=255 time=0.502 ms
64 bytes from 172.23.208.1: icmp_seq=7 ttl=255 time=0.496 ms
64 bytes from 172.23.208.1: icmp_seq=8 ttl=255 time=0.494 ms
64 bytes from 172.23.208.1: icmp_seq=9 ttl=255 time=0.486 ms
64 bytes from 172.23.208.1: icmp_seq=10 ttl=255 time=0.551 ms
64 bytes from 172.23.208.1: icmp_seq=11 ttl=255 time=0.513 ms
64 bytes from 172.23.208.1: icmp_seq=12 ttl=255 time=0.516 ms
^C
--- 172.23.208.1 ping statistics ---
32 packets transmitted, 12 received, 62% packet loss, time 31708ms
rtt min/avg/max/mdev = 0.486/0.
The packet loss began when I restarted OVS. You can see the drop flow w/ packets matched (dropped outbound ICMP):
Every 1.0s: ovs-ofctl dump-flows br-ex aio1: Thu Apr 9 14:37:05 2020
NXST_FLOW reply (xid=0x4):
cookie=
cookie=
cookie=
cookie=
cookie=
cookie=
Note: This behavior is also seen on computes. I only have an AIO at the moment. On computes, the connection between the agent and ovsdb is lost, which seems to trigger this, rather than a forced restart of openvswitch.
James Denton (james-denton) wrote : | #21 |
@Slawek - I was able to reproduce this behavior in Devstack (Master) configured for DVR. For your convenience, I have provided the output from "stock" devstack (legacy routers) and DVR-enabled devstack (dvr_snat).
NOTE: This seems to impact 'vlan' networks, NOT 'flat' networks, in a DVR scenario.
-=-=-=-=-=-
== STOCK ==
local.conf:
[[local|localrc]]
ADMIN_PASSWORD=
DATABASE_
RABBIT_
SERVICE_
Q_PLUGIN=ml2
Q_ML2_TENANT_
disable_service horizon cinder swift
DevStack Version: ussuri
Change: 01826e1c5b65e8d
OS Version: CentOS 7.7.1908 Core
2020-04-11 22:23:12.009 | stack.sh completed in 1034 seconds.
[jdenton@localhost devstack]$
[jdenton@localhost ~]$ cat /etc/neutron/
#agent_mode = legacy
[jdenton@localhost devstack]$ sudo ovs-ofctl dump-flows br-ex
cookie=
cookie=
cookie=
source openrc admin admin
openstack network create --provider-
openstack subnet create --subnet-range 192.168.77.0/24 --network vlan1000 subnet1000
[jdenton@localhost devstack]$ sudo ovs-ofctl dump-flows br-ex
cookie=
cookie=
cookie=
cookie=
>> Simulate disconnect by restarting OVS:
sudo systemctl restart openvswitch
[jdenton@localhost devstack]$ sudo ovs-ofctl dump-flows br-ex
cookie=
cookie=
cookie=
cookie=
>> Everything looks OK.
-=-=-=-=-=-=-
== DVR ==
local.conf:
[[local|localrc]]
ADMIN_PASSWORD=
DATABASE_
RABBIT_
SERVICE_
Q_PLUGIN=ml2
Q_ML2_TENANT_
Q_DVR_MODE=dvr_snat
disable_service horizon cinder swift
DevStack Version: ussuri
Change: 01826e1c5b65e8d
Acoss69 (acoss69) wrote : | #22 |
Hi,
We confirm that we have experienced this issue only on our platform with DVR enabled and a VLAN external provider type.
We have another platform without DVR enabled and it does not suffer from this issue.
Thank you very much for your help and contribution.
Regards,
Slawek Kaplonski (slaweq) wrote : | #23 |
Hi,
Quick update. I am able to reproduce this issue with dvr enabled. I know more or less what is wrong there but I don't know yet exactly how to fix it. I will continue work on it.
OpenStack Infra (hudson-openstack) wrote : | #24 |
Fix proposed to branch: master
Review: https:/
Slawek Kaplonski (slaweq) wrote : | #25 |
Please try my last patch https:/
Acoss69 (acoss69) wrote : | #26 |
Hi,
We have applied your patch on our sandbox platform (only on the compute node).
We killed the openvswitch db to simulate a dropped connection between the openvswitch agent and the db.
After 5 minutes, we restarted the openvswitch db and all traffic matching the "NORMAL" flows inside br-ex got out normally.
We will continue testing your patch on our platforms tomorrow.
Thanks a lot for everything.
Wei Hui (huiweics) wrote : | #27 |
Any idea why the ovs agent lost its connection to ovsdb-server, and why the physical bridge was re-created? If the physical bridge was deleted and then created, all physical flows would be lost, but flows remained.
James Denton (james-denton) wrote : | #28 |
@Slawek - Happy to report that your patch appears to be working as described. I restarted the openvswitch service and observed the same flows with new cookies. The 'drop' flow was not implemented. This was tested on Ubuntu 18.04 w/ today's Master (victoria).
@Wei - I am not sure why, but we have noticed an increase of disconnects after deploying or upgrading to Stein. This would then result in the issue described here.
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master) | #29 |
Related fix proposed to branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master) | #30 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit 63c45b37666efc7
Author: Slawek Kaplonski <email address hidden>
Date: Thu Apr 9 14:37:38 2020 +0200
Ensure that stale flows are cleaned from phys_bridges
In case when neutron-ovs-agent will notice that any of physical
bridges was "re-created", we should also ensure that stale Open
Flow rules (with old cookie id) are cleaned.
This patch is doing exactly that.
Change-Id: I7c7c8a4c371d6f
Related-Bug: #1864822
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/ussuri) | #31 |
Related fix proposed to branch: stable/ussuri
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/train) | #32 |
Related fix proposed to branch: stable/train
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/stein) | #33 |
Related fix proposed to branch: stable/stein
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master) | #34 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit 91f0bf3c8511bf3
Author: Slawek Kaplonski <email address hidden>
Date: Tue Apr 21 10:30:52 2020 +0200
[DVR] Reconfigure re-created physical bridges for dvr routers
In case when physical bridge is removed and created again it
is initialized by neutron-ovs-agent.
But if agent has enabled distributed routing, dvr related
flows wasn't configured again and that lead to connectivity issues
in case of DVR routers.
This patch fixes it by adding configuration of dvr related flows
if distributed routing is enabled in agent's configuration.
It also adds reset list of phys_brs in dvr_agent. Without that there
were different objects used in ovs agent and dvr_agent classes thus
e.g. 2 various cookie ids were set on flows in physical bridge.
This was also the same issue in case when openvswitch was restarted and
all bridges were reconfigured.
Now in such case there is correctly new cookie_id configured for all
flows.
Change-Id: I710f00f0f542bc
Closes-Bug: #1864822
Changed in neutron: | |
status: | In Progress → Fix Released |
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri) | #35 |
Fix proposed to branch: stable/ussuri
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/train) | #36 |
Fix proposed to branch: stable/train
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/stein) | #37 |
Fix proposed to branch: stable/stein
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/rocky) | #38 |
Fix proposed to branch: stable/rocky
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/queens) | #39 |
Fix proposed to branch: stable/queens
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/ussuri) | #40 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/ussuri
commit ae4b3edce23aa4d
Author: Slawek Kaplonski <email address hidden>
Date: Thu Apr 9 14:37:38 2020 +0200
Ensure that stale flows are cleaned from phys_bridges
In case when neutron-ovs-agent will notice that any of physical
bridges was "re-created", we should also ensure that stale Open
Flow rules (with old cookie id) are cleaned.
This patch is doing exactly that.
Change-Id: I7c7c8a4c371d6f
Related-Bug: #1864822
(cherry picked from commit 63c45b37666efc7
tags: | added: in-stable-ussuri |
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/rocky) | #41 |
Related fix proposed to branch: stable/rocky
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/queens) | #42 |
Related fix proposed to branch: stable/queens
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master) | #43 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit 45482e300aab781
Author: Slawek Kaplonski <email address hidden>
Date: Mon May 11 12:09:46 2020 +0200
Don't check if any bridges were recrected when OVS was restarted
In case when openvswitch was restarted, full sync of all bridges will
be always triggered by neutron-ovs-agent so there is no need to check
in same rpc_loop iteration if bridges were recreated.
Change-Id: I3cc1f1b7dc480d
Related-bug: #1864822
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/ussuri) | #44 |
Related fix proposed to branch: stable/ussuri
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri) | #45 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/ussuri
commit b03f9d4c433dbe3
Author: Slawek Kaplonski <email address hidden>
Date: Tue Apr 21 10:30:52 2020 +0200
[DVR] Reconfigure re-created physical bridges for dvr routers
In case when physical bridge is removed and created again it
is initialized by neutron-ovs-agent.
But if agent has enabled distributed routing, dvr related
flows wasn't configured again and that lead to connectivity issues
in case of DVR routers.
This patch fixes it by adding configuration of dvr related flows
if distributed routing is enabled in agent's configuration.
It also adds reset list of phys_brs in dvr_agent. Without that there
were different objects used in ovs agent and dvr_agent classes thus
e.g. 2 various cookie ids were set on flows in physical bridge.
This was also the same issue in case when openvswitch was restarted and
all bridges were reconfigured.
Now in such case there is correctly new cookie_id configured for all
flows.
Change-Id: I710f00f0f542bc
Closes-Bug: #1864822
(cherry picked from commit 91f0bf3c8511bf3
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/ussuri) | #46 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/ussuri
commit b8e7886d8b52a14
Author: Slawek Kaplonski <email address hidden>
Date: Mon May 11 12:09:46 2020 +0200
Don't check if any bridges were recrected when OVS was restarted
In case when openvswitch was restarted, full sync of all bridges will
be always triggered by neutron-ovs-agent so there is no need to check
in same rpc_loop iteration if bridges were recreated.
Change-Id: I3cc1f1b7dc480d
Related-bug: #1864822
(cherry picked from commit 45482e300aab781
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/train) | #47 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/train
commit 0ce032275a1c6fa
Author: Slawek Kaplonski <email address hidden>
Date: Thu Apr 9 14:37:38 2020 +0200
Ensure that stale flows are cleaned from phys_bridges
In case when neutron-ovs-agent will notice that any of physical
bridges was "re-created", we should also ensure that stale Open
Flow rules (with old cookie id) are cleaned.
This patch is doing exactly that.
Change-Id: I7c7c8a4c371d6f
Related-Bug: #1864822
(cherry picked from commit 63c45b37666efc7
tags: | added: in-stable-train |
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/train) | #48 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/train
commit 2376c54000e5a6a
Author: Slawek Kaplonski <email address hidden>
Date: Tue Apr 21 10:30:52 2020 +0200
[DVR] Reconfigure re-created physical bridges for dvr routers
In case when physical bridge is removed and created again it
is initialized by neutron-ovs-agent.
But if agent has enabled distributed routing, dvr related
flows wasn't configured again and that lead to connectivity issues
in case of DVR routers.
This patch fixes it by adding configuration of dvr related flows
if distributed routing is enabled in agent's configuration.
It also adds reset list of phys_brs in dvr_agent. Without that there
were different objects used in ovs agent and dvr_agent classes thus
e.g. 2 various cookie ids were set on flows in physical bridge.
This was also the same issue in case when openvswitch was restarted and
all bridges were reconfigured.
Now in such case there is correctly new cookie_id configured for all
flows.
Change-Id: I710f00f0f542bc
Closes-Bug: #1864822
(cherry picked from commit 91f0bf3c8511bf3
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/stein) | #49 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/stein
commit 89fd5ec537f6991
Author: Slawek Kaplonski <email address hidden>
Date: Tue Apr 21 10:30:52 2020 +0200
[DVR] Reconfigure re-created physical bridges for dvr routers
In case when physical bridge is removed and created again it
is initialized by neutron-ovs-agent.
But if agent has enabled distributed routing, dvr related
flows wasn't configured again and that lead to connectivity issues
in case of DVR routers.
This patch fixes it by adding configuration of dvr related flows
if distributed routing is enabled in agent's configuration.
It also adds reset list of phys_brs in dvr_agent. Without that there
were different objects used in ovs agent and dvr_agent classes thus
e.g. 2 various cookie ids were set on flows in physical bridge.
This was also the same issue in case when openvswitch was restarted and
all bridges were reconfigured.
Now in such case there is correctly new cookie_id configured for all
flows.
Change-Id: I710f00f0f542bc
Closes-Bug: #1864822
(cherry picked from commit 91f0bf3c8511bf3
tags: | added: in-stable-stein |
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/rocky) | #50 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/rocky
commit 9087c3a4ea309d3
Author: Slawek Kaplonski <email address hidden>
Date: Thu Apr 9 14:37:38 2020 +0200
Ensure that stale flows are cleaned from phys_bridges
In case when neutron-ovs-agent will notice that any of physical
bridges was "re-created", we should also ensure that stale Open
Flow rules (with old cookie id) are cleaned.
This patch is doing exactly that.
Conflicts:
Change-Id: I7c7c8a4c371d6f
Related-Bug: #1864822
(cherry picked from commit 63c45b37666efc7
tags: | added: in-stable-rocky |
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/rocky) | #51 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/rocky
commit d3d93b4077428c3
Author: Slawek Kaplonski <email address hidden>
Date: Tue Apr 21 10:30:52 2020 +0200
[DVR] Reconfigure re-created physical bridges for dvr routers
In case when physical bridge is removed and created again it
is initialized by neutron-ovs-agent.
But if agent has enabled distributed routing, dvr related
flows wasn't configured again and that lead to connectivity issues
in case of DVR routers.
This patch fixes it by adding configuration of dvr related flows
if distributed routing is enabled in agent's configuration.
It also adds reset list of phys_brs in dvr_agent. Without that there
were different objects used in ovs agent and dvr_agent classes thus
e.g. 2 various cookie ids were set on flows in physical bridge.
This was also the same issue in case when openvswitch was restarted and
all bridges were reconfigured.
Now in such case there is correctly new cookie_id configured for all
flows.
Conflicts:
Change-Id: I710f00f0f542bc
Closes-Bug: #1864822
(cherry picked from commit 91f0bf3c8511bf3
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/queens) | #52 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/queens
commit 2baf0aa83f551b5
Author: Slawek Kaplonski <email address hidden>
Date: Thu Apr 9 14:37:38 2020 +0200
Ensure that stale flows are cleaned from phys_bridges
In case when neutron-ovs-agent will notice that any of physical
bridges was "re-created", we should also ensure that stale Open
Flow rules (with old cookie id) are cleaned.
This patch is doing exactly that.
Conflicts:
Change-Id: I7c7c8a4c371d6f
Related-Bug: #1864822
(cherry picked from commit 63c45b37666efc7
tags: | added: in-stable-queens |
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/queens) | #53 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/queens
commit 33217c9c43e6418
Author: Slawek Kaplonski <email address hidden>
Date: Tue Apr 21 10:30:52 2020 +0200
[DVR] Reconfigure re-created physical bridges for dvr routers
In case when physical bridge is removed and created again it
is initialized by neutron-ovs-agent.
But if agent has enabled distributed routing, dvr related
flows wasn't configured again and that lead to connectivity issues
in case of DVR routers.
This patch fixes it by adding configuration of dvr related flows
if distributed routing is enabled in agent's configuration.
It also adds reset list of phys_brs in dvr_agent. Without that there
were different objects used in ovs agent and dvr_agent classes thus
e.g. 2 various cookie ids were set on flows in physical bridge.
This was also the same issue in case when openvswitch was restarted and
all bridges were reconfigured.
Now in such case there is correctly new cookie_id configured for all
flows.
Conflicts:
Change-Id: I710f00f0f542bc
Closes-Bug: #1864822
(cherry picked from commit 91f0bf3c8511bf3
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/stein) | #54 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/stein
commit 6cf718c5056c1d9
Author: Slawek Kaplonski <email address hidden>
Date: Thu Apr 9 14:37:38 2020 +0200
Ensure that stale flows are cleaned from phys_bridges
In case when neutron-ovs-agent will notice that any of physical
bridges was "re-created", we should also ensure that stale Open
Flow rules (with old cookie id) are cleaned.
This patch is doing exactly that.
Change-Id: I7c7c8a4c371d6f
Related-Bug: #1864822
(cherry picked from commit 63c45b37666efc7
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/train) | #55 |
Related fix proposed to branch: stable/train
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/stein) | #56 |
Related fix proposed to branch: stable/stein
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/rocky) | #57 |
Related fix proposed to branch: stable/rocky
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/queens) | #58 |
Related fix proposed to branch: stable/queens
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/queens) | #59 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/queens
commit 2f4bb76338433f2
Author: Slawek Kaplonski <email address hidden>
Date: Mon May 11 12:09:46 2020 +0200
Don't check if any bridges were recrected when OVS was restarted
In case when openvswitch was restarted, full sync of all bridges will
be always triggered by neutron-ovs-agent so there is no need to check
in same rpc_loop iteration if bridges were recreated.
Conflicts:
Change-Id: I3cc1f1b7dc480d
Related-bug: #1864822
(cherry picked from commit 45482e300aab781
(cherry picked from commit b8e7886d8b52a14
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/stein) | #60 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/stein
commit a93e6c922c397de
Author: Slawek Kaplonski <email address hidden>
Date: Mon May 11 12:09:46 2020 +0200
Don't check if any bridges were recrected when OVS was restarted
In case when openvswitch was restarted, full sync of all bridges will
be always triggered by neutron-ovs-agent so there is no need to check
in same rpc_loop iteration if bridges were recreated.
Change-Id: I3cc1f1b7dc480d
Related-bug: #1864822
(cherry picked from commit 45482e300aab781
(cherry picked from commit b8e7886d8b52a14
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/train) | #61 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/train
commit 25078dd730c74bf
Author: Slawek Kaplonski <email address hidden>
Date: Mon May 11 12:09:46 2020 +0200
Don't check if any bridges were recrected when OVS was restarted
In case when openvswitch was restarted, full sync of all bridges will
be always triggered by neutron-ovs-agent so there is no need to check
in same rpc_loop iteration if bridges were recreated.
Change-Id: I3cc1f1b7dc480d
Related-bug: #1864822
(cherry picked from commit 45482e300aab781
(cherry picked from commit b8e7886d8b52a14
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/rocky) | #62 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/rocky
commit 8bba2a8f6e6500c
Author: Slawek Kaplonski <email address hidden>
Date: Mon May 11 12:09:46 2020 +0200
Don't check if any bridges were recrected when OVS was restarted
In case when openvswitch was restarted, full sync of all bridges will
be always triggered by neutron-ovs-agent so there is no need to check
in same rpc_loop iteration if bridges were recreated.
Conflicts:
Change-Id: I3cc1f1b7dc480d
Related-bug: #1864822
(cherry picked from commit 45482e300aab781
(cherry picked from commit b8e7886d8b52a14
Brian Murray (brian-murray) wrote : Missing SRU information | #63 |
Thanks for uploading the fix for this bug report to -proposed. However, when reviewing the package in -proposed and the details of this bug report I noticed that the bug description is missing information required for the SRU process. You can find full details at http://
Changed in neutron (Ubuntu): | |
status: | New → Incomplete |
description: | updated |
Dan Streetman (ddstreet) wrote : | #64 |
> bug description is missing information required for the SRU process
Sorry, I updated the description to refer this bug to the SRU template in bug 1869808, as that SRU info applies to all other bugs in the upload.
Łukasz Zemczak (sil2100) wrote : Please test proposed package | #65 |
Hello Acoss69, or anyone else affected,
Accepted neutron into bionic-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-
Further information regarding the verification process can be found at https:/
N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.
Changed in neutron (Ubuntu Bionic): | |
status: | New → Fix Committed |
tags: | added: verification-needed verification-needed-bionic |
Edward Hope-Morley (hopem) wrote : | #66 |
All SRU verification completed and performed in https:/
tags: | added: verification-done verification-done-bionic removed: verification-needed verification-needed-bionic |
Łukasz Zemczak (sil2100) wrote : Update Released | #69 |
The verification of the Stable Release Update for neutron has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Launchpad Janitor (janitor) wrote : | #70 |
This bug was fixed in the package neutron - 2:12.1.1-0ubuntu4
---------------
neutron (2:12.1.1-0ubuntu4) bionic; urgency=medium
* Fix interrupt of VLAN traffic on reboot of neutron-ovs-agent:
- d/p/0001-
- d/p/0002-
- d/p/0003-
- d/p/0004-
- d/p/0005-
- d/p/0006-
- d/p/0007-
-- Edward Hope-Morley <email address hidden> Mon, 22 Feb 2021 16:55:40 +0000
Changed in neutron (Ubuntu Bionic): | |
status: | Fix Committed → Fix Released |
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron queens-eol | #71 |
This issue was fixed in the openstack/neutron queens-eol release.
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron rocky-eol | #72 |
This issue was fixed in the openstack/neutron rocky-eol release.