neutron cleanup service can interfere with agents on reboot
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Triaged | High | Brent Eagles |
Bug Description
originally reported here: https:/
From analysis included in rhbz:
This is exactly when this issue can appear: node reboot and high port count.
When the node is rebooted, the cleanup script is executed. If the port count is low, the script removes the ports and the bridges quickly enough not to interfere with the OVS agent. But on a heavily loaded node (hundreds of ports), the script takes longer to delete the ports [1]. When this loop finishes, the tunnel bridge is deleted [2], but by this point the OVS agent has already cached the br-tun datapath ID and assumes that this bridge will always be present (while the OVS agent is running, no other manual operation may be performed on the OVS instance).
This script should be executed first, and the OVS agent start should be delayed until the script has finished.
The fix would be to make sure the cleanup service has completed before the agents are started.
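The ordering problem described above can be illustrated with a small, purely in-memory model. This is a sketch under stated assumptions, not the actual cleanup script or agent code: `FakeOVS`, `cleanup`, `agent_start`, and `boot` are hypothetical names, the datapath IDs are made-up strings, and the assumption that the agent recreates a missing br-tun at startup is a simplification for the example.

```python
class FakeOVS:
    """Hypothetical stand-in for an OVS instance: ports plus named bridges."""

    def __init__(self, port_count):
        self._next_id = 1
        self.bridges = {}
        self.ports = [f"port{i}" for i in range(port_count)]
        self.add_bridge("br-tun")  # bridge left over from before the reboot

    def add_bridge(self, name):
        # Every (re)creation gets a fresh fake datapath ID.
        self.bridges[name] = f"dpid-{self._next_id}"
        self._next_id += 1


def cleanup(host):
    """Model of the cleanup script: delete all ports, then the tunnel bridge."""
    for port in list(host.ports):
        host.ports.remove(port)
    host.bridges.pop("br-tun", None)


def agent_start(host):
    """Model of agent startup: ensure br-tun exists, then cache its datapath ID."""
    if "br-tun" not in host.bridges:
        host.add_bridge("br-tun")  # assumption: agent recreates the bridge
    return host.bridges["br-tun"]


def boot(ordered, port_count=500):
    """Return (ID cached by the agent, ID actually present after boot)."""
    host = FakeOVS(port_count)
    if ordered:
        # Proposed fix: cleanup runs to completion before the agent starts.
        cleanup(host)
        cached = agent_start(host)
    else:
        # Reported bug: the agent caches the ID, then cleanup deletes br-tun.
        cached = agent_start(host)
        cleanup(host)
    return cached, host.bridges.get("br-tun")


if __name__ == "__main__":
    print("ordered:  ", boot(ordered=True))
    print("unordered:", boot(ordered=False))
```

In the ordered case the ID cached by the agent matches the bridge that exists after boot. In the unordered case the agent holds an ID for a bridge that is gone, which corresponds to the stale br-tun state in the report.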
Changed in tripleo: | |
milestone: | none → wallaby-2 |
assignee: | nobody → Brent Eagles (beagles) |
importance: | Undecided → High |
status: | New → Triaged |
tags: | added: train-backport-potential |
tags: | added: queens-backport-potential |
Changed in tripleo: | |
milestone: | wallaby-2 → wallaby-3 |
Changed in tripleo: | |
milestone: | wallaby-3 → wallaby-rc1 |
Changed in tripleo: | |
milestone: | wallaby-rc1 → xena-1 |
Changed in tripleo: | |
milestone: | xena-1 → xena-2 |
Changed in tripleo: | |
milestone: | xena-2 → xena-3 |
Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/785217
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/a22239e27938e43f08c0322bb78224499d10a167
Submitter: "Zuul (22348)"
Branch: stable/train
commit a22239e27938e43f08c0322bb78224499d10a167
Author: Brent Eagles <email address hidden>
Date: Thu Jan 28 13:28:11 2021 -0330
Add service ordering to cleanup service to avoid conflicts with agent startup
If the port cleanup takes too long, the neutron agents might begin
operations on the ovs bridges while cleanup is still ongoing. This can
cause undefined behavior and errors in the agent.
Change-Id: Ia0e31c9469033c50a8b65af7fee1adf03b22d4c2
Closes-Bug: #1913623
(cherry picked from commit 0c20e1e1ac320e30aaedba980270dffbf4528fc3)