L3HA behavior is preemptive in OVN
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
networking-ovn | New | Undecided | Unassigned |
Bug Description
L3HA behavior is preemptive in OVN: when a failed chassis recovers, the routers are balanced back to their original chassis.
During the QE test plan review, concerns were raised that this behavior increases the downtime of external connectivity.
With preemption, connectivity is interrupted twice: once on failover and again on failback.
This is a regression compared to L3HA on ML2/OVS.
With L3HA on ML2/OVS, when a failure is detected the master router moves to another controller node.
If the "failed" controller node comes back to life, it does not become the Active gateway node again.
This is how L3HA in OVN is designed and implemented: when the higher-priority chassis comes back, it claims all of the HA master routers.
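To make the difference concrete, here is a minimal toy model of priority-based gateway selection. It is not OVN or networking-ovn code; the chassis names, the priorities, and the `select_active` and `run_scenario` helpers are hypothetical and exist only for illustration. It counts how many times the active gateway changes during one failure and recovery of the highest-priority chassis.

```python
from typing import Dict, List, Optional, Set, Tuple


def select_active(priorities: Dict[str, int], alive: Set[str],
                  current: Optional[str], preemptive: bool) -> Optional[str]:
    """Pick the active gateway chassis (toy model, not OVN's algorithm)."""
    candidates = [c for c in priorities if c in alive]
    if not candidates:
        return None
    # Non-preemptive: keep the current gateway as long as it is still alive.
    if not preemptive and current in alive:
        return current
    # Preemptive (or no usable current gateway): highest priority wins.
    return max(candidates, key=lambda c: priorities[c])


def run_scenario(preemptive: bool) -> Tuple[List[Optional[str]], int]:
    """Fail the highest-priority chassis, then bring it back."""
    priorities = {"chassis-1": 3, "chassis-2": 2, "chassis-3": 1}
    timeline = [
        {"chassis-1", "chassis-2", "chassis-3"},  # steady state
        {"chassis-2", "chassis-3"},               # chassis-1 fails
        {"chassis-1", "chassis-2", "chassis-3"},  # chassis-1 recovers
    ]
    active: Optional[str] = None
    history: List[Optional[str]] = []
    for alive in timeline:
        active = select_active(priorities, alive, active, preemptive)
        history.append(active)
    # Each change of the active gateway is one connectivity interruption.
    switchovers = sum(1 for a, b in zip(history, history[1:]) if a != b)
    return history, switchovers


if __name__ == "__main__":
    for mode in (True, False):
        history, switchovers = run_scenario(preemptive=mode)
        print(f"preemptive={mode}: {history}, switchovers={switchovers}")
```

In this model the preemptive mode switches gateways twice (failover, then failback), while the non-preemptive mode switches only once, which is the behavior the report describes for ML2/OVS.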
Version-Release number of selected component (if applicable):
Queens
How reproducible:
Always
Steps to Reproduce:
1. Deploy an HA OVN setup.
2. Create a network, a router, and instances.
3. Trigger a failover of the Active gateway node.
4. After one of the backup gateway nodes becomes Active, bring the "failed" controller node back up.
This bug / RFE would belong on the ovs-discuss / ovs-dev mailing list.
We discussed it during development [1]; maybe it's worth working on in ovn-northd.
The failover/failback is so quick that I'm not sure it's even worth working on such a feature, which adds extra complexity.
[1] https://lists.linux-foundation.org/pipermail/ovs-dev/2017-May/332825.html (see the 2nd part)