Implement consistency check and self-healing for SDN-managed fabrics
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
neutron | New | Wishlist | Unassigned | | |
Bug Description
When an SDN mechanism driver is used in Neutron (our site uses mlnx_sdn_assist, but the issue is not limited to this driver; we hear about similar issues with at least three other SDN solutions), no consistency checking is applied to the fabric after the initial port configuration. If something goes wrong in the SDN layer after Neutron issues a request to the SDN controller, and the requested configuration is not implemented correctly, Neutron has no way of knowing about it. Ideally such scenarios should not happen, but feedback from operators indicates that they occasionally do, for a variety of reasons. When they do, the user impact is significant, because the state of Neutron and the SDN has to be reconciled manually, which is generally non-trivial.
If SDN mechanism drivers are not used and the standard openvswitch based networking is configured, neutron-
It would be very valuable for the SDN-based cloud operators to be able to:
Have neutron poll the SDN to check the state of each port
Have neutron “push” the state of each port so that the SDN state is consistent with the neutron state
Ensure that each SDN solution supported by OpenStack provides support for those actions
Initially, these actions could be triggered manually (or from a monitoring system); later they would likely become a periodic task, adding self-healing capabilities to SDN-based OpenStack installations.
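To make the request more concrete, here is a minimal sketch of the poll and push actions. It is not based on any existing Neutron or SDN controller API: `PortState`, `FakeSdnController`, `find_inconsistent` and `heal` are all hypothetical names, the compared fields are only examples, and the controller is an in-memory stand-in. It assumes Neutron is the source of truth, as described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PortState:
    """Fields compared between Neutron and the SDN (hypothetical selection)."""
    port_id: str
    network_id: str
    segmentation_id: int
    admin_state_up: bool = True


class FakeSdnController:
    """In-memory stand-in for an SDN controller client (hypothetical)."""

    def __init__(self) -> None:
        self._ports: Dict[str, PortState] = {}

    def get_port(self, port_id: str) -> Optional[PortState]:
        # Returns None when the controller does not know the port at all.
        return self._ports.get(port_id)

    def put_port(self, port: PortState) -> None:
        # Create or overwrite the port configuration on the controller.
        self._ports[port.port_id] = port


def find_inconsistent(neutron_ports: List[PortState],
                      sdn: FakeSdnController) -> List[PortState]:
    """Poll step: Neutron ports whose SDN state is missing or different."""
    return [p for p in neutron_ports if sdn.get_port(p.port_id) != p]


def heal(neutron_ports: List[PortState], sdn: FakeSdnController) -> List[str]:
    """Push step: re-send Neutron's state for each inconsistent port."""
    repaired = []
    for port in find_inconsistent(neutron_ports, sdn):
        sdn.put_port(port)
        repaired.append(port.port_id)
    return repaired
```

A manual trigger or a monitoring hook would call `find_inconsistent` to report drift and `heal` to repair it; a periodic task could later run `heal` on a timer. Each real driver would need its own way to read back port state from its controller.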
Changed in neutron:
importance: Undecided → Wishlist
Hi,
Isn't https://review.opendev.org/#/c/565463/ something that already tries to address this problem? Can you check this proposed spec? Thx.