[RFE] Active-active L3 Gateway with Multihoming

Bug #2002687 reported by Dmitrii Shcherbakov
This bug affects 1 person
Affects: neutron
Status: In Progress
Importance: Wishlist
Assigned to: Dmitrii Shcherbakov
Milestone: (not set)

Bug Description

Some network designs include multiple L3 gateways to:

* Share the load across different gateways;
* Provide independent network paths for the north-south direction (e.g. via
  different ISPs).

Having multi-homing implemented at the instance level imposes additional burden
on the end user of a cloud and support requirements for the guest OS, whereas
utilizing ECMP and BFD at the router side alleviates the need for instance-side
awareness of a more complex routing setup.

Adding more than one gateway port requires extending the existing data model,
as described in the multiple external gateways spec (https://specs.openstack.org/openstack/neutron-specs/specs/xena/multiple-external-gateways.html).
However, that spec left adding extra gateway routes out of scope, deferring it
to future improvements around dynamic routing. The focus of
neutron-dynamic-routing has so far been on advertising routes, not on accepting
new ones from external peers, so dynamic routing support of this kind is a very
different subject. Manual addition of extra routes, in turn, does not use the
default gateway IP information available from subnets in Neutron. This could be
addressed by implementing extra conditional behavior when more than one gateway
port is added to a router.
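
The conditional behavior mentioned above could, for instance, derive one default route per gateway port from the `gateway_ip` of the subnet each port is attached to. The sketch below only illustrates that idea under stated assumptions: `default_routes_from_subnets` is a hypothetical helper, not Neutron code, and its input is a plain list of subnet-like dicts rather than real Neutron objects.

```python
import ipaddress


def default_routes_from_subnets(subnets):
    """Return one default route per distinct subnet gateway IP.

    `subnets` is a list of dicts carrying a `gateway_ip` key, mirroring the
    attribute of the same name on Neutron subnets. Hypothetical helper.
    """
    routes = []
    seen = set()
    for subnet in subnets:
        gateway_ip = subnet.get("gateway_ip")
        # Subnets without a gateway cannot contribute a next hop; duplicate
        # gateways would only yield duplicate routes.
        if not gateway_ip or gateway_ip in seen:
            continue
        seen.add(gateway_ip)
        version = ipaddress.ip_address(gateway_ip).version
        destination = "0.0.0.0/0" if version == 4 else "::/0"
        routes.append({"destination": destination, "nexthop": gateway_ip})
    return routes
```

With two gateway ports on subnets whose gateways are 192.0.2.1 and 198.51.100.1, this yields two routes to 0.0.0.0/0 with different next hops, which is what turns the default route into an ECMP route.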

ECMP routes can result in black-holing of traffic should the next-hop of a
route become unreachable. BFD is a standard protocol adopted by the IETF
for next-hop failure detection which can be used for route eviction. OVN
supports BFD as of v21.03.0 (https://github.com/ovn-org/ovn/commit/6e0a69ad4bcdf9e4cace5c73ef48ab06065e8519) with a data model that allows enabling
BFD on a per next-hop basis by associating BFD session information with routes.
However, BFD is not modeled at the Neutron level even if a backend supports it.
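
To make the interaction between ECMP and BFD concrete, here is a minimal sketch of per-flow next-hop selection that evicts next hops whose BFD session is not up. It is not how OVN implements this: `pick_next_hop`, the flow tuple layout and the `bfd_status` mapping are all assumptions made for illustration.

```python
import hashlib


def pick_next_hop(flow, next_hops, bfd_status):
    """Pick an ECMP next hop for `flow`, skipping hops that BFD reports down."""
    # Only next hops whose BFD session is "up" stay eligible (route eviction).
    live = sorted(nh for nh in next_hops if bfd_status.get(nh) == "up")
    if not live:
        return None  # every path failed; nothing to forward to
    # Hash the flow so packets of one flow consistently take the same path.
    digest = hashlib.sha256(repr(flow).encode("utf-8")).digest()
    return live[int.from_bytes(digest[:4], "big") % len(live)]
```

Without the BFD filter, a flow hashed onto a failed next hop would keep being sent there, which is the black-holing described above.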

From the Neutron data model perspective, ECMP for routes has been a supported
concept since the ECMP support spec (https://specs.openstack.org/openstack/neutron-specs/specs/wallaby/l3-router-support-ecmp.html) was implemented in Wallaby (albeit the
spec focused on the L3-agent based implementation only).

As for OVN and BFD, Neutron needs to populate the OVN database state based on
the data from the Neutron database. Data model changes to the Neutron DB are
therefore needed to represent the BFD session parameters.

---

The previous work on multiple gateway ports was not completed and the neutron-lib changes were reverted. The scope of this RFE is also bigger, with some overlap with and augmentation of the prior art. A spec will follow for this RFE with more details on the proposed data model and API changes.

Upd: https://review.opendev.org/c/openstack/neutron-specs/+/870030

Tags: rfe-approved
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-specs (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron-specs/+/870030

description: updated
Changed in neutron:
importance: Undecided → Wishlist
assignee: nobody → Dmitrii Shcherbakov (dmitriis)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron-lib/+/870887

Revision history for this message
Lajos Katona (lajos-katona) wrote :

Hi, interesting proposal, thanks for it.
There was a spec for BFD also:
https://specs.openstack.org/openstack/neutron-specs/specs/xena/bfd_support.html, not sure it is useful as we originally targeted OVS.

The dynamic-routing/BGP spec perhaps also partially covered the things you need:
https://specs.openstack.org/openstack/neutron-specs/specs/xena/bgpaas-enhancements.html

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Hi Lajos, thanks a lot for the references! I can definitely use the prior art on the BFD API and data model.

The BGP peering support is the next logical step so it's quite useful too.

Revision history for this message
Brian Haley (brian-haley) wrote :

Let's talk about this at the drivers meeting next week.

tags: added: rfe-triaged
tags: added: rfe-approved
removed: rfe rfe-triaged
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/873698

Changed in neutron:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/873699

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/874199

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/874760

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/878527

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/878531

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-specs (master)

Reviewed: https://review.opendev.org/c/openstack/neutron-specs/+/870030
Committed: https://opendev.org/openstack/neutron-specs/commit/9763752c73644be92f9a2597c198968d88c1d220
Submitter: "Zuul (22348)"
Branch: master

commit 9763752c73644be92f9a2597c198968d88c1d220
Author: Dmitrii Shcherbakov <email address hidden>
Date: Thu Jan 12 22:50:52 2023 +0300

    Active-active L3 Gateway with Multihoming

    The aim of the RFE is to add multihoming support to routers along with
    automatic management of ECMP for default routes and BFD next-hop
    reachability verification.

    https://bugs.launchpad.net/neutron/+bug/2002687

    Co-Authored-By: Frode Nordahl <email address hidden>
    Related-Bug: #2002687
    Change-Id: I95a0d5f1b7aef985df5625cd83222799db811f2b

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/879462

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lib (master)

Reviewed: https://review.opendev.org/c/openstack/neutron-lib/+/870887
Committed: https://opendev.org/openstack/neutron-lib/commit/e52a9372f76aa104543ab04cf1723e58da17d379
Submitter: "Zuul (22348)"
Branch: master

commit e52a9372f76aa104543ab04cf1723e58da17d379
Author: Dmitrii Shcherbakov <email address hidden>
Date: Wed Jan 18 02:45:03 2023 +0300

    ext-gw-multihoming: api-def and api-ref

    API additions for [1].

    * Added a new router attribute: external_gateways;
    * Added new API definitions for:
        PUT add_external_gateways
        PUT update_external_gateways
        PUT remove_external_gateways
    * Added extensions for each of the new router-level attributes:
      * enable_default_route_ecmp
      * enable_default_route_bfd
    * Combined the validation logic for the external_gateway_info type
      across extensions (l3_ext_gw_mode, qos_gateway_ip and the new
      extension called l3_ext_gw_multihoming).

    [1] https://review.opendev.org/c/openstack/neutron-specs/+/870030/

    Change-Id: I2618475636b2bb9bfd743a62f5d4859d4f68a547
    Related-Bug: #2002687
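
As a rough illustration of the API additions listed in this commit message, the sketch below builds the pieces of a request for one of the three new actions. The action names and the `external_gateways` attribute come from the commit message; the URL layout, the `router` wrapper key and the `build_gateway_request` helper are assumptions made for the example and should be checked against the api-ref.

```python
import json

# Action names are taken from the commit message above.
GATEWAY_ACTIONS = (
    "add_external_gateways",
    "update_external_gateways",
    "remove_external_gateways",
)


def build_gateway_request(router_id, action, gateways):
    """Return (method, path, body) for one of the external gateway actions."""
    if action not in GATEWAY_ACTIONS:
        raise ValueError("unknown action: %s" % action)
    # Assumption: the actions hang off the router resource like other
    # router member actions, and the body is wrapped in a "router" key.
    path = "/v2.0/routers/%s/%s" % (router_id, action)
    body = json.dumps({"router": {"external_gateways": gateways}})
    return "PUT", path, body


# Example: attach a router to two external networks (placeholder IDs).
method, path, body = build_gateway_request(
    "ROUTER_ID",
    "add_external_gateways",
    [{"network_id": "NET_A"}, {"network_id": "NET_B"}],
)
```

The helper performs no network I/O, so the resulting tuple can be passed to whichever HTTP client is in use.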

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/877831
Committed: https://opendev.org/openstack/neutron/commit/d67d1c273668ad4bb0d6906b7684f4a7b095c8d4
Submitter: "Zuul (22348)"
Branch: master

commit d67d1c273668ad4bb0d6906b7684f4a7b095c8d4
Author: Frode Nordahl <email address hidden>
Date: Fri Mar 17 18:42:53 2023 +0100

    [ovn] Drop use of OVN_GW_PORT_EXT_ID_KEY

    At present Neutron maintains an external_id on the
    Logical_Router (LR) representing the gw port (singular). This
    is problematic when introducing multiple gateway ports.

    Instead we can find the Logical Router Ports (LRPs) that act as
    gateways for the LR at runtime by looking for configuration
    present on all GW ports.

    Partial-Bug: #2002687
    Signed-off-by: Frode Nordahl <email address hidden>
    Needed-By: I95a0d5f1b7aef985df5625cd83222799db811f2b
    Change-Id: I8a915dca1410c70bdfe7a2d72931921d2a1a265e
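
The runtime lookup described in this commit message can be pictured as a filter over a router's ports. The commit message does not say which configuration is inspected, so the sketch below assumes it is gateway chassis scheduling, using the `gateway_chassis` and `ha_chassis_group` columns of the OVN `Logical_Router_Port` table. Plain dicts stand in for OVN rows, and `find_gateway_ports` is a hypothetical name.

```python
def find_gateway_ports(logical_router):
    """Return the LRPs of a router that look like gateway ports."""
    return [
        lrp
        for lrp in logical_router.get("ports", [])
        # Assumption: gateway chassis scheduling is the configuration that
        # all gateway LRPs share.
        if lrp.get("gateway_chassis") or lrp.get("ha_chassis_group")
    ]
```

Because the result is a list, the same lookup works whether the router has zero, one or several gateway ports, which a single external_id on the LR cannot express.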

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/neutron/+/877712
Committed: https://opendev.org/openstack/neutron/commit/e5d4499672fe4e4e57a24ae3194d3adaefe7be15
Submitter: "Zuul (22348)"
Branch: master

commit e5d4499672fe4e4e57a24ae3194d3adaefe7be15
Author: Frode Nordahl <email address hidden>
Date: Thu Mar 16 09:48:02 2023 +0100

    [ovn] Drop use of LR OVN_GW_NETWORK_EXT_ID_KEY

    An update to the OVN QoS driver to support the `qos_gateway_ip`
    QoS extension [0] introduced adding the GW network id as an
    external_id on the Logical_Router (LR). This is problematic
    when introducing multiple gateway ports, because a single LR
    can have gateways in multiple networks.

    The external_id key was presumably added because at the point in
    time when a LR is deleted, the code had no other source of this
    information. However, it turns out this step is redundant and
    not necessary.

    To prove this I include an excerpt of a stack trace when deleting
    a router in the commit message:

        File "services/ovn_l3/plugin.py", line 210, in delete_router
          super(OVNL3RouterPlugin, self).delete_router(context, id)
        File "db/l3_db.py", line 612, in delete_router
          self._delete_current_gw_port(context, id, router, None)
        File "db/l3_db.py", line 452, in _delete_current_gw_port
          self._core_plugin.delete_port(
        File "plugins/ml2/drivers/ovn/mech_driver/mech_driver.py",
            line 886, in delete_port_postcommit
          self._ovn_client.delete_port(context.plugin_context, port['id'],
        File "plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py",
            line 830, in delete_port
          self._delete_port(port_id, port_object=port_object)

    Essentially, a router's GW port(s) will be removed prior to
    deleting the router itself.

    The `ovn_client.delete_port` method will call on the QoS extension
    to remove rules matching the GW port, and those will be the same
    rules as have previously been added for the router.

    I also added a functional test that confirms this fact [1].

    0: I46864b9234af64f190f6b6daebfd94d2e3bd0c17
    1: Ic92a7b3bd73920d08dee41974bfe3aeb1c64b557

    Partial-Bug: #2002687
    Signed-off-by: Frode Nordahl <email address hidden>
    Needed-By: I95a0d5f1b7aef985df5625cd83222799db811f2b
    Change-Id: If7c22bc8a95fa13e746c86a1e9d4a6fa25496e1f

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/neutron/+/878527
Committed: https://opendev.org/openstack/neutron/commit/5510cdab92d1d3eac080053a51779d34a8a19614
Submitter: "Zuul (22348)"
Branch: master

commit 5510cdab92d1d3eac080053a51779d34a8a19614
Author: Frode Nordahl <email address hidden>
Date: Fri Mar 24 12:06:39 2023 +0100

    [ovn] OVNClient._get_router_ports: Drop unused parameter

    The `get_gw_port` parameter is currently unused and the
    implementation hiding behind it also contains a bug, so remove it.

    Partial-Bug: #2002687
    Signed-off-by: Frode Nordahl <email address hidden>
    Change-Id: Ie0ba5a478fabe9880746f892ef9c00d7e5660195

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/neutron/+/879462
Committed: https://opendev.org/openstack/neutron/commit/b1cc242faddd4c378dfee7b65ed72b4b190b2467
Submitter: "Zuul (22348)"
Branch: master

commit b1cc242faddd4c378dfee7b65ed72b4b190b2467
Author: Dmitrii Shcherbakov <email address hidden>
Date: Tue Apr 18 20:04:06 2023 +0300

    Add a method to retrieve router gateway ports

    A method is added as opposed to having a synthetic field on a router for
    performance reasons: gateways will only be queried when needed by the
    external gateways feature API calls.

    Partial-Bug: #2002687
    Change-Id: Iddde9d986b024109bdb7c2aa777a1b017b6a35ab

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/neutron/+/873593
Committed: https://opendev.org/openstack/neutron/commit/a221764751de05e42069f1c097b1025bd9c4fc52
Submitter: "Zuul (22348)"
Branch: master

commit a221764751de05e42069f1c097b1025bd9c4fc52
Author: Dmitrii Shcherbakov <email address hidden>
Date: Mon Feb 13 18:30:15 2023 +0300

    Allow Multiple External Gateways

    * Add a new API for adding/updating/removing multiple gateway ports
      on routers;

    * Implement the necessary backend changes.

    Partial-Bug: #2002687
    Depends-On: I2618475636b2bb9bfd743a62f5d4859d4f68a547
    Change-Id: Id885565e88f6f1898ca5cfac709a24dd62605d1a

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/neutron/+/874797
Committed: https://opendev.org/openstack/neutron/commit/89702218db2476a17f6ff36cf51b909db563d887
Submitter: "Zuul (22348)"
Branch: master

commit 89702218db2476a17f6ff36cf51b909db563d887
Author: Dmitrii Shcherbakov <email address hidden>
Date: Wed Feb 22 21:11:31 2023 +0300

    Add extra router attributes for ECMP and BFD

    * enable_default_route_ecmp
    * enable_default_route_bfd

    Partial-Bug: #2002687
    Change-Id: I3fcd0458d20f20ce40378f90f073f37c41400865

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/893023

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/893025

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-lib (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron-lib/+/893026

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron-lib/+/893181

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron-lib/+/893182

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/893184

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/893185

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lib (master)

Change abandoned by "Rodolfo Alonso <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/neutron-lib/+/893181

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "Rodolfo Alonso <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/neutron-lib/+/893182

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/893184
Committed: https://opendev.org/openstack/neutron/commit/c6b6ecc75196bd47ebcea0284e7d714c861fa630
Submitter: "Zuul (22348)"
Branch: master

commit c6b6ecc75196bd47ebcea0284e7d714c861fa630
Author: Frode Nordahl <email address hidden>
Date: Wed Aug 30 15:28:54 2023 +0200

    Drop release notes for l3-ext-gw-multihoming and adjacent features

    Unfortunately we did not make it into Bobcat, and will try again
    to get it into Caracal.

    Partial-Bug: #2002687
    Change-Id: I5b4579a96152b8bdb1d34e59fb492c6f2a01b71e

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/neutron/+/893023
Committed: https://opendev.org/openstack/neutron/commit/113f3f668952d4bd2b78ef08c04fb606ca8fc0a5
Submitter: "Zuul (22348)"
Branch: master

commit 113f3f668952d4bd2b78ef08c04fb606ca8fc0a5
Author: Frode Nordahl <email address hidden>
Date: Mon Aug 28 16:38:31 2023 +0200

    Add missing extension classes for router BFD/ECMP extra attributes

    Change I3fcd0458d20f20ce40378f90f073f37c41400865 added the
    implementation for router BFD/ECMP extra attributes, but omitted
    the APIExtensionDescriptor classes that are required for loading
    the extension.

    Partial-Bug: #2002687
    Change-Id: I5f59087a1ff8d37f136ac88e50e0246de68455a8
