switching from external-network-id and external-port to data-port and bridge-mappings does not remove incorrect nics from bridges

Bug #1809190 reported by Xav Paice
This bug affects 2 people
Affects                               Status        Importance  Assigned to    Milestone
OpenStack Charms Deployment Guide     Fix Released  Medium      Alex Kavanagh
OpenStack Neutron Gateway Charm       Invalid       High        Unassigned
OpenStack Neutron Open vSwitch Charm  Invalid       Undecided   Unassigned

Bug Description

charm cs:neutron-gateway-258

I upgraded a site from Mitaka to Newton. There are 4 neutron-gateway applications, one for each external network, hosting a set of routers that connect to those nets.

The charm config includes a setting for external-network-id and external-port. Each of the applications has a different nic for the external network, i.e. eth1, eth2, eth3 and eth4, so if I have application gateway3 it is using eth3 as the external-port.

Each of the external networks is configured in openstack as such:

:~$ neutron net-show ext_net
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| availability_zone_hints   |                                      |
| availability_zones        | nova                                 |
| created_at                | 2016-08-18T00:12:43Z                 |
| description               |                                      |
| id                        | foo                                  |
| ipv4_address_scope        |                                      |
| ipv6_address_scope        |                                      |
| is_default                | False                                |
| l2_adjacency              | True                                 |
| mtu                       | 1458                                 |
| name                      | ext_net                              |
| project_id                | foo                                  |
| provider:network_type     | gre                                  |
| provider:physical_network |                                      |
| provider:segmentation_id  | 85                                   |
| revision_number           | 0                                    |
| router:external           | True                                 |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   | foo                                  |
| tags                      |                                      |
| tenant_id                 | foo                                  |
| updated_at                | 2016-09-12T21:01:44Z                 |
+---------------------------+--------------------------------------+

On running the openstack-upgrade action on the neutron-gateway-X units, network connectivity to the external networks was lost.

After some time, we discovered that these deprecated options should have been changed out, so we reset external-network-id, and external-port to default and configured all 4 gateway applications:

data-port='br-ex:eth1 br-2:eth2 br-3:eth3 br-4:eth4' bridge-mappings='physnet1:br-ex net2:br-3 net3:br-3 net4:br-4'
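As a rough illustration of how these two options relate, the sketch below parses both strings and lists any bridge named in data-port that no physical network in bridge-mappings points at. The helper names parse_pairs and unmapped_bridges are hypothetical and are not part of the charm; the input strings are the ones quoted above.

```python
def parse_pairs(value):
    """Split a space-delimited 'key:value' string into a list of tuples.

    Only the first ':' is treated as the separator, so a value that itself
    contains colons (for example a MAC address) is kept whole.
    """
    pairs = []
    for item in value.split():
        key, _, val = item.partition(":")
        pairs.append((key, val))
    return pairs


def unmapped_bridges(data_port, bridge_mappings):
    """Return bridges named in data-port that no physnet maps to."""
    port_bridges = [bridge for bridge, _nic in parse_pairs(data_port)]
    mapped = {bridge for _physnet, bridge in parse_pairs(bridge_mappings)}
    return [bridge for bridge in port_bridges if bridge not in mapped]


data_port = "br-ex:eth1 br-2:eth2 br-3:eth3 br-4:eth4"
bridge_mappings = "physnet1:br-ex net2:br-3 net3:br-3 net4:br-4"
print(unmapped_bridges(data_port, bridge_mappings))  # ['br-2']
```

With the values as written in this report, br-2 has a port but no physical network mapped to it, which may or may not have been intended.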

We also edited the Neutron database directly in MySQL to change each network's network_type to flat and to set its physical_network to the matching entry from bridge-mappings (physnet1, or whichever one that network needed).

We expected this to reconfigure the networks correctly. Instead, the 'old' interface (e.g. eth3 on gateway3) was left in br-ex and eth1 was added alongside it. We probably should have created an entirely new bridge rather than re-using br-ex. The two interfaces in the same bridge caused some kind of traffic storm, and the entire physical network was saturated.

While this is clearly a design fault from the deployment point of view, two things would help. First, there should be prominent warnings that these older config options break on upgrade. Second, if an interface is in a bridge where it no longer belongs, the charm should remove it rather than leave it there.

Ryan Beisner (1chb1n)
Changed in charm-neutron-gateway:
importance: Undecided → High
milestone: none → 19.04
assignee: nobody → Frode Nordahl (fnordahl)
Revision history for this message
Corey Bryant (corey.bryant) wrote : Re: [Bug 1809190] [NEW] switching from external-network-id and external-port to data-port and bridge-mappings does not remove incorrect nics from bridges

On Wed, Dec 19, 2018 at 8:40 PM Xav Paice <email address hidden> wrote:

> After some time, we discovered that these deprecated options should have
> been changed out, so we reset external-network-id, and external-port to
> default and configured all 4 gateway applications

The options should still work even though they are deprecated. If they are
in config.yaml and not removed they should still work. If they don't that's
a bug in the charm.

It would be nice to have a formal way to run a health check on config
options prior to charm upgrade. For example if ext-port had been removed in
a charm release you'd be able to run it and find out.

Changed in charm-neutron-gateway:
assignee: Frode Nordahl (fnordahl) → Alex Kavanagh (ajkavanagh)
Changed in charm-deployment-guide:
status: New → In Progress
assignee: nobody → Alex Kavanagh (ajkavanagh)
importance: Undecided → Medium
Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

So the actual neutron-gateway code doesn't attempt to delete existing bridges or ports. If a config change alters a mapping, the old mappings are left in place and the new ones are simply added via ovs. The question is whether the neutron-gateway charm should enumerate what is currently configured in ovs (for example), compare that against the charm config, and then remove the bridge/port mappings that are no longer configured.
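A minimal sketch of the enumerate-and-diff idea raised in this comment, assuming the current state has already been collected into a dict of bridge name to set of ports (for instance from the output of ovs-vsctl list-br and ovs-vsctl list-ports). The function name plan_changes and the data shapes are hypothetical; this is not the charm's code.

```python
def plan_changes(current, desired):
    """Compare two bridge -> set-of-ports mappings.

    Returns (to_add, to_remove), each a list of (bridge, port) tuples in a
    stable sorted order. Nothing is executed here; the caller decides what
    to do with the plan.
    """
    to_add, to_remove = [], []
    for bridge in sorted(set(current) | set(desired)):
        have = current.get(bridge, set())
        want = desired.get(bridge, set())
        to_add += [(bridge, port) for port in sorted(want - have)]
        to_remove += [(bridge, port) for port in sorted(have - want)]
    return to_add, to_remove


# The gateway3 situation from the report: eth3 was the old external port on
# br-ex, and the new config wants eth1 on br-ex and eth3 on br-3.
current = {"br-ex": {"eth3"}}
desired = {"br-ex": {"eth1"}, "br-3": {"eth3"}}
to_add, to_remove = plan_changes(current, desired)
print(to_add)     # [('br-3', 'eth3'), ('br-ex', 'eth1')]
print(to_remove)  # [('br-ex', 'eth3')]
```

Applying only the additions reproduces the reported failure, with eth1 and eth3 both ending up in br-ex; the to_remove list is the part the charm did not act on at the time.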

Changed in charm-neutron-gateway:
status: New → Confirmed
Revision history for this message
Frode Nordahl (fnordahl) wrote :
Changed in charm-deployment-guide:
status: In Progress → Fix Committed
David Ames (thedac)
Changed in charm-neutron-gateway:
milestone: 19.04 → 19.07
Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Having discussed this bug in the OpenStack team, we've come to the following conclusion:

"Re-write the config-changed hook to reflect the options

i.e. delete bridge mappings/ports that no longer exist and add port mappings that now do exist. This would mean that any additional configuration done directly on the unit (for example by installers to create networking situations that the charm config doesn't handle) would be deleted/broken on the next config-changed hook for any config item.

This ensures that the charm is in full control of the OVS bridges, and that any tinkering is transient and will potentially be scrubbed; OVS is sufficiently verbose in the data you can get to be able to manage this effectively - we could even add extra data to the ports we add so it makes it easier to discover what the charm did vs anything done outside of the charm operations."
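The idea of adding extra data to ports so the charm can tell its own work apart from manual changes could look roughly like the sketch below. It assumes the marker is stored in the Port record's external_ids column, which ovs-vsctl can set with its generic set command. The key and value used here (managed-by, charm-neutron-gateway) and the function names are hypothetical and may not match what the charm eventually did.

```python
MARKER_KEY = "managed-by"               # hypothetical marker key
MARKER_VALUE = "charm-neutron-gateway"  # hypothetical marker value


def mark_port_cmd(port):
    """Build (but do not run) the ovs-vsctl command that tags a port."""
    marker = "external_ids:{}={}".format(MARKER_KEY, MARKER_VALUE)
    return ["ovs-vsctl", "set", "Port", port, marker]


def charm_managed(ports):
    """Given port name -> external_ids dict, return the tagged port names."""
    return sorted(
        name for name, ids in ports.items()
        if ids.get(MARKER_KEY) == MARKER_VALUE
    )


# Example state: eth3 was added by the charm, eth9 by hand on the unit.
ports = {
    "eth3": {MARKER_KEY: MARKER_VALUE},
    "eth9": {},
}
print(mark_port_cmd("eth1"))
print(charm_managed(ports))  # ['eth3']
```

Restricting cleanup to tagged ports would be a softer variant of the "charm is in full control" position quoted above, since untagged ports added by hand would be left alone.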

Changed in charm-neutron-gateway:
milestone: 19.07 → 19.10
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-neutron-gateway (master)

Fix proposed to branch: master
Review: https://review.opendev.org/673849

Changed in charm-neutron-gateway:
status: Confirmed → In Progress
David Ames (thedac)
Changed in charm-neutron-gateway:
milestone: 19.10 → 20.01
Changed in charm-deployment-guide:
status: Fix Committed → Fix Released
James Page (james-page)
Changed in charm-neutron-gateway:
milestone: 20.01 → 20.05
Changed in charm-neutron-gateway:
assignee: Alex Kavanagh (ajkavanagh) → Aurelien Lourot (aurelien-lourot)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/717074

Revision history for this message
Ryan Beisner (1chb1n) wrote :

To completely address the issue, I think work also needs to be done on the neutron-openvswitch charm for scenarios where neutron-gateway is not used.

https://opendev.org/openstack/charm-neutron-openvswitch/src/commit/677a31b95ecc261446b92fd2608c34f08061d6aa/config.yaml#L103-L114

David Ames (thedac)
Changed in charm-neutron-gateway:
milestone: 20.05 → 20.08
James Page (james-page)
Changed in charm-neutron-gateway:
milestone: 20.08 → none
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-neutron-gateway (master)

Reviewed: https://review.opendev.org/717074
Committed: https://git.openstack.org/cgit/openstack/charm-neutron-gateway/commit/?id=bbc621edca348c2d0a9061bffb6d65f823b236c1
Submitter: Zuul
Branch: master

commit bbc621edca348c2d0a9061bffb6d65f823b236c1
Author: Aurelien Lourot <email address hidden>
Date: Mon Mar 30 13:19:07 2020 +0200

    Mark OVS bridges and ports as managed by charm-neutron-gateway

    This patchset updates the configure_ovs() function in
    hooks/neutron_utils.py such that ports and bridges in OVS are marked as
    being managed by this charm. This will allow us to clean up obsolete
    managed bridges and ports in a later patchset. (On configuration change
    new ports and bridges might be created and former ones might become
    obsolete.)

    This patchset also fully deprecates the 'ext-port' config option such
    that if both 'data-port' and 'ext-port' config options are set, the unit
    is blocked. The README and config.yaml are updated to reflect this
    change.

    This patchset also fixes and removes a few dead links.

    Relies on a charm-helpers version containing these patchsets:
    https://github.com/juju/charm-helpers/pull/443
    https://github.com/juju/charm-helpers/pull/447
    https://github.com/juju/charm-helpers/pull/449

    Related documentation:
    * Deployment guide / Upgrades / Known issues: https://review.opendev.org/630290
    * Release notes: https://review.opendev.org/742660

    Change-Id: I8b459135d131e16865de40ff3eae16ea3bc7195e
    Partial-Bug: #1809190
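
The blocking rule described in this commit message (the unit is blocked when both 'data-port' and 'ext-port' are set) can be sketched as a simple check over the charm config. The function name assess_port_config and the status messages are hypothetical; the actual charm code and wording may differ.

```python
def assess_port_config(config):
    """Return a (state, message) tuple for the given charm config dict.

    An option counts as set when it is present and non-empty.
    """
    if config.get("data-port") and config.get("ext-port"):
        return ("blocked", "data-port and ext-port are both set; unset ext-port")
    return ("active", "port configuration is consistent")


print(assess_port_config({"data-port": "br-ex:eth1", "ext-port": "eth3"}))
print(assess_port_config({"data-port": "br-ex:eth1", "ext-port": ""}))
```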

Changed in charm-neutron-gateway:
milestone: none → 20.08
milestone: 20.08 → none
milestone: none → 20.10
David Ames (thedac)
Changed in charm-neutron-gateway:
milestone: 20.10 → 21.01
Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

A lot of work has been done on the neutron charm code related to this bug. Could you please indicate whether it is still a problem? Setting to Incomplete to get a reminder in 60 days. Thanks.

Changed in charm-neutron-gateway:
status: In Progress → Incomplete
assignee: Aurelien Lourot (aurelien-lourot) → nobody
Revision history for this message
Billy Olsen (billy-olsen) wrote :

With the updated documentation and the various changes in the neutron code, I'm removing this from field-high. The bug will not expire due to the multiple tasks at this point in time, but if there continue to be issues here please reopen on the charm-neutron-gateway with appropriate logs and commentary.

David Ames (thedac)
Changed in charm-neutron-gateway:
milestone: 21.01 → none
Revision history for this message
Drew Freiberger (afreiberger) wrote :

I've opened a new bug that is related to this issue. lp#1915967

I feel the documentation update addresses the subject of this bug, but there is a separate bug once you have switched to data-port: updating the data-port mappings does not remove the old data-port interfaces from the bridge before adding the new ones.

Revision history for this message
Aurelien Lourot (aurelien-lourot) wrote :

Marking ports/bridges as managed by us in charm-neutron-openvswitch (as it was done in charm-neutron-gateway) is now tracked under lp:1917025.

The actual removal of obsolete ports/bridges is now tracked under lp:1915967 as mentioned above. Closing this bug.

Changed in charm-neutron-gateway:
status: Incomplete → Won't Fix
status: Won't Fix → Invalid
Changed in charm-neutron-openvswitch:
status: New → Invalid
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on charm-neutron-gateway (master)

Change abandoned by "James Page <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/charm-neutron-gateway/+/673849
Reason: This review is > 12 weeks without comment and currently blocked by a core reviewer with a -2. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and contacting the reviewer with the -2 on this review to ensure you address their concerns.
