l3 agent external_network_bridge broken with ovs

Bug #1799737 reported by Junien Fridrick
This bug affects 2 people
Affects           Status     Importance  Assigned to  Milestone
neutron           Confirmed  High        Unassigned
neutron (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Hi,

I'm running Queens on Xenial. The following commit introduced a regression: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2b1d413ee90dfe2e9ae41c35ab37253df53fc6cd (the fix for bug 1767422).

The following call (in router_info.py) is wrong:

self.driver.remove_vlan_tag(self.agent_conf.external_network_bridge, interface_name)

In this call, interface_name is something like "qg-abcdefg-12", but remove_vlan_tag() expects the name of a tap interface. This results in the following log:

2018-10-24 00:28:40.880 23623 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn command(idx=0): DbClearCommand(column=tag, table=Port, record=qg-0410dbf1-51) do_commit /usr/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
2018-10-24 00:28:40.881 23623 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Transaction aborted do_commit /usr/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:112

Sadly, the cause of the "Transaction aborted" is hidden (in https://github.com/openstack/ovsdbapp/blob/master/ovsdbapp/backend/ovs_idl/transaction.py#L87). If I print the exception, I get the following:

2018-10-24 00:28:40.881 23623 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] EXCEPTION Cannot find Port with name=qg-0410dbf1-51 do_commit /usr/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:88

Checking the OVS database reveals that Port "names" are the tap interfaces, not the qg- interfaces. Because the tag is never cleared, the ports stay in VLAN 4095, which basically makes the network unusable. Using the following call fixes my problem:

self.driver.remove_vlan_tag(self.agent_conf.external_network_bridge, self.driver._get_tap_name(interface_name, prefix=EXTERNAL_DEV_PREFIX))
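
The name mismatch can be illustrated with a small standalone sketch. This is not neutron's code: tap_name_for is a hypothetical helper, and it assumes (based on the OVS database observation above) that the OVS port carries the device name with its "qg-" prefix swapped for "tap".

```python
EXTERNAL_DEV_PREFIX = "qg-"  # prefix of the router gateway device, as in the report
TAP_DEVICE_PREFIX = "tap"    # assumed prefix of the matching OVS port name


def tap_name_for(dev_name, prefix=EXTERNAL_DEV_PREFIX):
    """Return the assumed OVS port name for a namespace device name.

    Hypothetical helper: swaps the leading device prefix for the tap
    prefix and leaves names without that prefix unchanged.
    """
    if dev_name.startswith(prefix):
        return TAP_DEVICE_PREFIX + dev_name[len(prefix):]
    return dev_name


# Under the stated assumption, the port to look up for the device in the log:
ovs_port = tap_name_for("qg-0410dbf1-51")
```

Under that assumption, a lookup by "qg-0410dbf1-51" fails while a lookup by the translated name would find the port, which matches the exception shown above.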

By the way, it would be nice to log the exception caught in https://github.com/openstack/ovsdbapp/blob/master/ovsdbapp/backend/ovs_idl/transaction.py#L87.

Thanks!

For reference:

$ dpkg -l|grep neutron
ii neutron-common 2:12.0.3-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - common
ii neutron-dhcp-agent 2:12.0.3-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - DHCP agent
ii neutron-l3-agent 2:12.0.3-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - l3 agent
ii neutron-lbaas-common 2:12.0.0-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - common
ii neutron-lbaasv2-agent 2:12.0.0-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - LBaaSv2 agent
ii neutron-metadata-agent 2:12.0.3-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - metadata agent
ii neutron-metering-agent 2:12.0.3-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - metering agent
ii neutron-openvswitch-agent 2:12.0.3-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - Open vSwitch plugin agent
ii python-neutron 2:12.0.3-0ubuntu1~cloud0 all Neutron is a virtual network service for Openstack - Python library
ii python-neutron-fwaas 1:12.0.0-0ubuntu1~cloud0 all Firewall-as-a-Service driver for OpenStack Neutron
ii python-neutron-lbaas 2:12.0.0-0ubuntu1~cloud0 all Loadbalancer-as-a-Service driver for OpenStack Neutron
ii python-neutron-lib 1.13.0-0ubuntu1~cloud0 all Neutron shared routines and utilities - Python 2.7
ii python-neutronclient 1:6.7.0-0ubuntu1~cloud0 all client API library for Neutron - Python 2.7

Albert Damen (albrt) wrote :

I guess you upgraded from an original neutron 8 config to Queens. In Queens, external_network_bridge has been deprecated and should not be used (using it will result in incorrect port statuses).

What happens if you use "external_network_bridge = "?

Junien Fridrick (axino) wrote :

This cloud was indeed upgraded from Pike and is deployed with Juju charms. Said charms still offer the option to set external_network_bridge, although the option is clearly marked as "deprecated".

In my book, "deprecated" doesn't mean "completely broken" though...

If I use "external_network_bridge = ", networking is likely to break since it's not configured elsewhere.

Corey Bryant (corey.bryant) wrote :

Hi Junien,

Thanks for reporting this and making Ubuntu better. I've added the upstream neutron project to this bug as this doesn't appear to be limited to the Ubuntu package. Does the charm also need a fix for this? If so we should add the charm project to this bug.

Thanks,
Corey

Junien Fridrick (axino) wrote :

The charm appears to do the right thing: it marks the feature as "deprecated" but properly configures it whenever asked.

tags: added: ovs
tags: added: l3-ipam-dhcp
Changed in neutron:
importance: Undecided → High
status: New → Confirmed
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in neutron (Ubuntu):
status: New → Confirmed
Slawek Kaplonski (slaweq) wrote :

Hi,

Since Stein, there is no external_network_bridge config option at all. Is this still an issue? Maybe it is still a problem in some stable releases (which ones)?

Edward Hope-Morley (hopem) wrote :

@slaweq the bug description says this issue was observed in Queens, which is currently under Extended Maintenance, so it is presumably still eligible for fixes if there is sufficient consensus on their criticality and enough people to review. We also need to consider upgrades from Q -> R -> S where people are still using this config.
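
To get a feel for which deployments would be affected before such an upgrade, a check along these lines could flag agent configs that still set the legacy option. This is only a sketch: uses_legacy_external_bridge is a hypothetical helper, the [DEFAULT] section placement is an assumption, and "br-ex" is just an example value.

```python
import configparser


def uses_legacy_external_bridge(ini_text):
    """Return True if external_network_bridge is set to a non-empty value.

    Hypothetical helper; assumes the option lives in the [DEFAULT] section.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # An absent option and an explicitly empty one both count as "not used".
    value = parser.defaults().get("external_network_bridge", "")
    return bool(value.strip())


legacy = "[DEFAULT]\nexternal_network_bridge = br-ex\n"   # example value only
cleared = "[DEFAULT]\nexternal_network_bridge = \n"
print(uses_legacy_external_bridge(legacy), uses_legacy_external_bridge(cleared))
```

An empty value is treated the same as an absent one here, mirroring the "external_network_bridge = " suggestion earlier in the thread.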

Edward Hope-Morley (hopem) wrote :

@axino I assume the environment you have that is using external_network_bridge/external_network_id is quite old and was originally deployed with a version older than Queens? Using these options to configure external networks is deprecated, and since at least Juno we have used bridge_mappings for this purpose (which also allows more than one external network). There is an annoying quirk here though (and perhaps this is why you have not switched): with the old approach the network will likely not have a provider name (in the db), so migrating it as-is to a bridge_mappings type config will break the network (unless perhaps one can be set manually in the database).

Edward Hope-Morley (hopem) wrote :

Does feel like the code change in https://review.opendev.org/#/c/564825/10/neutron/agent/l3/router_info.py could be reverted though since it only affects the legacy config and is also breaking it.

Junien Fridrick (axino) wrote :

@hopem I think it got installed as Pike but with old charm config options used.

I can't argue against "don't use external_network_bridge", and I guess no one uses that in recent OpenStack releases, so no one is getting hit by this bug?

If anything, we should log the exception in https://github.com/openstack/ovsdbapp/blob/8275af1726cb2079afd8f5230377e064221ebcf3/ovsdbapp/backend/ovs_idl/transaction.py#L90
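
As a rough illustration of what logging the cause could look like (a standalone sketch, not ovsdbapp's code; run_commands and clear_tag_on_missing_port are hypothetical names):

```python
import logging

LOG = logging.getLogger("txn_sketch")


def run_commands(commands):
    """Run each callable in order; log the cause if any of them fails.

    Hypothetical stand-in for a transaction commit. Returns True when
    every command succeeded and False when the run was aborted.
    """
    try:
        for command in commands:
            command()
    except Exception as exc:
        # Record the reason (with traceback) before reporting the abort,
        # so the log shows more than a bare "Transaction aborted".
        LOG.exception("Transaction aborted: %s", exc)
        return False
    return True


def clear_tag_on_missing_port():
    # Simulates the failure seen in the report.
    raise RuntimeError("Cannot find Port with name=qg-0410dbf1-51")
```

Calling run_commands([clear_tag_on_missing_port]) returns False and leaves the "Cannot find Port" message in the log, instead of only the abort notice.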
