Code to map neutron openvswitch bridge to existing linux (juju) bridge breaks networking after reboot by adding interfaces.d/* include back
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Charm Helpers | New | Undecided | Unassigned |
OpenStack Neutron Gateway Charm | Triaged | High | Unassigned |
OpenStack Nova Compute Charm | Triaged | High | Unassigned |
Bug Description
When mapping a linux bridge (typically because juju wants to attach containers to that interface) to an openvswitch bridge for neutron (e.g. data-port=
The reason is the following sequence of events:
- When juju/MAAS first deploys the machine, /etc/network/
- cloud-init writes the various network configurations to /etc/network/
- When juju deploys an LXD container, it takes the content of the 50-cloud-init.cfg file and then moves it into /etc/network/
- juju does NOT remove /etc/network/
- When configured with data-port=
- It then re-adds the "source /etc/network/
This breaks networking after a reboot, because the IP address of ens3 ends up configured on BOTH ens3 and br-ens3. The exact failure mechanism is not 100% obvious: I initially suspected rp_filter, but it does not seem to be at fault. ARP/neighbour discovery fails on ens3 and works on br-ens3 for some reason. In any case, the configuration is clearly incorrect.
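The duplicate-address condition described above can be checked mechanically. The following is a minimal sketch, not part of any charm or of juju: it assumes the effective configuration (the main interfaces file plus anything it sources) has already been concatenated into one string, and it only understands `auto`, `iface` and `address` lines. The sample addresses and the function names are hypothetical.

```python
from collections import defaultdict


def addresses_by_interface(config_text):
    """Map each static address to the interfaces whose stanzas declare it."""
    owners = defaultdict(list)
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields[0] == "iface" and len(fields) >= 2:
            current = fields[1]
        elif fields[0] == "auto" or fields[0].startswith("allow-"):
            # These lines start a new block, so the previous stanza has ended.
            current = None
        elif fields[0] == "address" and current and len(fields) >= 2:
            # Drop any /prefix so 10.0.0.10/24 and 10.0.0.10 compare equal.
            owners[fields[1].split("/")[0]].append(current)
    return dict(owners)


def duplicated_addresses(config_text):
    """Return only the addresses configured on more than one interface."""
    return {
        addr: ifaces
        for addr, ifaces in addresses_by_interface(config_text).items()
        if len(set(ifaces)) > 1
    }


# Hypothetical merged configuration: a bridge stanza followed by a leftover
# static stanza for the same physical interface (addresses are made up).
SAMPLE = """
auto ens3
iface ens3 inet manual

auto br-ens3
iface br-ens3 inet static
    address 10.0.0.10/24
    bridge_ports ens3

auto ens3
iface ens3 inet static
    address 10.0.0.10/24
"""

if __name__ == "__main__":
    for addr, ifaces in duplicated_addresses(SAMPLE).items():
        print(f"{addr} is configured on: {', '.join(ifaces)}")
```

Run against a real host, the input would be the text of the interfaces file together with every file it includes; resolving those includes is deliberately left out of this sketch.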
It does not break before the reboot, presumably because ifdown for ens3 and ifup for br-ens3 are executed explicitly instead of ifup -a, or through some similar sequence of events. I didn't check exactly why, but it is likely mostly irrelevant.
The MTU is also not set on the veth pair, which means the linux bridge MTU will drop to 1500 if it was set to 9000. The veth pair itself and the openvswitch bridge mostly don't seem to care about MTU and will transmit packets anyway, but linux bridges will drop packets larger than their MTU. A separate bug for that is here: https:/
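To illustrate why one port with a default MTU drags the whole bridge down, here is a small sketch. It assumes the common Linux bridge behaviour of adopting the smallest MTU among its ports when no MTU has been pinned explicitly on the bridge. The port names and helper names are hypothetical.

```python
def bridge_mtu(port_mtus):
    """Return the MTU a bridge would adopt if it follows its smallest port.

    port_mtus maps port name -> MTU. Returns None for a bridge with no ports.
    """
    if not port_mtus:
        return None
    return min(port_mtus.values())


def ports_limiting_mtu(port_mtus, wanted):
    """List the ports whose MTU is below the MTU we want on the bridge."""
    return sorted(name for name, mtu in port_mtus.items() if mtu < wanted)


if __name__ == "__main__":
    # Hypothetical bridge: the physical NIC uses jumbo frames, while the veth
    # end attached to the bridge was left at the default of 1500.
    ports = {"ens3": 9000, "veth-br-ens3": 1500}
    print("bridge MTU:", bridge_mtu(ports))
    print("ports to fix:", ports_limiting_mtu(ports, wanted=9000))
```

On a live system the per-port values could be read from `/sys/class/net/<port>/mtu`; that lookup is omitted here to keep the sketch free of host dependencies.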
I would argue that in some ways this is a bug in juju, as other things may rely on interfaces.d. Juju probably should either rewrite 50-cloud-init.cfg or not remove the interfaces.d include (or it could leave the include in place but remove 50-cloud-init.cfg). However, this behavior exists in multiple juju versions, so we may need to fix the charm as a workaround and fix juju as the long-term solution.
Ultimately this stems from multiple parties trying to configure the network. It would be even better if MAAS configured a linux bridge from the start, but for various reasons that is difficult to predict, and users not using lxd may not want the overhead of a bridge being configured for no reason.
tags: | added: cpe-onsite |
tags: | added: sts |
Changed in charm-neutron-gateway: | |
status: | New → Triaged |
Changed in charm-nova-compute: | |
status: | New → Triaged |
Changed in charm-neutron-gateway: | |
importance: | Undecided → High |
Changed in charm-nova-compute: | |
importance: | Undecided → High |
tags: | added: cold-start |
It seems my guess about rp_filter was wrong: setting it to 0 has no effect. For whatever reason, the outbound traffic is not transmitted. I am removing that from the description.