LXD no longer activates all interfaces on initial deploy when using MAAS 2.0 RC3 and Juju 2.0-beta13

Bug #1608105 reported by Ben
This bug affects 3 people
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

When using MAAS 2.0 RC3 and Juju 2.0-beta13, deploying charms to LXD on MAAS-controlled machines brings up only eth0. The ENI (/etc/network/interfaces) inside the container is perfect, and if I issue a sudo lxc restart on the Juju container, all of the interfaces are then activated. This is new in beta13.

Fresh deploy of a charm
juju deploy charm --to lxd:0

then from machine 0

ubuntu@controller:~$ sudo lxc list
+---------------------+---------+------------------+------+------------+-----------+
|        NAME         |  STATE  |       IPV4       | IPV6 |    TYPE    | SNAPSHOTS |
+---------------------+---------+------------------+------+------------+-----------+
| juju-970ebd-0-lxd-0 | RUNNING | 10.1.0.2 (eth0)  |      | PERSISTENT | 0         |
+---------------------+---------+------------------+------+------------+-----------+

ubuntu@controller:~$ sudo lxc restart juju-970ebd-0-lxd-0
ubuntu@controller:~$ sudo lxc list
+---------------------+---------+------------------+------+------------+-----------+
|        NAME         |  STATE  |       IPV4       | IPV6 |    TYPE    | SNAPSHOTS |
+---------------------+---------+------------------+------+------------+-----------+
| juju-970ebd-0-lxd-0 | RUNNING | 10.1.0.2 (eth0)  |      | PERSISTENT | 0         |
|                     |         | 10.10.0.2 (eth1) |      |            |           |
|                     |         | 10.20.0.2 (eth2) |      |            |           |
|                     |         | 10.2.0.2 (eth3)  |      |            |           |
+---------------------+---------+------------------+------+------------+-----------+

No changes made anywhere, just a container restart, and it works. This makes deploying bundles and large scripts difficult, since the networks are not activated and subordinate services fail to deploy until you manually restart the containers.
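
As a stop-gap, a loop along these lines can restart every Juju container on an affected host in one go. This is just a sketch based on the lxc list output above; it assumes the container names keep the juju- prefix and parses the plain table output, so adjust to taste:

# restart every LXD container whose name starts with "juju-" on this host
for c in $(sudo lxc list | awk '/^\| juju-/ {print $2}'); do
    sudo lxc restart "$c"
done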

3 physical machines to recreate the issue:

1 - MAAS
2 - Juju controller
3 - Physical server for services

Changed in juju-core:
status: New → Triaged
importance: Undecided → Critical
milestone: none → 2.0-beta14
Sandor Zeestraten (szeestraten) wrote :

I have the same issue on MAAS 2.0rc3 and Juju 2.0-beta13 when deploying the openstack-base bundle.

The workaround of restarting the containers fixes the issue, but it is a hassle.

Changed in juju-core:
importance: Critical → High
Sandor Zeestraten (szeestraten) wrote :

I managed to reproduce the issue in Juju 2.0-beta12 with MAAS 2.0rc3, so I am not sure where the bug was introduced (in Juju or in MAAS).

Ben, could you perhaps try to reproduce it on Juju 2.0-beta12 as well?

P.S. I have attached some info from one of the affected machines. It lists the interfaces of the host machine and shows the IPs and info of a container fresh from a juju deploy, then the new IPs of the container after the lxc restart. Please let me know if the Juju log from the machine is needed.
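
For reference, the attached info can be reproduced with commands along these lines (a sketch; <container> stands for whatever name sudo lxc list reports on the affected host):

ip a                              # interfaces on the host machine
sudo lxc exec <container> ip a    # container interfaces fresh from juju deploy
sudo lxc restart <container>
sudo lxc exec <container> ip a    # container interfaces after the restart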

Anastasia (anastasia-macmood) wrote :

Juju log from the machine will certainly help \o/

Sandor Zeestraten (szeestraten) wrote :

See attachment for juju log from the machine in question.

Ben (bjenkins-x) wrote :

Some extra info.

I could not reproduce this issue with Juju beta12 and MAAS rc3.

If I deploy Trusty charms to LXD they seem to work fine on Juju beta13 and MAAS rc3. I have only tried a handful, but MongoDB is a good example of one that works.

I have not had a single Xenial charm deploy where I did not have to manually restart the container.

If this is a MAAS issue, are you (the Juju team) close enough with the MAAS team to raise it with them? I am afraid that if I say Juju charms don't deploy correctly, they will send me back here.

Dimiter Naydenov (dimitern) wrote :

If restarting helps, I'm pretty sure this is due to a known issue with LXD and systemd (will try to find the reference). In the meantime I'm trying to reproduce it locally with:

MAAS Version 2.0.0 (rc3+bzr5180)
$ juju version
2.0-beta14-xenial-amd64

on dual-NIC NUC nodes with multiple VLANs on each NIC. I will post an update on how it goes.

Dimiter Naydenov (dimitern) wrote :

I can't reproduce this :/

Steps I used:
$ juju bootstrap maashw maashw --upload-tools --to node-21
$ juju switch controller
$ juju deploy ubuntu --to lxd:0
$ juju ssh 0 -- 'cat /etc/network/interfaces'
auto lo
iface lo inet loopback
    dns-nameservers 10.14.0.1
    dns-search maas

auto eth0
iface eth0 inet manual
    mtu 1500

auto br-eth0
iface br-eth0 inet static
    address 10.14.1.121/20
    gateway 10.14.0.1
    bridge_ports eth0

auto eth1
iface eth1 inet manual
    mtu 1500

auto br-eth1
iface br-eth1 inet static
    address 10.14.2.121/20
    bridge_ports eth1

auto eth0.100
iface eth0.100 inet manual
    mtu 1500
    vlan-raw-device eth0
    vlan_id 100

auto br-eth0.100
iface br-eth0.100 inet static
    address 10.100.1.121/20
    bridge_ports eth0.100
    dns-nameservers 10.100.0.1

auto eth0.150
iface eth0.150 inet manual
    mtu 1500
    vlan-raw-device eth0
    vlan_id 150

auto br-eth0.150
iface br-eth0.150 inet static
    address 10.150.1.121/20
    bridge_ports eth0.150

auto eth0.50
iface eth0.50 inet manual
    mtu 1500
    vlan-raw-device eth0
    vlan_id 50

auto br-eth0.50
iface br-eth0.50 inet static
    address 10.50.1.121/20
    bridge_ports eth0.50

auto eth1.200
iface eth1.200 inet manual
    mtu 1500
    vlan-raw-device eth1
    vlan_id 200

auto br-eth1.200
iface br-eth1.200 inet static
    address 10.200.1.121/20
    bridge_ports eth1.200

auto eth1.250
iface eth1.250 inet manual
    mtu 1500
    vlan-raw-device eth1
    vlan_id 250

auto br-eth1.250
iface br-eth1.250 inet static
    address 10.250.1.121/20
    bridge_ports eth1.250
    dns-nameservers 10.250.0.1

auto eth1.30
iface eth1.30 inet manual
    mtu 1500
    vlan-raw-device eth1
    vlan_id 30

auto br-eth1.30
iface br-eth1.30 inet static
    address 10.30.1.121/20
    bridge_ports eth1.30

source /etc/network/interfaces.d/*.cfg
Connection to 10.14.1.121 closed.

$ juju ssh 0 -- 'ifconfig -a'
br-eth0 Link encap:Ethernet HWaddr b8:ae:ed:78:17:67
          inet addr:10.14.1.121 Bcast:10.14.15.255 Mask:255.255.240.0
          inet6 addr: fe80::baae:edff:fe78:1767/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:97006 errors:0 dropped:363 overruns:0 frame:0
          TX packets:90006 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:237735369 (237.7 MB) TX bytes:7972787 (7.9 MB)

br-eth1 Link encap:Ethernet HWaddr 00:e1:00:00:15:bc
          inet addr:10.14.2.121 Bcast:10.14.15.255 Mask:255.255.240.0
          inet6 addr: fe80::2e1:ff:fe00:15bc/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:500 errors:0 dropped:360 overruns:0 frame:0
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:26098 (26.0 KB) TX bytes:398 (398.0 B)

br-eth0.50 Link encap:Ethernet HWaddr b8:ae:ed:78:17:67
          inet addr:10.50.1.121 Bcast:10.50.15.255 Mask:255.255.240.0
          inet6 addr: fe80::baae:edff:fe78:1767/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:82 errors:0 dropped:0 overruns:0 ...

Changed in juju-core:
status: Triaged → Incomplete
no longer affects: maas
Changed in juju-core:
milestone: 2.0-beta14 → none
Luca (l-dellefemmine) wrote :

I have a similar issue. The LXD container has no IP address if the machine uses bonded interfaces.

~$ sudo lxc list
+---------------------+---------+------+------+------------+-----------+
|        NAME         |  STATE  | IPV4 | IPV6 |    TYPE    | SNAPSHOTS |
+---------------------+---------+------+------+------------+-----------+
| juju-abc723-2-lxd-0 | RUNNING |      |      | PERSISTENT | 0         |
+---------------------+---------+------+------+------------+-----------+

~$ sudo lxc exec juju-abc723-2-lxd-0 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
34: eth5@if35: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:1b:14:8a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::216:3eff:fe1b:148a/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
36: eth1@if37: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:e2:2b:e1 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::216:3eff:fee2:2be1/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
38: eth0@if39: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:44:5b:36 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::216:3eff:fe44:5b36/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
40: eth2@if41: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:a1:71:27 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::216:3eff:fea1:7127/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
42: eth3@if43: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:c4:82:3f brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::216:3eff:fec4:823f/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
44: eth4@if45: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:aa:b4:99 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::216:3eff:feaa:b499/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever

~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 94:57:a5:5c:e7:38 brd ff:ff:ff:ff:ff:ff
3: eno49: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond1 state DOWN group default qlen 1000
    link/ether 5...

Changed in juju-core:
status: Incomplete → Triaged
milestone: none → 2.0.0
Dimiter Naydenov (dimitern) wrote :

Luca, can you please paste (scrubbed of secrets, of course) /var/log/cloud-init-output.log and /var/log/juju/machine-0.log from that node after juju bootstrap has finished?
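
For example, something like this should capture both files from the node (a sketch; adjust the machine number to the affected node and scrub before attaching):

juju ssh 0 -- 'sudo cat /var/log/cloud-init-output.log' > cloud-init-output.log
juju ssh 0 -- 'sudo cat /var/log/juju/machine-0.log' > machine-0.log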

Ben (bjenkins-x) wrote :

I installed Juju beta14 and MAAS rc3 this morning and redid all of my tests. LXD containers on physical servers now get all of their interfaces assigned correctly, and there is no need to restart them.

Richard Harding (rharding) wrote :

Thanks for the confirmation, Ben. Marking this Fix Released unless we get fresh reproductions with beta14.

Changed in juju-core:
status: Triaged → Fix Released
Sandor Zeestraten (szeestraten) wrote :

Just wanted to chime in and say that beta14 seemed to fix the issue on our end too, both for Trusty and Xenial charms. Thanks.

Changed in juju-core:
milestone: 2.0.0 → 2.0-beta15
Luca (l-dellefemmine) wrote :

Hi Dimiter,
I attached the logs you asked for.

affects: juju-core → juju
Changed in juju:
milestone: 2.0-beta15 → none
milestone: none → 2.0-beta15