On Tue, May 9, 2017 at 10:32 AM, Dimitri John Ledkov <<email address hidden>
> wrote:
> Maybe this is easier to read.
>
> root@xnox-iad-nr5:~# journalctl -o short-monotonic -u <email address hidden>
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.740201] xnox-iad-nr5 systemd[1]: Started ifup for bond0.
> [ 6.750333] xnox-iad-nr5 sh[1184]: Waiting for a slave to join bond0
> (will timeout after 60s)
> [ 6.987241] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for
> interface bond0 in pre-up phase
> [ 6.987341] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for
> parent interface bond0 in pre-up phase
> [ 6.987425] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for
> parent interface bond0 in pre-up phase
> root@xnox-iad-nr5:~# journalctl -o short-monotonic -u
> <email address hidden>
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.755723] xnox-iad-nr5 systemd[1]: Started ifup for bond0.101.
> [ 6.757056] xnox-iad-nr5 sh[1286]: ifup: waiting for lock on
> /run/network/ifstate.bond0
> [ 7.227572] xnox-iad-nr5 sh[1286]: Set name-type for VLAN subsystem.
> Should be visible in /proc/net/vlan/config
> root@xnox-iad-nr5:~# journalctl -o short-monotonic -u
> <email address hidden>
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.760568] xnox-iad-nr5 systemd[1]: Started ifup for bond0.401.
> [ 6.761920] xnox-iad-nr5 sh[1290]: ifup: waiting for lock on
> /run/network/ifstate.bond0
> [ 7.197983] xnox-iad-nr5 sh[1290]: Set name-type for VLAN subsystem.
> Should be visible in /proc/net/vlan/config
> root@xnox-iad-nr5:~# journalctl -o short-monotonic -u networking.service
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.645323] xnox-iad-nr5 systemd[1]: Starting Raise network
> interfaces...
> [ 6.692530] xnox-iad-nr5 ifup[992]: Waiting for bonding kernel module
> to be ready (will timeout after 5s)
> [ 6.693104] xnox-iad-nr5 ifup[992]: Waiting for bond master bond0 to be
> ready
> [ 7.221867] xnox-iad-nr5 ifup[992]: /sbin/ifup: waiting for lock on
> /run/network/ifstate.bond0
> [ 7.263179] xnox-iad-nr5 systemd[1]: Started Raise network interfaces.
>
>
What is the status/timestamp of the physical devices (enspf90 and 91) ?
Note that networking, <email address hidden>, and <email address hidden> all hit locks
that are held by bond0.
> Note that <email address hidden> beats networking.service to acquire the lock.
>
hrm, I didn't think we got ifup events for non-physical devices; this is
certainly problematic;
the design, IIUC, is for physical devices to get ifup@.service triggers via
udev events; see the the BindsTo=sys-subsystem-net-devices-%i.device
The result in this boot is that networking.service (which starts at 6.64
seconds calls ifup -a (pid 992) which is racing with
the ifup@bond0 (started at 6.74, pid 1184); so we have *two* ifups locking
bond0; I think this is the real deadlock.
I'm not sure why we're getting an ifup@bond0 unit execution, the udev
scripts are only triggered for physical devices;
Looks like when the module is loaded (modprobe bonding) that we get an
event for bond0 ... which may be triggering
the hotplug udev-rule which queues a ifup@bond0 job...
This is strange (to me):
# modprobe -vr bonding
rmmod bonding
root@x1:~# ifconfig -a
ens3 Link encap:Ethernet HWaddr 52:54:00:87:54:74
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:2 errors:0 dropped:0 overruns:0 frame:0
TX packets:2 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000
RX bytes:104 (104.0 B) TX bytes:184 (184.0 B)
On Tue, May 9, 2017 at 10:32 AM, Dimitri John Ledkov <<email address hidden>
> wrote:
> Maybe this is easier to read. iad-nr5: ~# journalctl -o short-monotonic -u <email address hidden> iad-nr5: ~# journalctl -o short-monotonic -u ifstate. bond0 vlan/config iad-nr5: ~# journalctl -o short-monotonic -u ifstate. bond0 vlan/config iad-nr5: ~# journalctl -o short-monotonic -u networking.service ifstate. bond0
>
> root@xnox-
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.740201] xnox-iad-nr5 systemd[1]: Started ifup for bond0.
> [ 6.750333] xnox-iad-nr5 sh[1184]: Waiting for a slave to join bond0
> (will timeout after 60s)
> [ 6.987241] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for
> interface bond0 in pre-up phase
> [ 6.987341] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for
> parent interface bond0 in pre-up phase
> [ 6.987425] xnox-iad-nr5 sh[1184]: ifquery: recursion detected for
> parent interface bond0 in pre-up phase
> root@xnox-
> <email address hidden>
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.755723] xnox-iad-nr5 systemd[1]: Started ifup for bond0.101.
> [ 6.757056] xnox-iad-nr5 sh[1286]: ifup: waiting for lock on
> /run/network/
> [ 7.227572] xnox-iad-nr5 sh[1286]: Set name-type for VLAN subsystem.
> Should be visible in /proc/net/
> root@xnox-
> <email address hidden>
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.760568] xnox-iad-nr5 systemd[1]: Started ifup for bond0.401.
> [ 6.761920] xnox-iad-nr5 sh[1290]: ifup: waiting for lock on
> /run/network/
> [ 7.197983] xnox-iad-nr5 sh[1290]: Set name-type for VLAN subsystem.
> Should be visible in /proc/net/
> root@xnox-
> -- Logs begin at Tue 2017-05-09 10:57:18 UTC, end at Tue 2017-05-09
> 15:22:34 UTC. --
> [ 6.645323] xnox-iad-nr5 systemd[1]: Starting Raise network
> interfaces...
> [ 6.692530] xnox-iad-nr5 ifup[992]: Waiting for bonding kernel module
> to be ready (will timeout after 5s)
> [ 6.693104] xnox-iad-nr5 ifup[992]: Waiting for bond master bond0 to be
> ready
> [ 7.221867] xnox-iad-nr5 ifup[992]: /sbin/ifup: waiting for lock on
> /run/network/
> [ 7.263179] xnox-iad-nr5 systemd[1]: Started Raise network interfaces.
>
>
What is the status/timestamp of the physical devices (enspf90 and 91) ?
Note that networking, <email address hidden>, and <email address hidden> all hit locks
that are held by bond0.
> Note that <email address hidden> beats networking.service to acquire the lock.
>
hrm, I didn't think we got ifup events for non-physical devices; this is sys-subsystem- net-devices- %i.device
certainly problematic;
the design, IIUC, is for physical devices to get ifup@.service triggers via
udev events; see the the BindsTo=
The result in this boot is that networking.service (which starts at 6.64
seconds calls ifup -a (pid 992) which is racing with
the ifup@bond0 (started at 6.74, pid 1184); so we have *two* ifups locking
bond0; I think this is the real deadlock.
I'm not sure why we're getting an ifup@bond0 unit execution, the udev
scripts are only triggered for physical devices;
Looks like when the module is loaded (modprobe bonding) that we get an
event for bond0 ... which may be triggering
the hotplug udev-rule which queues a ifup@bond0 job...
This is strange (to me):
# modprobe -vr bonding
collisions: 0 txqueuelen:1000
rmmod bonding
root@x1:~# ifconfig -a
ens3 Link encap:Ethernet HWaddr 52:54:00:87:54:74
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:2 errors:0 dropped:0 overruns:0 frame:0
TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
RX bytes:104 (104.0 B) TX bytes:184 (184.0 B)
lo Link encap:Local Loopback
collisions: 0 txqueuelen:1
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:320 errors:0 dropped:0 overruns:0 frame:0
TX packets:320 errors:0 dropped:0 overruns:0 carrier:0
RX bytes:23680 (23.6 KB) TX bytes:23680 (23.6 KB)
root@x1:~# cat /etc/issue
Ubuntu 16.04.2 LTS \n \l
root@x1:~# lsb_release -rd pci0000: 00/0000: 00:03.0/ virtio0/ net/ens3 virtual/ net/lo
Description: Ubuntu 16.04.2 LTS
Release: 16.04
root@x1:~# ls -al /sys/class/net/
total 0
drwxr-xr-x 2 root root 0 May 9 16:36 .
drwxr-xr-x 63 root root 0 May 9 16:35 ..
lrwxrwxrwx 1 root root 0 May 9 16:35 ens3 ->
../../devices/
lrwxrwxrwx 1 root root 0 May 9 16:35 lo -> ../../devices/
root@x1:~# modprobe -v bonding 4.4.0-75- generic/ kernel/ drivers/ net/bonding/ bonding. ko virtual/ net/bond0 pci0000: 00/0000: 00:03.0/ virtio0/ net/ens3 virtual/ net/lo
insmod /lib/modules/
root@x1:~# ls -al /sys/class/net/
total 0
drwxr-xr-x 2 root root 0 May 9 16:36 .
drwxr-xr-x 63 root root 0 May 9 16:36 ..
lrwxrwxrwx 1 root root 0 May 9 16:36 bond0 ->
../../devices/
-rw-r--r-- 1 root root 4096 May 9 16:36 bonding_masters
lrwxrwxrwx 1 root root 0 May 9 16:36 ens3 ->
../../devices/
lrwxrwxrwx 1 root root 0 May 9 16:36 lo -> ../../devices/
root@x1:~# uname -r
4.4.0-75-generic
Why would bond0 get created when loading the bonding module?
@xnox, if you change your interfaces from bond0, to bond1 globally, does
your configuration
work ?
> -- /bugs.launchpad .net/bugs/ 1636708 /bugs.launchpad .net/ubuntu/ +source/ ifupdown/ +bug/ +subscriptions
> You received this bug notification because you are subscribed to the bug
> report.
> https:/
>
> Title:
> ifup -a does not start dependants last, causes deadlocks with
> vlans/bonding
>
> To manage notifications about this bug go to:
> https:/
> 1636708/
>