Comment 5 for bug 950662

Steve Langasek (vorlon) wrote :

So according to the contents of /run/network, all of your static networks came up quickly and as intended, and the /run/network/static-network-up-emitted directory confirms that the 'static-network-up' event should have been generated.

This should have been sufficient to stop the 'failsafe' job and avoid any messages about waiting for the network.

One thing I know could have caused boot delays recently in precise is a bug in apparmor, bug #949891, which made apparmor_parser take a very long time at boot. Since apparmor_parser is called before each network interface is brought up, this could very well account for the delay in cases when you *do* have interfaces other than lo configured statically in /etc/network/interfaces. But if you *don't* have any interfaces configured statically (as in your current /etc/network/interfaces), this shouldn't have mattered.

I have hit upon a possible race condition that would explain this. If the order of boot events is:

 net-device-up IFACE=lo
 static-network-up
 filesystem
 runlevel

the 'static-network-up' event will not stop the failsafe job, because the job hasn't started yet. The 'filesystem' event then finishes satisfying the start condition, so the job starts and stays running until 'runlevel' is emitted.

And the runlevel event is emitted at the bottom of the rc-sysinit job, only after processing scripts in /etc/rcS.d.
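
To make the race concrete, here is a minimal Python sketch of the job state under that event order. It is a toy model, not upstart's actual logic: the start condition ('filesystem' and 'net-device-up IFACE=lo') and the stop condition ('static-network-up' or 'runlevel') are assumptions inferred from the description above, and the `simulate` helper is hypothetical.

```python
# Toy model of the failsafe job (assumed conditions, not read from failsafe.conf):
#   start on: filesystem AND net-device-up IFACE=lo
#   stop on:  static-network-up OR runlevel
START_ON = {"net-device-up IFACE=lo", "filesystem"}
STOP_ON = {"static-network-up", "runlevel"}


def simulate(events, start_on=START_ON, stop_on=STOP_ON):
    """Return a list of (event, running) pairs, one per processed event."""
    seen = set()      # start events observed while the job is stopped
    running = False
    trace = []
    for ev in events:
        if running:
            if ev in stop_on:
                running = False
                seen.clear()
        else:
            # A stop event that arrives while the job is not running is
            # simply ignored; this is the crux of the race.
            if ev in start_on:
                seen.add(ev)
            if seen == set(start_on):
                running = True
        trace.append((ev, running))
    return trace


racy = simulate([
    "net-device-up IFACE=lo",
    "static-network-up",
    "filesystem",
    "runlevel",
])
for event, running in racy:
    print(f"{event:24} running={running}")
```

In this model the job is running after 'filesystem' and is only stopped by 'runlevel', even though 'static-network-up' was already emitted earlier.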

Can you show the output of 'ls -l /etc/rcS.d'?

Regardless of what this command shows, I think there's definitely one bug here in /etc/init/failsafe.conf, and possibly another in upstart itself for not resetting the status of the failsafe job when static-network-up is emitted.

The definite bug is that we're waiting for the end of /etc/init/rc-sysinit.conf to clear the message, even though nothing during rc-sysinit is related to networking. Normally rc-sysinit is fast, but when it isn't (i.e., when it takes longer than 20 seconds), this message is confusing.

Clint, I think the right thing for failsafe to do here is 'stop on static-network-up or starting rc-sysinit'. Does that sound reasonable to you? Since rc-sysinit is a task, we can't use 'started rc-sysinit', but 'starting' should be fine and eliminates the remaining risk of spurious messages here.
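
As a rough check of that proposal, the following Python sketch compares the two stop conditions against the racy event order. It rests on assumptions: the start condition is taken to be 'filesystem' and 'net-device-up IFACE=lo', 'starting rc-sysinit' is assumed to be processed after failsafe has started and before the /etc/rcS.d scripts run, and `events_while_running` is a hypothetical helper, not anything upstart provides.

```python
# Assumed racy boot order, with 'starting rc-sysinit' placed before the
# rcS.d scripts run and 'runlevel' emitted once they have finished.
RACY_ORDER = [
    "net-device-up IFACE=lo",
    "static-network-up",
    "filesystem",
    "starting rc-sysinit",
    "runlevel",
]

START_ON = {"net-device-up IFACE=lo", "filesystem"}   # assumed start condition
CURRENT_STOP = {"static-network-up", "runlevel"}      # inferred current behaviour
PROPOSED_STOP = {"static-network-up", "starting rc-sysinit"}


def events_while_running(events, start_on, stop_on):
    """List the events after which the modelled job is still running."""
    seen = set()
    running = False
    still_running = []
    for ev in events:
        if running and ev in stop_on:
            running = False
            seen.clear()
        elif not running:
            if ev in start_on:
                seen.add(ev)
            running = seen == set(start_on)
        if running:
            still_running.append(ev)
    return still_running


current = events_while_running(RACY_ORDER, START_ON, CURRENT_STOP)
proposed = events_while_running(RACY_ORDER, START_ON, PROPOSED_STOP)
print("current stop condition: ", current)
print("proposed stop condition:", proposed)
```

Under these assumptions the current condition keeps the job running across the whole of rc-sysinit, while the proposed one stops it as soon as rc-sysinit begins, before any slow /etc/rcS.d script can trigger the message.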