mountall does not honour _netdev
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
mountall (Ubuntu) | Confirmed | Undecided | Unassigned | |
Bug Description
Hi,
This is a fresh install of Ubuntu 14.04 LTS AMD64. I tried configuring a Ceph RADOS Block Device (RBD) to be mounted at boot on /var/lib/one; it contains my OpenNebula configuration and database.
The idea is that, should the machine go belly up, I'll have an up-to-date snapshot of the OpenNebula data on Ceph to mount on the new frontend machine.
/etc/ceph/rbdmap is configured, and I set up /etc/fstab with an entry:
/dev/rbd/
then rebooted. According to mount(8), _netdev is supposed to tell mountall to skip mounting this device until the network is up.
As seen from the attached snapshot, it doesn't bother to wait, and blindly tries to mount the RBD before connecting to Ceph: this will never work.
mountall seems to rely on *knowing* a list of network file systems. This means that when someone comes up with a new network file system, or uses a conventional disk file system on a remote block device, mountall's heuristic falls flat on its face, as has been demonstrated here. The same problem would exist for iSCSI, AoE, Fibre Channel, nbd and drbd devices.
Due to bug 1313497, the keyboard is non-functional. Recovery is useless as the keyboard is broken there too, and now the machine is waiting for a keypress it will never see due to that bug. A headless system would similarly have this problem.
I have two suggestions:
1. mountall should honour _netdev to decide whether to mount a device or not: this gives the user the means to manually tell mountall that the device needs network access to operate even if the filesystem looks to be local. I'd wager that if the user specified _netdev, they probably meant it and likely know better than mountall.
2. mountall should time out after a predefined period and NEVER wait indefinitely, even if the disk is local. If a disk goes missing, it is better for the machine to try booting in its degraded state, so that it can be managed remotely and raise an alarm, than to wait for someone to notice that the machine is down.
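To illustrate suggestion 1, here is a minimal Python sketch of the difference between classifying an fstab entry by file system type alone and also honouring an explicit `_netdev` option. It is not mountall's actual code. The set of known network file system types, the device path and the `ext4` type are all assumptions made up for the example; only the `/var/lib/one` mount point comes from this report.

```python
# Hypothetical list of "known" network file systems. The real list that
# mountall uses is not shown in this report.
KNOWN_NETWORK_FSTYPES = {"nfs", "nfs4", "cifs", "smbfs"}


def parse_fstab_line(line):
    """Return (device, mountpoint, fstype, options), or None for blanks/comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()
    if len(fields) < 4:
        return None
    device, mountpoint, fstype, opts = fields[:4]
    return device, mountpoint, fstype, opts.split(",")


def needs_network_by_type(fstype):
    """Heuristic only: decide from the file system type."""
    return fstype in KNOWN_NETWORK_FSTYPES


def needs_network(fstype, options):
    """Suggested rule: an explicit _netdev option always wins."""
    return "_netdev" in options or needs_network_by_type(fstype)


# Hypothetical entry: a local-looking file system on a remote block device.
# The device path is invented and is NOT the reporter's actual fstab entry.
EXAMPLE = "/dev/hypothetical-remote /var/lib/one ext4 defaults,_netdev 0 0"
_, _, example_fstype, example_options = parse_fstab_line(EXAMPLE)
type_only_verdict = needs_network_by_type(example_fstype)          # heuristic misses it
netdev_verdict = needs_network(example_fstype, example_options)    # _netdev catches it
```

With the type-only rule, the example entry looks local. With the `_netdev` rule, it is treated as network-dependent, whatever file system sits on the device.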
Unfortunately, since the machine is now effectively bricked, I can only grep proxy server logs to see which packages got installed. mountall_
While the mount(8) manpage says that _netdev causes the mount to be deferred until the network is up, this manpage was written in a bygone era when "network up" was a discrete event, which it hasn't been for a long time. The current behavior is that _netdev devices will be tried immediately on boot, and tried again each time a network interface comes up. If this doesn't give the desired results, I think this is a bug in the ceph driver - not in mountall, which has been tested with _netdev (and network filesystems) repeatedly and shown to work correctly.
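The retry behaviour described above (try once at boot, then once more each time a network interface comes up) can be modelled roughly as follows. This is an illustrative Python sketch, not mountall's implementation. `try_mount` and `interface_events` are hypothetical stand-ins for the real mount call and for whatever mechanism reports interface changes.

```python
def run_mount_attempts(try_mount, interface_events):
    """Try to mount immediately, then again on every interface-up event.

    try_mount: hypothetical callable returning True once the mount succeeds.
    interface_events: hypothetical iterable yielding one item per interface
    that comes up.
    Returns the number of attempts made, or None if every attempt failed.
    """
    attempts = 1
    if try_mount():  # first attempt, before any interface is known to be up
        return attempts
    for _interface in interface_events:
        attempts += 1
        if try_mount():  # retry triggered by this interface coming up
            return attempts
    return None  # events exhausted without a successful mount
```

Under this model, a device that needs more than a raised interface (for instance, a separate connection step) can still fail on every retry. That is the situation the report describes.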
> As seen from the attached snapshot, it doesn't bother to wait,
> and blindly tries to mount the RBD before connecting to Ceph:
> this will never work.
If there is a specific connection that needs to be made before running the mount command, then I don't think that's something mountall can be expected to handle. Something else on the system would need to intercept the request for a ceph mount, and block it until ceph is available.
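As a rough illustration of "block it until ceph is available", a helper outside mountall could wait for the device node to appear before the mount is attempted, and give up after a bounded time instead of waiting forever. The Python sketch below assumes that "available" simply means the device path exists. That is an assumption, not something this report states about Ceph. The function name and its parameters are hypothetical.

```python
import os
import time


def wait_for_path(path, timeout_s, poll_s=0.5,
                  exists=os.path.exists, sleep=time.sleep,
                  clock=time.monotonic):
    """Poll until `path` exists or `timeout_s` seconds have elapsed.

    Returns True if the path appeared, False on timeout. The exists, sleep
    and clock parameters are injectable only to make the helper testable.
    """
    deadline = clock() + timeout_s
    while True:
        if exists(path):
            return True
        if clock() >= deadline:
            return False  # bounded wait: never block indefinitely
        sleep(poll_s)
```

A caller would run the mount only when this returns True, and otherwise carry on booting without that file system.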