curtin dname for bcache uses unstable devname instead of UUID
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
bcache-tools | New | Undecided | Unassigned |
curtin | Fix Released | Undecided | Unassigned |
bcache-tools (Debian) | Confirmed | Unknown | |
bcache-tools (Ubuntu) | | | |
bcache-tools (Ubuntu Trusty) | Invalid | Medium | Unassigned |
bcache-tools (Ubuntu Xenial) | Invalid | Medium | Unassigned |
bcache-tools (Ubuntu Artful) | Invalid | Medium | Unassigned |
bcache-tools (Ubuntu Bionic) | Invalid | Medium | Unassigned |
curtin (Ubuntu) | Fix Released | Undecided | Unassigned |
curtin (Ubuntu Trusty) | Won't Fix | Undecided | Unassigned |
curtin (Ubuntu Xenial) | New | Undecided | Unassigned |
curtin (Ubuntu Artful) | New | Undecided | Unassigned |
curtin (Ubuntu Bionic) | Fix Released | Undecided | Unassigned |
Bug Description
[Impact]
* Current users of bcache devices may encounter unreliable device
numbering as the Linux kernel does not guarantee that bcache
minor numbers are assigned to the same devices at each boot.
Users who may have used /dev/bcacheN in paths to a specific
device could possibly be pointing to a different dataset
altogether. bcache udev rules do provide some mechanism to
generate persistent symlinks in /dev/bcache/by-uuid or
/dev/
the underlying device. However, the Linux kernel does not
always generate a kernel uevent to trigger the udev rules
to create the symlink.
* The fix adds a udev program which will read bcache superblock
of slave devices and extract the UUID and LABEL, exporting them
to udev for use in the bcache rule files.
* This affects upstream bcache-tools, the owning package
of the udev rules. This affects all releases of bcache-tools
as the rules rely upon the kernel to trigger these events,
though that is not a requirement to resolve the lack of
persistent links.
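The superblock lookup described above can be sketched in Python. This is only an illustration, not the udev program shipped with the fix. It assumes the on-disk layout from the kernel's struct cache_sb: the superblock starts at byte 4096 and opens with three 64-bit fields (csum, offset, version) followed by magic[16], uuid[16], set_uuid[16] and label[32]. The function names are hypothetical.

```python
import struct
import uuid

# Assumed layout (kernel struct cache_sb): superblock at sector 8 (byte 4096),
# little-endian csum, offset, version, then magic, uuid, set_uuid, label.
SB_OFFSET = 4096
BCACHE_MAGIC = bytes.fromhex("c68573f64e1a45ca8265f57f48ba6d81")
_SB_HEAD = struct.Struct("<QQQ16s16s16s32s")


def parse_bcache_superblock(raw):
    """Return version, uuid, set_uuid and label, or None if raw is not bcache."""
    if len(raw) < _SB_HEAD.size:
        return None
    _csum, _offset, version, magic, dev_uuid, set_uuid, label = \
        _SB_HEAD.unpack_from(raw)
    if magic != BCACHE_MAGIC:
        return None
    return {
        "version": version,
        "uuid": str(uuid.UUID(bytes=dev_uuid)),
        "set_uuid": str(uuid.UUID(bytes=set_uuid)),
        # The label is a NUL-padded byte string.
        "label": label.split(b"\0", 1)[0].decode("ascii", "replace"),
    }


def read_bcache_superblock(device):
    """Read the superblock header from a block device path (needs root)."""
    with open(device, "rb") as fh:
        fh.seek(SB_OFFSET)
        return parse_bcache_superblock(fh.read(_SB_HEAD.size))
```

A udev helper built along these lines would print the uuid and label as KEY=value pairs so the rule file can use them in SYMLINK names.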
[Test Case]
* Launch an Ubuntu Cloud Image with 3 unused disks
- apt install bcache-tools tree
- make-bcache -C /dev/vdb
- make-bcache -B /dev/vdc
- make-bcache -B /dev/vdd
- echo "vdc" > /sys/class/
- echo "vdd" > /sys/class/
- reboot
- Run this test:
#!/bin/bash
FAIL=0
[ ! -d /dev/bcache ] && {
echo "FAIL: /dev/bcache is not a directory";
exit 1
}
for label in /dev/bcache/
if [ "$LABEL_TARGET" != "$KNAME" ]; then
echo "FAIL: label points to $LABEL_TARGET but symlink points to $DEVNAME";
FAIL=1
fi;
done
if [ "$FAIL" == "0" ]; then
echo "PASS";
exit 0
fi
exit 1
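A similar check can be sketched in Python. It is not a port of the script above: it only assumes that every entry in a link directory such as /dev/bcache/by-uuid should be a symlink resolving to an existing node whose name starts with "bcache". The function name and the prefix parameter are hypothetical.

```python
import os


def check_bcache_links(link_dir, prefix="bcache"):
    """Return a list of problems found among the symlinks in link_dir."""
    if not os.path.isdir(link_dir):
        return ["%s is not a directory" % link_dir]
    problems = []
    for name in sorted(os.listdir(link_dir)):
        path = os.path.join(link_dir, name)
        if not os.path.islink(path):
            problems.append("%s is not a symlink" % path)
            continue
        target = os.path.realpath(path)
        if not os.path.exists(target):
            problems.append("%s is dangling" % path)
        elif not os.path.basename(target).startswith(prefix):
            problems.append("%s points to %s" % (path, target))
    return problems
```

An empty list from check_bcache_links("/dev/bcache/by-uuid") would correspond to PASS; any entries correspond to FAIL lines.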
[Regression Potential]
* As bcache minor numbers and these symlinks have been unreliable in
the past, there may be code that assumes /dev/bcache* expands
only to the block devices and not also to /dev/bcache, which is
a directory.
[Original Description]
Bcache device names like /dev/bcache0 are unstable. Bcache does not use any predictable ordering when assembling bcache devices, so on systems with multiple bcache devices, a symlink to /dev/bcache0 may end up pointing to a different device.
The bcache dname symlink should point to the /dev/bcache/
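To illustrate the difference between matching on the kernel name and matching on a stable identifier, here is a small Python sketch that composes a dname rule line. The helper name is hypothetical, and ENV{CACHED_UUID} is an assumption: it is used here as an example of a UUID-based udev property and is not confirmed by this report as the key curtin ends up using.

```python
def compose_dname_rule(dname, match_key, match_value):
    """Build one udev rule line that adds a disk/by-dname symlink."""
    matches = [
        'SUBSYSTEM=="block"',
        'ACTION=="add|change"',
        '%s=="%s"' % (match_key, match_value),
    ]
    return ", ".join(matches + ['SYMLINK+="disk/by-dname/%s"' % dname])


# Unstable: tied to whichever device the kernel happens to name bcache0.
unstable = compose_dname_rule("bcache0", "ENV{DEVNAME}", "/dev/bcache0")

# Stable (assumed property name): tied to the backing device's UUID.
stable = compose_dname_rule(
    "bcache0", "ENV{CACHED_UUID}", "00000000-0000-0000-0000-000000000000")
```

The all-zero UUID is a placeholder; a real rule would carry the UUID read from the backing device's superblock.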
Related bugs:
* bug 1729145: [kernel] /dev/bcache/by-uuid links not created after reboot
Related branches
- Server Team CI bot: Approve (continuous-integration)
- Scott Moser (community): Approve
- Dmitrii Shcherbakov (community): Approve
Diff: 676 lines (+148/-259), 12 files modified
curtin/block/bcache.py (+87/-0)
curtin/commands/block_meta.py (+11/-4)
curtin/commands/curthooks.py (+2/-2)
dev/null (+0/-128)
tests/unittests/test_make_dname.py (+28/-1)
tests/vmtests/__init__.py (+5/-1)
tests/vmtests/test_basic.py (+8/-10)
tests/vmtests/test_lvm.py (+0/-9)
tests/vmtests/test_mdadm_bcache.py (+7/-73)
tests/vmtests/test_nvme.py (+0/-18)
tests/vmtests/test_raid5_bcache.py (+0/-4)
tests/vmtests/test_uefi_basic.py (+0/-9)
tags: | added: cpe-onsite |
Changed in bcache-tools (Ubuntu): | |
status: | New → Confirmed |
importance: | Undecided → Medium |
status: | Confirmed → In Progress |
assignee: | nobody → Ryan Harper (raharper) |
Changed in bcache-tools (Ubuntu Trusty): | |
status: | New → Confirmed |
Changed in bcache-tools (Ubuntu Xenial): | |
status: | New → Confirmed |
Changed in bcache-tools (Ubuntu Artful): | |
status: | New → Confirmed |
Changed in bcache-tools (Ubuntu Trusty): | |
importance: | Undecided → Medium |
Changed in bcache-tools (Ubuntu Xenial): | |
importance: | Undecided → Medium |
Changed in bcache-tools (Ubuntu Artful): | |
importance: | Undecided → Medium |
Changed in bcache-tools (Debian): | |
status: | Unknown → New |
description: | updated |
description: | updated |
Changed in bcache-tools (Debian): | |
status: | New → Confirmed |
Changed in curtin: | |
status: | New → Fix Committed |
no longer affects: | bcache-tools (Ubuntu) |
Changed in bcache-tools (Ubuntu Bionic): | |
status: | In Progress → Invalid |
Changed in bcache-tools (Ubuntu Artful): | |
status: | Confirmed → Invalid |
Changed in bcache-tools (Ubuntu Xenial): | |
status: | Confirmed → Invalid |
Changed in bcache-tools (Ubuntu Trusty): | |
status: | Confirmed → Invalid |
Changed in bcache-tools (Ubuntu Bionic): | |
assignee: | Ryan Harper (raharper) → nobody |
Copied from a private bug:
Currently we have no stability in /dev/bcache<n> device names:
* minor numbers for bcache devices are not guaranteed to stay the same across reboots because there is no guaranteed enumeration;
* uevent details for bcache devices do not propagate an underlying disk's serial number
* serial numbers of disks are driver-specific device attributes - there is no guarantee that this is exposed
====
/dev/disk/by-dname/<device-name> symlinks provided by curtin are not
reliable as they merely depend on the kernel-provided name, which is
unstable:
cat /etc/udev/rules.d/bcache0.rules.rules
SUBSYSTEM=="block", ACTION=="add|change", ENV{DEVNAME}=="/dev/bcache0", SYMLINK+="disk/by-dname/bcache0"
dname symlink rules for block devices depend on a partition table UUID - if a device doesn't have any partition pre-created, a symlink will not be created:
cat /etc/udev/rules.d/sda.rules.rules
SUBSYSTEM=="block", ACTION=="add|change", ENV{DEVTYPE}=="disk", ENV{ID_PART_TABLE_UUID}=="5a492040", SYMLINK+="disk/by-dname/sda"
There is no way in MAAS to pre-create a GUID Partition Table without a
partition and a file system for a bcache device (no isolated API call
for partition table creation - only for file systems).
====
Why is this important for bcache usage?
Raw block devices need to be used by ceph-disk in cases where it needs
a device without a file system or partition table, namely, ceph
journal (normally used without a file system) and ceph bluestore (for
both data and metadata journal). Bluestore is especially important
because it was designed to work with a raw block device. Using
bluestore on top of a pre-created file system is an improper usage
scenario.
====
Ways to mitigate:
1. Introduce a new udev rule which sets up /dev/disk/by-backing/<backing-device-name>
symlinks to bcache devices:
cat /etc/udev/rules.d/bcache-by-backing.rules.rules
SUBSYSTEM=="block", ACTION=="add|change", ENV{DEVNAME}=="/dev/bcache*", PROGRAM="/lib/udev/bcache-name-helper.sh $kernel", SYMLINK+="disk/by-backing/$result"
cat /lib/udev/bcache-name-helper.sh
#!/bin/sh -e
logger Getting a backing device for a bcache device $1 by sysfs file creation timestamp
ls -c -1t /sys/block/$1/slaves/ | tail -n1
tree /dev/disk/by-backing/
/dev/disk/by-backing/
├── sdc -> ../../bcache2
├── sdd -> ../../bcache1
├── sde -> ../../bcache0
└── sdf -> ../../bcache3
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdf 8:80 0 64G 0 disk
└─bcache3 252:48 0 64G 0 disk
sdd 8:48 0 64G 0 disk
└─bcache1 252:16 0 64G 0 disk
sdb 8:16 0 64G 0 disk
├─bcache0 252:0 0 64G 0 disk
├─bcache3 252:48 0 64G 0 disk
├─bcache1 252:16 0 64G 0 disk
└─bcache2 252:32 0 64G 0 disk
sde 8:64 0 64G 0 disk
└─bcache0 252:0 0 64G 0 disk
sdc 8:32 0 64G 0 disk
└─bcache2 252:32 0 64G 0 disk
sda 8:0 0 64G 0 disk
└─sda1 8:1 0 64G 0 part /
2. Modify the Linux kernel source code to include a way to identify ...
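The helper from mitigation 1 can be approximated in Python as follows. This is a sketch of the same heuristic, not a replacement for the shell script: it picks the entry under /sys/block/<bcacheN>/slaves with the oldest change time, on the assumption (as in the script) that the backing device was linked before the cache device. The function name and the sysfs_block parameter are hypothetical.

```python
import os


def oldest_slave(kname, sysfs_block="/sys/block"):
    """Return the slave entry of kname with the oldest ctime, or None."""
    slaves_dir = os.path.join(sysfs_block, kname, "slaves")
    names = os.listdir(slaves_dir)
    if not names:
        return None
    # Mirrors `ls -c -1t ... | tail -n1`: sort by ctime, take the oldest.
    return min(
        names,
        key=lambda n: os.lstat(os.path.join(slaves_dir, n)).st_ctime_ns,
    )
```

Because the result depends on timestamps rather than on anything stored in the bcache superblock, it remains a heuristic: if the cache device happens to be linked first, the wrong name is returned.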