hot add VF to net_failover - could not rename interface '8' from 'eth0' to 'ens4': Device or resource busy
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Invalid | Undecided | Unassigned |
Bionic | Invalid | Undecided | Unassigned |
linux-oracle (Ubuntu) | Fix Released | Undecided | Unassigned |
Bionic | Fix Released | High | Marcelo Cerri |
Bug Description
[Impact]
udev fails to rename a new interface when a VF is added due to a race between the kernel and userspace.
It's desirable to get a consistent and predictable name, as otherwise any persistent configuration in userspace can't be applied properly on the Virtual Function that gets hot plugged in.
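To make the naming distinction concrete, here is a minimal sketch that tells a kernel-assigned fallback name such as "eth0" from a renamed interface such as "ens4". The function name and the regular expression are illustrative assumptions, not part of udev; the check only recognises the "eth<N>" pattern and makes no attempt to validate the predictable-name schemes themselves.

```python
import re

# Assumption: an interface still carrying the kernel's default Ethernet
# name matches "eth<N>" (eth0, eth1, ...). Anything else is treated as
# having been renamed by userspace.
_KERNEL_DEFAULT = re.compile(r"^eth\d+$")


def is_kernel_default_name(ifname: str) -> bool:
    """Return True if ifname looks like an un-renamed kernel default name."""
    return bool(_KERNEL_DEFAULT.match(ifname))
```

A configuration tool could use a check like this to flag a hot-plugged VF that kept its kernel name and therefore will not match persistent configuration written for "ens4".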
[Test Case]
Host has a QEMU/KVM setup with standby virtio-net [1]:
# qemu-system-x86_64 -name guest=ubuntu-
The guest runs Ubuntu 16.04.5 (Xenial Xerus):
vsbalakr@
Linux ubuntu-16 4.15.0-1007-oracle #9~16.04.1-Ubuntu SMP Wed Dec 12 19:49:55 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
vsbalakr@
Ubuntu 4.15.0-
vsbalakr@
NAME="Ubuntu"
VERSION="16.04.5 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.5 LTS"
VERSION_ID="16.04"
HOME_URL="http://
SUPPORT_URL="http://
BUG_REPORT_URL="http://
VERSION_
UBUNTU_
vsbalakr@
ens3 is the master interface of net_failover, while ens3nsby is its standby slave [2]:
vsbalakr@
1: lo: <LOOPBACK,
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,
link/ether d6:36:08:7f:b4:d9 brd ff:ff:ff:ff:ff:ff
inet 10.211.15.21/21 brd 10.211.15.255 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::d436:
valid_lft forever preferred_lft forever
3: ens3nsby: <BROADCAST,
link/ether d6:36:08:7f:b4:d9 brd ff:ff:ff:ff:ff:ff
inet 10.211.15.21/21 brd 10.211.15.255 scope global dynamic ens3nsby
valid_lft 2154sec preferred_lft 2154sec
inet6 fe80::d436:
valid_lft forever preferred_lft forever
Now we hot plug a Virtual Function (with its MAC address set beforehand to the same address, d6:36:08:7f:b4:d9) into the guest via the QEMU HMP console:
(qemu) device_add vfio-pci,
(qemu)
The VF now shows up in the guest as "eth0" instead of the expected "ens4":
vsbalakr@
1: lo: <LOOPBACK,
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,
link/ether d6:36:08:7f:b4:d9 brd ff:ff:ff:ff:ff:ff
inet 10.211.15.21/21 brd 10.211.15.255 scope global ens3
valid_lft forever preferred_lft forever
inet6 2606:b400:
inet6 2606:b400:
inet6 fe80::d436:
valid_lft forever preferred_lft forever
3: ens3nsby: <BROADCAST,
link/ether d6:36:08:7f:b4:d9 brd ff:ff:ff:ff:ff:ff
inet 10.211.15.21/21 brd 10.211.15.255 scope global dynamic ens3nsby
valid_lft 2072sec preferred_lft 2072sec
inet6 fe80::d436:
valid_lft forever preferred_lft forever
6: eth0: <BROADCAST,
link/ether d6:36:08:7f:b4:d9 brd ff:ff:ff:ff:ff:ff
vsbalakr@
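The listing above shows ens3, ens3nsby and eth0 all carrying the MAC address d6:36:08:7f:b4:d9. As a rough sketch of how to spot such a group from inside the guest, the following reads each interface's address from sysfs and reports MAC addresses shared by more than one interface. The helper names are hypothetical, and the sketch assumes the usual /sys/class/net/<interface>/address layout.

```python
from collections import defaultdict
from pathlib import Path


def read_macs(sysfs_root="/sys/class/net"):
    """Map interface name -> MAC address, read from sysfs (assumed layout)."""
    macs = {}
    for entry in Path(sysfs_root).iterdir():
        addr = entry / "address"
        if addr.is_file():
            macs[entry.name] = addr.read_text().strip().lower()
    return macs


def group_by_mac(macs):
    """Return only the MAC addresses that are shared by several interfaces."""
    groups = defaultdict(list)
    for name, mac in macs.items():
        groups[mac].append(name)
    return {mac: sorted(names) for mac, names in groups.items() if len(names) > 1}
```

Calling `group_by_mac(read_macs())` in the guest after the hot plug would be expected to list the failover master, the standby and the VF under one key, whatever name the VF ended up with.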
/var/log/syslog shows that the rename to "ens4" failed with "Device or resource busy":
Feb 8 18:24:05 ubuntu-16 kernel: [ 5270.231623] ixgbevf 0000:00:04.0: NIC Link is Up 10 Gbps
Feb 8 18:24:05 ubuntu-16 kernel: [ 5270.233188] IPv6: ADDRCONF(
Feb 8 18:24:05 ubuntu-16 systemd-
Feb 8 18:24:05 ubuntu-16 systemd-
Feb 8 18:24:05 ubuntu-16 systemd-
Feb 8 18:24:05 ubuntu-16 systemd-
Feb 8 18:24:05 ubuntu-16 systemd-
Feb 8 18:24:05 ubuntu-16 systemd-
Feb 8 18:24:05 ubuntu-16 systemd-
Feb 8 18:24:05 ubuntu-16 kernel: [ 5270.236275] virtio_net virtio0 ens3: failover primary slave:eth0 registered
Feb 8 18:24:05 ubuntu-16 kernel: [ 5270.236308] ixgbevf 0000:00:04.0: d6:36:08:7f:b4:d9
Feb 8 18:24:05 ubuntu-16 kernel: [ 5270.236310] ixgbevf 0000:00:04.0: MAC: 4
Feb 8 18:24:05 ubuntu-16 kernel: [ 5270.236313] ixgbevf 0000:00:04.0: Intel(R) 82599 Virtual Function
Feb 8 18:24:05 ubuntu-16 systemd-
Feb 8 18:24:05 ubuntu-16 systemd-
Feb 8 18:24:05 ubuntu-16 NetworkManager[
Feb 8 18:24:05 ubuntu-16 systemd-
Feb 8 18:24:05 ubuntu-16 NetworkManager[
Feb 8 18:24:05 ubuntu-16 NetworkManager[
Feb 8 18:24:05 ubuntu-16 systemd-udevd[516]: could not create device: Invalid argument
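For anyone scanning logs for this failure, here is a small sketch that extracts the interface index, old name, new name and reason from the rename error. The systemd lines above are truncated, so the message format is taken from the bug title; the pattern and the `RenameFailure` type are assumptions for illustration and may need adjusting for other systemd versions.

```python
import re
from typing import NamedTuple, Optional


class RenameFailure(NamedTuple):
    ifindex: int
    old_name: str
    new_name: str
    reason: str


# Format assumed from the bug title:
#   could not rename interface '8' from 'eth0' to 'ens4': Device or resource busy
_PATTERN = re.compile(
    r"could not rename interface '(?P<ifindex>\d+)' "
    r"from '(?P<old>[^']+)' to '(?P<new>[^']+)': (?P<reason>.+)$"
)


def parse_rename_failure(line: str) -> Optional[RenameFailure]:
    """Return the parsed failure, or None if the line is not a rename error."""
    m = _PATTERN.search(line.strip())
    if m is None:
        return None
    return RenameFailure(int(m["ifindex"]), m["old"], m["new"], m["reason"])
```

Running it over each line of /var/log/syslog and keeping the non-None results gives a quick list of interfaces that kept their kernel-assigned name.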
It's desirable to get a consistent and predictable name, as otherwise any persistent configuration in userspace can't be applied properly on the Virtual Function that gets hot plugged in.
[1] https:/
[2] https:/
---
ApportVersion: 2.20.1-0ubuntu2.18
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 16.04
EcryptfsInUse: Yes
HibernationDevice: RESUME=
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
Package: linux (not installed)
ProcEnviron:
TERM=dtterm
PATH=(custom, no user)
XDG_RUNTIME_
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=
ProcVersionSign
RelatedPackageV
linux-
linux-
linux-firmware 1.157.21
RfKill:
Tags: xenial xenial
Uname: Linux 4.15.0-1007-oracle x86_64
UnreportableReason: The report belongs to a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin lxd plugdev sambashare sudo
_MarkForUpload: False
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: 1.11.0-2.el7
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.
dmi.modalias: dmi:bvnSeaBIOS:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.
dmi.sys.vendor: QEMU
[Regression Potential]
The proposed solution carries a low risk of regression because it only affects the linux-oracle kernel on instances using net_failover. In case of regressions, it's also possible to disable the new behaviour via the kernel cmdline.
tags: | added: bjf-tracking |
tags: | added: patch |
Changed in linux (Ubuntu Bionic): | |
status: | New → Invalid |
Changed in linux (Ubuntu): | |
status: | Confirmed → Invalid |
Changed in linux-oracle (Ubuntu Bionic): | |
status: | New → In Progress |
importance: | Undecided → High |
assignee: | nobody → Marcelo Cerri (mhcerri) |
description: | updated |
Changed in linux-oracle (Ubuntu Bionic): | |
status: | In Progress → Fix Committed |
Changed in linux-oracle (Ubuntu Bionic): | |
status: | Fix Committed → Fix Released |
This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 1815268
and then change the status of the bug to 'Confirmed'.
If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.
This change has been made by an automated script, maintained by the Ubuntu Kernel Team.