Error performing rename("eth1", "eno1") ... RTNETLINK answers: File exists
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Expired | Medium | Unassigned |
Bug Description
On my servers, the kernel randomly assigns the "eno1" and/or "eth0" names to *either* the first or second network interface in the server (not consistently) well before MAAS is able to call rename(). On a whim I tried assigning "net0" and "net1" as the network interface names on a set of 5 servers, and I stopped seeing the error in the attached screenshot (which previously appeared about 50% of the time when deploying a server).
My gut says MAAS should either check for and skip any name that the kernel has already assigned to a NIC ("eno1", "eno2", "eth0", "eth1", etc.), or use *only* the stable enp* names as the defaults when first enlisting a server. That said, the underlying problem appears to be that the kernel-assigned names differ between the enlist, commission, deploy, and/or final reboot steps (see "eno1: renamed from eth1" a mere 3.46 seconds into the boot process, below). It's therefore possible that the only solution is to use the stable enp* names OR a naming scheme unique to MAAS, as I have done.
This bug affects MAAS 3.2 (possibly others) and all Ubuntu versions that I tried (18.04, 20.04, and 22.04).
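To illustrate the "check for and skip" idea, here is a minimal sketch of a pre-rename collision check. It is not MAAS code: the function name `find_rename_conflicts`, the dictionary layout, and the example name assignments are all hypothetical, chosen only to show why a rename to "eno1" fails with "File exists" while a rename to "net0"/"net1" does not.

```python
def find_rename_conflicts(current, desired):
    """Return renames that would collide with a name another NIC already holds.

    current: dict mapping MAC address -> name the kernel assigned (hypothetical input)
    desired: dict mapping MAC address -> name we want to assign (hypothetical input)
    Each conflict is a tuple (mac, wanted_name, mac_of_current_holder).
    """
    # Invert the current state so we can ask "who holds this name right now?"
    name_to_mac = {name: mac for mac, name in current.items()}
    conflicts = []
    for mac, wanted in desired.items():
        holder = name_to_mac.get(wanted)
        # A clash only exists if a *different* interface already owns the name.
        if holder is not None and holder != mac:
            conflicts.append((mac, wanted, holder))
    return conflicts


# Hypothetical boot state: the kernel gave "eno1" to the first port this time.
current = {"0c:c4:7a:35:29:a2": "eno1", "0c:c4:7a:35:29:a3": "eth1"}

# Assumed target names recorded during an earlier boot with a different ordering.
clashing = {"0c:c4:7a:35:29:a2": "eth0", "0c:c4:7a:35:29:a3": "eno1"}
# Names outside the kernel's own namespaces, as used in the workaround.
unique = {"0c:c4:7a:35:29:a2": "net0", "0c:c4:7a:35:29:a3": "net1"}

print(find_rename_conflicts(current, clashing))
print(find_rename_conflicts(current, unique))
```

With the clashing targets, the second port's rename to "eno1" is reported because the first port currently holds that name. With the "net0"/"net1" targets, nothing is reported, which matches what I saw after switching names.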
[ 1.946954] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver
[ 1.947082] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[ 2.234689] ixgbe 0000:01:00.0: Multiqueue Enabled: Rx Queue count = 32, Tx Queue count = 32 XDP Queue count = 0
[ 2.312017] ixgbe 0000:01:00.0: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
[ 2.336383] ixgbe 0000:01:00.0: MAC: 3, PHY: 0, PBA No: 030C00-000
[ 2.336451] ixgbe 0000:01:00.0: 0c:c4:7a:35:29:a2
[ 2.484935] ixgbe 0000:01:00.0: Intel(R) 10 Gigabit Network Connection
[ 3.207607] ixgbe 0000:01:00.1: Multiqueue Enabled: Rx Queue count = 32, Tx Queue count = 32 XDP Queue count = 0
[ 3.292504] ixgbe 0000:01:00.1: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
[ 3.316800] ixgbe 0000:01:00.1: MAC: 3, PHY: 0, PBA No: 030C00-000
[ 3.316869] ixgbe 0000:01:00.1: 0c:c4:7a:35:29:a3
[ 3.466390] ixgbe 0000:01:00.1: Intel(R) 10 Gigabit Network Connection
[ 3.467967] ixgbe 0000:01:00.1 eno1: renamed from eth1
[ 139.439536] ixgbe 0000:01:00.0 net0: renamed from eth0
[ 139.474710] ixgbe 0000:01:00.1 net1: renamed from eno1
[ 146.557899] ixgbe 0000:01:00.1: registered PHC device on net1
[ 146.858216] ixgbe 0000:01:00.0: registered PHC device on net0
[ 152.666437] ixgbe 0000:01:00.1 net1: NIC Link is Up 10 Gbps, Flow Control: None
[ 152.999085] ixgbe 0000:01:00.0 net0: NIC Link is Up 10 Gbps, Flow Control: None
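For anyone trying to reproduce this, the rename sequence can be pulled out of a saved dmesg capture with a short script. The following is a sketch only: the regular expression assumes the line layout shown in the log above (timestamp, driver, PCI address, new name, "renamed from", old name), and the function name `rename_history` is made up for illustration.

```python
import re

# Matches lines like: "[ 3.467967] ixgbe 0000:01:00.1 eno1: renamed from eth1"
RENAME_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\S+\s+"
    r"(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.\d)\s+"
    r"(?P<new>\S+): renamed from (?P<old>\S+)"
)


def rename_history(log_text):
    """Group interface renames by PCI address, in the order they appear."""
    history = {}
    for line in log_text.splitlines():
        match = RENAME_RE.search(line)
        if match:
            entry = (float(match["ts"]), match["old"], match["new"])
            history.setdefault(match["pci"], []).append(entry)
    return history


sample = """\
[ 3.467967] ixgbe 0000:01:00.1 eno1: renamed from eth1
[ 139.439536] ixgbe 0000:01:00.0 net0: renamed from eth0
[ 139.474710] ixgbe 0000:01:00.1 net1: renamed from eno1
"""

for pci, renames in rename_history(sample).items():
    for ts, old, new in renames:
        print(f"{pci} at {ts:.2f}s: {old} -> {new}")
```

Comparing this output across several boots of the same machine should show whether the early kernel rename lands on a different port each time.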
description: | updated |
Changed in maas:
importance: Undecided → Medium
status: New → Triaged
We think that MAAS is instructing the kernel, and indirectly cloud-init, to use an interface name that may clash with an existing name, depending on the order of device discovery. We need to reproduce the issue and see if MAAS needs to control interface naming.
Could you share your system specs and steps you take to reproduce the problem?