Error performing rename("eth1", "eno1") ... RTNETLINK answers: File exists

Bug #1990761 reported by Tobias McNulty
This bug affects 1 person
Affects   Status    Importance   Assigned to   Milestone
MAAS      Expired   Medium       Unassigned

Bug Description

On my servers, the kernel assigns the "eno1" and/or "eth0" names inconsistently, to *either* the first or the second network interface, well before MAAS is able to call rename(). On a whim I assigned "net0" and "net1" as the network interface names on a set of 5 servers, and I stopped seeing the error in the attached screenshot, which had previously appeared about 50% of the time when deploying a server.

My gut says MAAS should either check for and skip any name that was pre-assigned to NICs ("eno1", "eno2", "eth0", "eth1", etc.), or use *only* the stable enp* names as the defaults when first enlisting a server. That said, the underlying problem appears to be that the names assigned by the kernel differ between the enlist, commission, deploy, and/or final reboot steps (see "eno1: renamed from eth1" a mere 3.46 seconds into the boot process, below). It is therefore possible that the only solution is to use the stable enp* names OR a naming scheme unique to MAAS, as I have done.
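
A minimal sketch of the "check for and skip" idea, in Python. This is not MAAS code: the function name `risky_names` and the pattern are hypothetical, and the pattern only covers the ethN and enoN forms mentioned above.

```python
import re

# Assumption: names of the form ethN / enoN are the ones that may already be
# taken early in boot, as described in this report. Other schemes are ignored.
PREASSIGNED = re.compile(r"^(eth|eno)\d+$")


def risky_names(requested):
    """Return the requested interface names that look like pre-assigned names."""
    return [name for name in requested if PREASSIGNED.match(name)]


if __name__ == "__main__":
    # "net0" and an enp* name pass the check; "eno1" and "eth0" are flagged.
    print(risky_names(["eno1", "eth0", "net0", "enp1s0f0"]))
```

A caller could refuse such names at enlistment time, or fall back to a different scheme such as the net0/net1 names used here.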

This bug affects MAAS 3.2 (possibly others) and all Ubuntu versions that I tried (18.04, 20.04, and 22.04).

[ 1.946954] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver
[ 1.947082] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[ 2.234689] ixgbe 0000:01:00.0: Multiqueue Enabled: Rx Queue count = 32, Tx Queue count = 32 XDP Queue count = 0
[ 2.312017] ixgbe 0000:01:00.0: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
[ 2.336383] ixgbe 0000:01:00.0: MAC: 3, PHY: 0, PBA No: 030C00-000
[ 2.336451] ixgbe 0000:01:00.0: 0c:c4:7a:35:29:a2
[ 2.484935] ixgbe 0000:01:00.0: Intel(R) 10 Gigabit Network Connection
[ 3.207607] ixgbe 0000:01:00.1: Multiqueue Enabled: Rx Queue count = 32, Tx Queue count = 32 XDP Queue count = 0
[ 3.292504] ixgbe 0000:01:00.1: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
[ 3.316800] ixgbe 0000:01:00.1: MAC: 3, PHY: 0, PBA No: 030C00-000
[ 3.316869] ixgbe 0000:01:00.1: 0c:c4:7a:35:29:a3
[ 3.466390] ixgbe 0000:01:00.1: Intel(R) 10 Gigabit Network Connection
[ 3.467967] ixgbe 0000:01:00.1 eno1: renamed from eth1
[ 139.439536] ixgbe 0000:01:00.0 net0: renamed from eth0
[ 139.474710] ixgbe 0000:01:00.1 net1: renamed from eno1
[ 146.557899] ixgbe 0000:01:00.1: registered PHC device on net1
[ 146.858216] ixgbe 0000:01:00.0: registered PHC device on net0
[ 152.666437] ixgbe 0000:01:00.1 net1: NIC Link is Up 10 Gbps, Flow Control: None
[ 152.999085] ixgbe 0000:01:00.0 net0: NIC Link is Up 10 Gbps, Flow Control: None
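
To compare boots, it may help to pull only the rename events out of a log like the one above and group them by PCI address. The following is a rough sketch that assumes the line format shown in this report; `rename_history` and `SAMPLE` are illustrative names, not part of any tool.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   [ 3.467967] ixgbe 0000:01:00.1 eno1: renamed from eth1
RENAME = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\S+\s+(?P<pci>\S+)\s+(?P<new>\S+): "
    r"renamed from (?P<old>\S+)$"
)

# The three rename lines from the log in this report.
SAMPLE = [
    "[ 3.467967] ixgbe 0000:01:00.1 eno1: renamed from eth1",
    "[ 139.439536] ixgbe 0000:01:00.0 net0: renamed from eth0",
    "[ 139.474710] ixgbe 0000:01:00.1 net1: renamed from eno1",
]


def rename_history(lines):
    """Map each PCI address to its (timestamp, old_name, new_name) renames, in log order."""
    history = defaultdict(list)
    for line in lines:
        match = RENAME.match(line.strip())
        if match:
            history[match["pci"]].append(
                (float(match["ts"]), match["old"], match["new"])
            )
    return dict(history)


if __name__ == "__main__":
    for pci, events in rename_history(SAMPLE).items():
        print(pci, events)
```

For this log, the second port (0000:01:00.1) goes eth1, then eno1, then net1, while the first port goes straight from eth0 to net0.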

Tobias McNulty (tobias-mcnulty) wrote :
description: updated
Changed in maas:
importance: Undecided → Medium
status: New → Triaged
Jerzy Husakowski (jhusakowski) wrote :

We think that MAAS is instructing the kernel, and indirectly cloud-init, to use an interface name that may clash with an existing name, depending on the order of device discovery. We need to reproduce the issue and see if MAAS needs to control interface naming.
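
The suspected clash can be illustrated with a small simulation. This is a toy model only: it does not call the kernel, the NIC labels and boot-time names are hypothetical, and `apply_renames` simply mimics sequential rename() calls that fail with EEXIST ("File exists") when the target name is already in use.

```python
import errno


def apply_renames(current, wanted):
    """Simulate renaming interfaces one at a time.

    current: NIC label -> name the interface has at boot
    wanted:  NIC label -> name we want it to have
    Returns (final names, list of (old, new, error)), where error is
    errno.EEXIST if another interface already holds the target name.
    """
    names = dict(current)
    results = []
    for nic, new in wanted.items():
        old = names[nic]
        if old == new:
            results.append((old, new, None))
        elif new in names.values():
            results.append((old, new, errno.EEXIST))
        else:
            names[nic] = new
            results.append((old, new, None))
    return names, results


if __name__ == "__main__":
    # Hypothetical boot in which the *other* NIC already holds "eno1".
    boot = {"nic_a": "eth1", "nic_b": "eno1"}
    print(apply_renames(boot, {"nic_a": "eno1", "nic_b": "eno2"}))
    # Names outside the eth*/eno* space cannot collide in this model.
    print(apply_renames(boot, {"nic_a": "net0", "nic_b": "net1"}))
```

In the first call, renaming eth1 to eno1 fails because eno1 is already taken, which matches the error in the bug title. In the second call, both renames succeed regardless of which NIC received which name at boot.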

Could you share your system specs and the steps you take to reproduce the problem?

Changed in maas:
status: Triaged → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for MAAS because there has been no activity for 60 days.]

Changed in maas:
status: Incomplete → Expired