netplan try fails with 'Error: NetworkManager is not running.'

Bug #1989215 reported by William
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
One Hundred Papercuts
Triaged
Medium
Unassigned
netplan
Triaged
Medium
Unassigned

Bug Description

```
$ sudo netplan try
Error: NetworkManager is not running.

An error occurred: Command '['nmcli', 'general', 'status']' returned non-zero exit status 8.

Reverting.
$ nmcli general status
STATE CONNECTIVITY WIFI-HW WIFI WWAN-HW WWAN
connected full enabled enabled enabled enabled
$ sudo netplan try
Error: NetworkManager is not running.

An error occurred: Command '['nmcli', 'general', 'status']' returned non-zero exit status 8.

Reverting.

```

It appears that netplan doesn't wait for NetworkManager to come back up after restarting it. In particular at https://github.com/canonical/netplan/blob/e11e0acb48641800d8399ee1b71d27cd67c330f7/netplan/cli/commands/apply.py#L296 the check_output call likely raises a CalledProcessError, which should be caught in case NetworkManager hasn't finished starting yet.

William (wlaub)
description: updated
description: updated
description: updated
Revision history for this message
Simon Chopin (schopin) wrote :

I'm unable to reproduce locally, which isn't surprising given the nature of the issue.

The nmcli code is really unclear as to what the conditions are for the exit code 8 ("NM is not running"). Let's note that the NM service file specifically waits for the NetworkManager DBus name to be acquired before exiting, so at least on the main branch the nmcli calls should occur after that.

Could you please provide us with the netplan and OS version you're running?

Changed in netplan:
status: New → Incomplete
Revision history for this message
William (wlaub) wrote (last edit ):

OS is Ubuntu Core 20.04.5 on Asus PE100A
I'm not sure how to determine the netplan version, as there doesn't seem to be a version flag or command. Based on the files present on the device, it looks like 0.104.

Revision history for this message
William (wlaub) wrote (last edit ):

I've attached the output of journatctl during `netplan try`. Based on the timing of the output from the `netplan try`, it looks like netplan got its "NetworkManager is not running" error at 17:19:40, at which point it appears NetworkManager was running, but had not finished starting up. It seems NetworkManager reported that it had finished starting 5 seconds later at 17:19:45.

Edit: I've also now seen it fail when it appears to try after NetworkManager reports startup complete:
Oct 20 17:29:09 asus-server NetworkManager[12787]: <info> [1666286949.8153] manager: startup complete
Oct 20 17:29:09 asus-server systemd[1]: Started snap.network-manager.nmcli.1c4ab651-4151-49b3-86d6-17d53ef2d518.scope.
Oct 20 17:29:09 asus-server systemd-timesyncd[571]: Initial synchronization to time server 185.125.190.57:123 (ntp.ubuntu.com).
Oct 20 17:29:09 asus-server audit[12992]: SECCOMP auid=1000 uid=0 gid=0 ses=1 pid=12992 comm="nmcli" exe="/snap/network-manager/743/usr/bin/nmcli" sig=0 arch=c00000b7 syscall=274 compat=0 ip=0xffffa2d878c4 code=0x50000
Oct 20 17:29:09 asus-server audit[12787]: AVC apparmor="DENIED" operation="ptrace" profile="snap.network-manager.networkmanager" pid=12787 comm="NetworkManager" requested_mask="read" denied_mask="read" peer="snap.network-manager.nmcli"
Oct 20 17:29:09 asus-server audit[12787]: AVC apparmor="DENIED" operation="ptrace" profile="snap.network-manager.networkmanager" pid=12787 comm="NetworkManager" requested_mask="read" denied_mask="read" peer="snap.network-manager.nmcli"
Oct 20 17:29:09 asus-server audit[12787]: AVC apparmor="DENIED" operation="ptrace" profile="snap.network-manager.networkmanager" pid=12787 comm="NetworkManager" requested_mask="read" denied_mask="read" peer="snap.network-manager.nmcli"
Oct 20 17:29:09 asus-server systemd[1]: snap.network-manager.nmcli.1c4ab651-4151-49b3-86d6-17d53ef2d518.scope: Succeeded.
Oct 20 17:29:10 asus-server sudo[12677]: pam_unix(sudo:session): session closed for user root
Oct 20 17:29:11 asus-server systemd-networkd[3854]: eno1: Gained IPv6LL
Oct 20 17:29:11 asus-server systemd-networkd[3854]: enp1s0: Gained IPv6LL

Edit: If I set NetworkManager's debug.enable correctly, I also get
Oct 20 17:38:42 asus-server NetworkManager[15758]: <debug> [1666287522.3909] dispatcher: (7) failed to call dispatcher scripts: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.nm_dispatcher was not provided by any .service files
Oct 20 17:38:42 asus-server systemd[1]: Started snap.network-manager.nmcli.909d6d78-eb55-4b9f-8e6f-4400e29a59a7.scope.

Revision history for this message
Lukas Märdian (slyon) wrote :

Thanks for the logs. I agree we should probably check for exit code 8 ("NM not running") at https://github.com/canonical/netplan/blob/e11e0acb48641800d8399ee1b71d27cd67c330f7/netplan/cli/commands/apply.py#L296 and re-try after some extra delay in this case.

Changed in netplan:
status: Incomplete → New
importance: Undecided → Medium
status: New → Triaged
Paul White (paulw2u)
Changed in hundredpapercuts:
status: New → Triaged
importance: Undecided → Medium
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.