udev interface fails in privileged containers

Bug #1712808 reported by Colin Watson
This bug affects 12 people
Affects        Status      Importance  Assigned to
Snapcraft      New         Undecided   Unassigned
snapd          Confirmed   Medium      Unassigned
lxd (Ubuntu)   Invalid     Undecided   Unassigned

Bug Description

I think this is possibly a known issue since there's evidence of a workaround in e.g. https://stgraber.org/2017/01/13/kubernetes-inside-lxd/, but I couldn't find any proper discussion of it.

Installing snaps in a privileged LXD container fails. Here's a test script:

  $ lxc launch -c security.privileged=true ubuntu:16.04 snap-test
  $ lxc exec snap-test apt update
  $ lxc exec snap-test apt install squashfuse
  $ lxc exec snap-test snap install hello-world
  2017-08-24T12:03:59Z INFO cannot auto connect core:core-support-plug to core:core-support: (slot auto-connection), existing connection state "core:core-support-plug core:core-support" in the way
  error: cannot perform the following tasks:
  - Setup snap "core" (2462) security profiles (cannot setup udev for snap "core": cannot reload udev rules: exit status 2
  udev output:
  )
  - Setup snap "core" (2462) security profiles (cannot reload udev rules: exit status 2
  udev output:
  )

This is because /sys is mounted read-only in privileged containers (presumably to avoid causing havoc on the host), so the systemd-udevd service isn't started. The prevailing recommendation seems to be to work around it by making /usr/local/bin/udevadm a symlink to /bin/true, but this looks like a hack rather than a proper fix.
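That udevadm hack amounts to shadowing the real binary with a no-op earlier on the PATH. A minimal sketch of the mechanism (demonstrated here in a temporary directory; the actual workaround creates the symlink as /usr/local/bin/udevadm inside the container):

```shell
# Shadow udevadm with /bin/true so calls like "udevadm control --reload-rules"
# silently succeed. Sketch only: done in a temp dir rather than /usr/local/bin.
dir=$(mktemp -d)
ln -s /bin/true "$dir/udevadm"
export PATH="$dir:$PATH"
udevadm control --reload-rules   # now a no-op that exits 0
```

On a default Ubuntu PATH, /usr/local/bin precedes the directories holding the real udevadm, which is why the symlink wins without touching the udev package.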

Revision history for this message
Colin Watson (cjwatson) wrote :

On IRC, Stéphane suggested making the container "even more privileged" as a cleaner workaround, by adding the following to raw.lxc:

  lxc.mount.auto=
  lxc.mount.auto=proc:rw sys:rw
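For reference, the same workaround expressed as an `lxc config edit` fragment (a sketch; the container name and the privileged key are carried over from the test script above, so double-check the keys against your LXD version):

```yaml
# Sketch of the "even more privileged" setup for `lxc config edit snap-test`
config:
  security.privileged: "true"
  raw.lxc: |-
    lxc.mount.auto=
    lxc.mount.auto=proc:rw sys:rw
```

A container restart is needed for raw.lxc changes to take effect.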

(I also had to fiddle with my restrictive policy-rc.d script to allow udev to start.)
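For anyone with a similarly restrictive setup: the policy-rc.d interface is simple. invoke-rc.d runs `/usr/sbin/policy-rc.d <service> <action>`, where exit status 101 forbids the action and 0 allows it. A hypothetical allow-udev-only policy, sketched as a shell function (Colin's actual script isn't shown in this bug):

```shell
# Hypothetical policy-rc.d logic that permits only udev to start.
# In a real /usr/sbin/policy-rc.d script, "return" would be "exit".
policy_rc_d() {
    case "$1" in
        udev) return 0 ;;    # allow udev service actions
        *)    return 101 ;;  # 101 = action forbidden by policy
    esac
}
```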

Perhaps documenting that somewhere reasonably findable would be good enough?

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

I'm not quite sure what the difference is between regular and privileged (or "even more privileged") containers, but the last time we looked at similar issues we came to the conclusion that any container in which apparmor is not stacked but instead directly shared with the host is unsupportable for us. I'm not sure if this is the same problem again. I haven't tried to reproduce it yet.

Revision history for this message
Colin Watson (cjwatson) wrote :

The "even more privileged" workarounds have been working in launchpad-buildd for a while now. We can't use unprivileged containers for various reasons, for example because one of the categories of builds that needs to install snaps sometimes is live filesystem builds, and those do various things like mknod that'll never work in unprivileged containers.

Of course, launchpad-buildd is somewhat special in that it typically only runs a single build before shutting down the VM, so I can imagine that there might be some isolation failures that are a problem in general but that don't affect us in practice. Please don't outright forbid privileged containers though, as we don't really have a good alternative.

Michael Vogt (mvo)
Changed in snapd:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Zygmunt Krynicki (zyga) wrote :

I'm wondering what we can do about it.

When we're not running in an unprivileged container, anything that we do inside (tweaking cgroups, tweaking apparmor) will contaminate the host. If the host also uses snaps, those definitions will conflict and collide.

I see two options:

1) Close as WONTFIX as in reality this cannot work very well
2) Make it so that launchpad doesn't have to do hacks ... somehow and ignore the contamination

I'm not sure what 2) would even look like. Shall we ignore errors? Even if we do, snaps may fail at runtime, depending on what they do.

Could launchpad spawn a VM instead of a container for this? (I know it's far heavier)

Changed in snapd:
status: Triaged → Incomplete
Revision history for this message
Colin Watson (cjwatson) wrote :

I filed this bug because it seems ugly, but it does at least work with our current hacks, so closing this as Won't Fix would be better than changing something in a way that makes our hacks not work. :-) If you feel you need to close it then go ahead.

We already run every build in a dedicated VM that's reset at the start of each build (hence why we really don't care whether the container contaminates the host - the host is going to be thrown away anyway). However, those VMs are generic: for instance, they're currently all xenial rather than being for the release we're building for. We use the container both to avoid too much in the way of interference from the software that runs the builder itself and to arrange for the build to be running on the appropriate version of Ubuntu. Using another VM here would both be more complicated/expensive to set up and either slower to run or entirely non-functional due to requiring nested virtualisation. So no, we can't reasonably switch to a VM rather than a container.

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for snapd because there has been no activity for 60 days.]

Changed in snapd:
status: Incomplete → Expired
Anthony Fok (foka)
Changed in snapd:
status: Expired → Confirmed
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

This will come up again, and more frequently, now that the LXD package upgrade does the deb->snap transition even when it is itself running in a container.

Like Colin, I (and others might too) run privileged containers a lot using those extra privileges: http://paste.ubuntu.com/p/bcVHRBTKyP/

I never hit an issue myself, as I hadn't tried snap-in-LXD on my own, but the new package transition will trigger this.

Because of that, the severity of this case increases a bit.

[...]
Preparing to unpack .../16-apache2-utils_2.4.34-1ubuntu2_amd64.deb ...
Unpacking apache2-utils (2.4.34-1ubuntu2) over (2.4.34-1ubuntu1) ...
Preparing to unpack .../17-lxd-client_1%3a0.4_all.deb ...
Unpacking lxd-client (1:0.4) over (3.0.2-0ubuntu3) ...
Setting up apparmor (2.12-4ubuntu8) ...
Installing new version of config file /etc/apparmor.d/abstractions/private-files ...
Installing new version of config file /etc/apparmor.d/abstractions/private-files-strict ...
Installing new version of config file /etc/apparmor.d/abstractions/ubuntu-browsers.d/user-files ...
Skipping profile in /etc/apparmor.d/disable: usr.sbin.rsyslogd
Setting up squashfs-tools (1:4.3-6ubuntu2) ...
Setting up libapparmor1:amd64 (2.12-4ubuntu8) ...
Setting up systemd (239-7ubuntu10) ...
Setting up udev (239-7ubuntu10) ...
update-initramfs: deferring update (trigger activated)
Setting up snapd (2.35.5+18.10) ...
snapd.failure.service is a disabled or a static unit, not starting it.
snapd.snap-repair.service is a disabled or a static unit, not starting it.
(Reading database ... 66334 files and directories currently installed.)
Preparing to unpack .../00-lxd_1%3a0.4_all.deb ...
Warning: Stopping lxd.service, but it can still be activated by:
  lxd.socket
=> Installing the LXD snap
==> Checking connectivity with the snap store
==> Installing the LXD snap from the latest track for ubuntu-18.10
error: cannot perform the following tasks:
- Setup snap "core" (5548) security profiles (cannot setup udev for snap "core": cannot reload udev rules: exit status 2
udev output:
)
- Setup snap "core" (5548) security profiles (cannot reload udev rules: exit status 2
udev output:
)
dpkg: error processing archive /tmp/apt-dpkg-install-R4N7rz/00-lxd_1%3a0.4_all.deb (--unpack):
 new lxd package pre-installation script subprocess returned error exit status 1
Preparing to unpack .../01-open-iscsi_2.0.874-5ubuntu9_amd64.deb ...
[...]

Interestingly, a subsequent
$ apt --fix-broken install
does fix it up.

Might there be an ordering issue in the snap/lxd updates that is not an issue for "real" Bionic->Cosmic upgraders?

(Reading database ... 66334 files and directories currently installed.)
Preparing to unpack .../archives/lxd_1%3a0.4_all.deb ...
Warning: Stopping lxd.service, but it can still be activated by:
  lxd.socket
=> Installing the LXD snap
==> Checking connectivity with the snap store
==> Installing the LXD snap from the latest track for ubuntu-18.10
2018-10-16T08:16:38Z INFO Waiting for restart...
lxd 3.6 from Canonical✓ installed
Channel stable/ubuntu-18.10 for lxd is closed; temporarily forwarding to stable.
==> Cleaning up leftovers
Synchronizing state of lxd.service with SysV service script with /lib/systemd/systemd-sysv-...


Revision history for this message
Stuart Bishop (stub) wrote :

I just hit this in a 16.04 container, but for reasons I don't understand installing the core snap first worked around the problem:

$ sudo snap install go --classic
error: cannot perform the following tasks:
- Setup snap "core" (5662) security profiles (cannot setup udev for snap "core": cannot reload udev rules: exit status 2
udev output:
)
- Setup snap "core" (5662) security profiles (cannot reload udev rules: exit status 2
udev output:
)

$ sudo snap install core
core 16-2.35.4 from 'canonical' installed

$ sudo snap install go --classic
go 1.11.1 from Michael Hudson-Doyle (mwhudson) installed

Revision history for this message
Stéphane Graber (stgraber) wrote :

Yeah, we've seen that re-running the command usually gets you past the error, so in your case, just running the "snap install go --classic" would likely have been enough.
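For scripted setups, the observed behaviour suggests a small retry wrapper (a hypothetical helper, not anything snap itself provides):

```shell
# Hypothetical helper mirroring the "second attempt succeeds" pattern:
# run the command, and if the first attempt fails, try exactly once more.
retry_once() {
    "$@" && return 0
    echo "first attempt failed, retrying: $*" >&2
    "$@"
}

# Usage would be e.g.: retry_once snap install go --classic
```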

Revision history for this message
Marco Trevisan (Treviño) (3v1n0) wrote :

Actually to get this working I only needed to use this:

# Mount cgroup in rw to get snaps working
lxc.mount.auto=cgroup:rw

No need to have the whole of /sys and /proc rw, as the problem is due to the snap trying to chown the `/sys/fs/cgroup/freezer/snap.*` dirs. However, I'm wondering if there's a better way to do this inside the container itself, since this way I guess two containers sharing the host would have troubles, wouldn't they?

Revision history for this message
Stéphane Graber (stgraber) wrote :

Hmm, cgroup:rw has absolutely nothing to do with this.
LXD uses a cgroup namespace by default which completely ignores that particular setting.

With the cgroup namespace, root in the container is allowed to do anything it wants to the /sys/fs/cgroup tree.

root@disco:~# mkdir /sys/fs/cgroup/freezer/snap.blah
root@disco:~# chown 1000:1000 /sys/fs/cgroup/freezer/snap.blah
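For the curious, the namespace Stéphane describes can be observed from inside the container (a sketch; the inode number and cgroup layout will vary by system):

```shell
# Inside a cgroup namespace, the container sees its own cgroup root:
# /proc/self/cgroup shows paths relative to that root rather than the
# host's full paths, which is why root may mkdir/chown there freely.
readlink /proc/self/ns/cgroup   # e.g. cgroup:[4026532716]
head -n1 /proc/self/cgroup      # e.g. "0::/" on a pure cgroup2 system
```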

The error also quite clearly comes from udev rather than anything cgroup related:

root@disco:~# snap install hello-world
error: cannot perform the following tasks:
- Setup snap "core" (6531) security profiles (cannot setup udev for snap "core": cannot reload udev rules: exit status 2
udev output:
)
- Setup snap "core" (6531) security profiles (cannot reload udev rules: exit status 2
udev output:
)
root@disco:~# snap install hello-world
2019-03-27T20:18:56Z INFO Waiting for restart...
hello-world 6.3 from Canonical✓ installed
root@disco:~#

Revision history for this message
Marco Trevisan (Treviño) (3v1n0) wrote :

I was not doing this in LXD but in an unprivileged LXC container (not sure if that changes things) that I have on my QNAP NAS; without it I wasn't able to use snap at all.

I guess it reduces security, but eventually I'm still protected by the container itself.

Revision history for this message
Stéphane Graber (stgraber) wrote :

Yeah, unprivileged LXC is likely to work pretty differently in the way it handles both cgroups and apparmor namespacing, both of which are very relevant when you want to run snaps.

Revision history for this message
Stéphane Graber (stgraber) wrote :

At the last engineering sprint, Zygmunt on the snapd team indicated that this was, or would soon be, sorted out in snapd.

Changed in lxd (Ubuntu):
status: New → Invalid
Revision history for this message
Ian Johnson (anonymouse67) wrote :

For reference, the PR that Zygmunt had which was planned to fix this was https://github.com/snapcore/snapd/pull/8219, but there were issues with that approach. We need to pick it up again and rework it into an approach that matches Jamie's comments there.

Revision history for this message
Alireza Nasri (sysnasri) wrote :

When will this be fixed?

Revision history for this message
L&L (bass1957) wrote :

+1
impacting ffmpeg snap when sharing a gpu

Revision history for this message
Marco Trevisan (Treviño) (3v1n0) wrote :

Looks like the error message is quite misleading... Installing and running snaps in privileged containers works quite well; the problem is that apparently udev needs `/lib/modules/` to be available.

In fact, in a completely new privileged LXD instance:

ubuntu@ubuntu-bp:~$ sudo snap install hello
error: cannot perform the following tasks:
- Setup snap "core" (11993) security profiles (cannot reload udev rules: exit status 2
udev output:
)
ubuntu@ubuntu-bp:~$ sudo mkdir /lib/modules
ubuntu@ubuntu-bp:~$ sudo snap install hello
Download snap "core" (11993) from channel "stable" \error: change finished in status "Undone" with no error message
ubuntu@ubuntu-bp:~$ sudo snap install hello
2021-12-02T13:36:05Z INFO Waiting for automatic snapd restart...
hello 2.10 from Canonical✓ installed
ubuntu@ubuntu-bp:~$ hello
Hello, world!

So I think this issue is really easy to fix: we just need to ensure that directory is there.

Revision history for this message
Ian Johnson (anonymouse67) wrote :

@Marco, can you reproduce that behaviour without creating the directory? I.e. just start a new instance and then run `snap install hello` twice and see if it works? AFAIK the workaround of choice has always been to just run it twice initially, for some reason...

Revision history for this message
Marco Trevisan (Treviño) (3v1n0) wrote :

IIRC it wasn't working even when run multiple times, but let me try again and report back if that changes things.

Revision history for this message
Marco Trevisan (Treviño) (3v1n0) wrote :

So yeah, it looks like the modules dir was just unneeded. I was creating it because in the past I needed access to the (host) fuse module, but it's not really needed for this.

So, yeah... it always works on the 2nd try.

Revision history for this message
Jaume Sabater (jsabater) wrote (last edit):

As far as I have been able to test, you don't actually need to create the /lib/modules directory; you just need to run the command twice. At least that is what I have been doing in unprivileged containers on Proxmox 7.1 with the FUSE and Nesting options activated (using Debian 11):

~# apt install gnupg fuse squashfuse snapd

Shutdown container. Start again.

~# snap install core
error: cannot perform the following tasks:
- Setup snap "core" (11993) security profiles (cannot reload udev rules: exit status 1
udev output:
Failed to send reload request: No such file or directory
)
~# snap install core
core 16-2.52.1 from Canonical✓ installed

Revision history for this message
Jaume Sabater (jsabater) wrote :

In case it's of any use, I have made the following test on Proxmox 7.1 with LXC:

1. Create an empty LXC (hostname: test) based on Proxmox's Debian 11 template
2. Configure locales, APT mirror and update packages. Reboot.
3. Shutdown LXC, activate Options: Features: FUSE, start container.
4. apt install gnupg fuse squashfuse snapd
5. Shut down container. Start container.
6. snap install core: error (cannot reload udev rules).
7. snap install core: installed fine.
8. snap install --classic certbot: installed fine.

Error log:

error: cannot perform the following tasks:
- Setup snap "core" (11993) security profiles (cannot reload udev rules: exit status 1
udev output:
Failed to send reload request: No such file or directory
)

Is there anything that can be done prior to executing "snap install core" for the first time to prevent the error from happening?

Am I doing it wrong by installing snapd from the APT repo then using "snap install core"?

Thanks in advance.

Revision history for this message
fuomag (fuomag) wrote :

Running `sudo snap install certbot --classic` twice worked for me, strange! (It failed the first time but not the second.)

Revision history for this message
mike (atatimelikethis) wrote :

## Issue

To the extent this helps anyone rediscovering this problem...

Environment: the penguin container in the Chrome OS termina VM, running Debian 11 with a zsh shell.

`cannot setup udev for snap "core": cannot run udev triggers: exit status 1`

Running through various snap debugging steps, there was also a state.json error:

`snap error: cannot read the state file: open state.json: no such file or directory`

## Solution

**Note: this will delete information about previous snap installs.**

from [snapcraft forum](https://forum.snapcraft.io/t/snapd-cannot-run-daemon-cannot-read-state-unexpected-eof/17908/2)

* check state.json
* delete state.json
* restart and do a new snap install of core

### Check state.json
`sudo ls -lah /var/lib/snapd/state.json`
output should be something like
`-rw------- 1 root root 38K Jun 11 11:56 /var/lib/snapd/state.json`

### Delete state.json
You can `mv`-rename the old file if you wish. I simply ran
`sudo rm /var/lib/snapd/state.json`

### Restart Container
You can exit and reboot Chrome OS, or `sudo reboot`; or, if you access the container via LXC, log out, stop/start the container from the host, and log back in.

Then simply run
`sudo snap install core`

Worked for me - good luck

Revision history for this message
Andre Ruiz (andre-ruiz) wrote :

I'm still hitting this in Jammy (22.04) with LXD snap v5.6: when doing a `snapcraft pack` using LXD, it stops with this error right after creating the container. Running the pack again does succeed.

Important part of the error is:

* Command standard error output: b'error: cannot perform the following tasks:\n- Setup snap "snapd" (17336) security profiles (cannot reload udev rules: exit status 1\nudev output:\nFailed to send reload request: No such file or directory\n)\n'

Complete error here: https://pastebin.canonical.com/p/R89vpKFbrf/

This breaks automated builds for my snaps, especially the ones with multiple architectures in the same snapcraft.yaml file, since it breaks for each one inside the same build.

Revision history for this message
Andre Ruiz (andre-ruiz) wrote :

Any news on this? The bug turned 5 years old a few months back. It's still very relevant and still breaking automated snap builds (and requires many workarounds in automation).

Revision history for this message
Quoc (quocdung1974) wrote :

Same issue in an Ubuntu Jammy (22.04) LXD container when trying to install the Cinnamon desktop, which includes the Firefox snap:

Preparing to unpack .../firefox_1%3a1snap1-0ubuntu2_amd64.deb ...
=> Installing the firefox snap
==> Checking connectivity with the snap store
==> Installing the firefox snap
error: cannot perform the following tasks:
- Setup snap "firefox" (2487) security profiles (cannot setup udev for snap "firefox": cannot reload udev rules: exit status 1
udev output:
Failed to send reload request: No such file or directory
)
- Setup snap "firefox" (2487) security profiles (cannot reload udev rules: exit status 1
udev output:
Failed to send reload request: No such file or directory
)
- Setup snap "firefox" (2487) security profiles for auto-connections (cannot reload udev rules: exit status 1
udev output:
Failed to send reload request: No such file or directory
)
dpkg: error processing archive /var/cache/apt/archives/firefox_1%3a1snap1-0ubuntu2_amd64.deb (--unpack):

new firefox package pre-installation script subprocess returned error exit status 1
Errors were encountered while processing:

/var/cache/apt/archives/firefox_1%3a1snap1-0ubuntu2_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
