Problem #2 is a bug in procfs that is breaking the snap-confine AppArmor profile and causing the following rule to not work as intended:
owner @{PROC}/*/mountinfo r,
procfs is incorrectly setting the uid and gid on its /proc/PID/* inodes that represent a setuid-container-root process. Here's how I came to that conclusion:
# Create a setuid-container-root program that doesn't quickly exit so you can poke around procfs
tyhicks@host:~$ container_root=$(sudo stat -c %u /var/lib/lxd/containers/yakkety/rootfs/)
tyhicks@host:~$ echo $container_root
296608
tyhicks@host:~$ sudo cp $(which sleep) /var/lib/lxd/containers/yakkety/rootfs/usr/bin/setuid-sleep
tyhicks@host:~$ sudo chown $container_root:$container_root \
/var/lib/lxd/containers/yakkety/rootfs/usr/bin/setuid-sleep
tyhicks@host:~$ sudo chmod u+s /var/lib/lxd/containers/yakkety/rootfs/usr/bin/setuid-sleep
# Run the setuid-sleep program as an unprivileged user inside the container.
# The procfs inodes for that process are incorrectly owned by init_ns root
tyhicks@host:~$ lxc exec yakkety -- su - ubuntu -c 'setuid-sleep 5m'
# From another host terminal
tyhicks@host:~$ sudo stat -c %u /proc/$(pidof setuid-sleep)/mountinfo
0
# Run the setuid-sleep program as root inside the container.
# The procfs inodes for that process are correctly owned by container_ns root
tyhicks@host:~$ lxc exec yakkety -- setuid-sleep 5m
# From another host terminal
tyhicks@host:~$ sudo stat -c %u /proc/$(pidof setuid-sleep)/mountinfo
296608
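The manual check above can be wrapped in a small helper. This is only a sketch: the function names status_fsuid and check_proc_owner are made up for illustration, and it assumes the standard "Uid:" line layout in /proc/PID/status (real, effective, saved, filesystem uid) plus GNU stat.

```shell
#!/bin/sh
# Hypothetical helpers, not part of snap-confine or LXD.

# Print the filesystem uid from a file in /proc/PID/status format.
# The "Uid:" line lists the real, effective, saved and filesystem uids.
status_fsuid() {
    awk '/^Uid:/ { print $5 }' "$1"
}

# Compare a task's fsuid with the owner of its /proc/PID/mountinfo inode.
# Returns non-zero when the two differ.
check_proc_owner() {
    pid=$1
    fsuid=$(status_fsuid "/proc/$pid/status")
    ouid=$(stat -c %u "/proc/$pid/mountinfo")
    if [ "$fsuid" = "$ouid" ]; then
        echo "match: fsuid=$fsuid ouid=$ouid"
    else
        echo "mismatch: fsuid=$fsuid ouid=$ouid"
        return 1
    fi
}
```

Running something like `sudo sh -c '. ./check.sh; check_proc_owner "$(pidof setuid-sleep)"'` from the host (check.sh being whatever file holds the functions) should report a mismatch for the unprivileged-user case.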
This explains the AppArmor denial from comment #3 containing "fsuid=296608 ouid=0". The setuid-container-root snap-confine task is correctly running as fsuid 296608 (container_ns root) but the mountinfo inode is incorrectly assigned uid 0 (init_ns root). This causes the "owner" conditional on the AppArmor rule to not match, and snap-confine segfaults after it can't access /proc/self/mountinfo.
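As a rough illustration of why the rule stops matching, here is a deliberately simplified model of the "owner" conditional. The function name owner_rule_matches is hypothetical, and real AppArmor mediation happens in the kernel, not in shell. The model only captures the comparison between the task's fsuid and the object's owner uid.

```shell
#!/bin/sh
# Simplified model: an "owner" rule applies only when the task's fsuid
# equals the uid that owns the object being accessed (ouid).
owner_rule_matches() {
    fsuid=$1
    ouid=$2
    [ "$fsuid" = "$ouid" ]
}

# Values taken from the denial quoted above.
if owner_rule_matches 296608 0; then
    echo "rule matches, read allowed"
else
    echo "rule does not match, read denied"
fi
```

With fsuid 296608 and ouid 0 the comparison fails, so the rule does not apply and the read of mountinfo is denied.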
The procfs kernel bug needs to be fixed but, in the meantime, we can probably drop the "owner" conditional in the snap-confine profile.
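For reference, the interim workaround would presumably amount to the same rule with the conditional removed. This is a sketch of the idea, not the exact change made to the shipped profile:

```
@{PROC}/*/mountinfo r,
```

Without "owner", the rule no longer depends on the uid that procfs assigns to the inode, at the cost of allowing snap-confine to read the mountinfo file of any process.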