bindmounting /dev in containers might break /dev/pts on the host

Bug #1950176 reported by David Vallee Delisle
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Cédric Jeanneret

Bug Description

Related: https://bugzilla.redhat.com/show_bug.cgi?id=2021203

Description of problem:

When forking a pty as a regular user, opening /dev/ptmx for read/write fails, since on this host it's just a symlink to /dev/pts/ptmx.

In Fedora, /dev/ptmx is its own node.

[1] centos 9
[2] fedora
[3] strace
[4] reproducer script

After further investigation: the OpenShift product has seen a similar issue in Bug 1950408, which resulted in this article [a] and this doc change [b].

Since this issue starts to happen only after the undercloud installation is completed, we can presume that some OpenStack containers have bad mounts [5].

In Bugzilla 1950408 comment 18 [c], it's recommended that, if we need to bind mount /dev, we should also add a mount type=devpts,destination=/dev/pts.

It appears that tripleo doesn't have the ability to pass mount types.

[a] https://access.redhat.com/solutions/6205892
[b] https://github.com/openshift/openshift-docs/pull/34464/commits/b137fa8a5c9f7c715c9c8f2e8264008b048f1b59
[c] https://bugzilla.redhat.com/show_bug.cgi?id=1950408#c18

Steps to Reproduce:
1. Run script [4]

Workaround:
rm /dev/ptmx
mknod -m 666 /dev/ptmx c 5 2

or

chmod 666 /dev/pts/ptmx
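
To tell which of the two layouts a host currently has before applying either workaround, something like the following can be run as any user. This is only a sketch: the function name describe_ptmx is made up for illustration, and it reports what lstat() sees without changing anything.

```python
#!/usr/bin/env python3
import os
import stat


def describe_ptmx(path="/dev/ptmx"):
    """Describe `path` without following symlinks (hypothetical helper)."""
    try:
        st = os.lstat(path)
    except OSError as exc:
        # Covers a missing node as well as a permission or SELinux denial.
        return f"{path}: {exc.strerror}"
    if stat.S_ISLNK(st.st_mode):
        return f"{path}: symlink -> {os.readlink(path)}"
    if stat.S_ISCHR(st.st_mode):
        perms = stat.S_IMODE(st.st_mode)
        major, minor = os.major(st.st_rdev), os.minor(st.st_rdev)
        return f"{path}: char device {major},{minor} mode {perms:04o}"
    return f"{path}: unexpected file type"


# Look at both nodes discussed in this report.
for candidate in ("/dev/ptmx", "/dev/pts/ptmx"):
    print(describe_ptmx(candidate))
```

A "symlink -> pts/ptmx" line for /dev/ptmx would correspond to the broken layout in [1], while a character device 5,2 would correspond to the layout in [2].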

Additional info:
[1]
~~~
(undercloud) [stack@undercloud-0 ~]$ ls -tlra /dev/ptmx
lrwxrwxrwx. 1 root root 8 Nov 7 12:34 /dev/ptmx -> pts/ptmx
~~~

[2]
~~~
[dvd@fedora ~]$ ls -tlra /dev/ptmx
crw-rw-rw-. 1 root tty 5, 2 Nov 8 09:39 /dev/ptmx
[dvd@fedora ~]$ ls -tlra /dev/pts/ptmx
c---------. 1 root root 5, 2 Oct 25 16:50 /dev/pts/ptmx
~~~

[3]
~~~
14:48:20.542970 openat(AT_FDCWD, "/dev/ptmx", O_RDWR) = -1 EACCES (Permission denied)
14:48:20.543061 write(2, "Traceback (most recent call last):\n", 35Traceback (most recent call last):
) = 35
14:48:20.543116 write(2, " File \"/home/stack/./test.py\", line 10, in <module>\n", 53 File "/home/stack/./test.py", line 10, in <module>
) = 53
14:48:20.543168 openat(AT_FDCWD, "/home/stack/./test.py", O_RDONLY|O_CLOEXEC) = 3
14:48:20.543216 newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=437, ...}, AT_EMPTY_PATH) = 0
14:48:20.543266 ioctl(3, TCGETS, 0x7ffe22bd1f30) = -1 ENOTTY (Inappropriate ioctl for device)
14:48:20.543310 lseek(3, 0, SEEK_CUR) = 0
14:48:20.543361 fcntl(3, F_DUPFD_CLOEXEC, 0) = 4
14:48:20.543403 fcntl(4, F_GETFL) = 0x8000 (flags O_RDONLY|O_LARGEFILE)
14:48:20.543445 newfstatat(4, "", {st_mode=S_IFREG|0755, st_size=437, ...}, AT_EMPTY_PATH) = 0
14:48:20.543493 read(4, "#!/usr/bin/env python\n# Python program to explain os.openpty() method \n \n# importing os module \nimport os\n \n \n# open new pseudo-terminal pair\n# using os.openpty() method\nmaster, slave = os.openpty()\n \n \n# Get the terminal device\n# name associated with\n# file descriptor master \nname = os.ttyname(master)\nprint(name)\n \n \n# Get the terminal device\n# name associated with\n# file descriptor slave\nname = os.ttyname(slave)\nprint(name)\n", 4096) = 437
14:48:20.543542 close(4) = 0
14:48:20.543584 lseek(3, 0, SEEK_SET) = 0
14:48:20.543636 read(3, "#!/usr/bin/env python\n# Python program to explain os.openpty() method \n \n# importing os module \nimport os\n \n \n# open new pseudo-terminal pair\n# using os.openpty() method\nmaster, slave = os.openpty()\n \n \n# Get the terminal device\n# name associated with\n# file descriptor master \nname = os.ttyname(master)\nprint(name)\n \n \n# Get the terminal device\n# name associated with\n# file descriptor slave\nname = os.ttyname(slave)\nprint(name)\n", 8192) = 437
14:48:20.543689 close(3) = 0
14:48:20.543735 write(2, " master, slave = os.openpty()\n", 33 master, slave = os.openpty()
) = 33
14:48:20.543794 write(2, "PermissionError: [Errno 13] Permission denied\n", 46PermissionError: [Errno 13] Permission denied
) = 46
~~~

[4]
~~~
#!/usr/bin/env python
import os
master, slave = os.openpty()
~~~

[5]
~~~
[root@undercloud-0 ~]# podman ps -q | while read l;do echo "$l";podman inspect $l | jq -r '.[].Mounts[] | [.Source, .Destination] | @tsv' | grep -P "/dev[^\/]";done
229b0319bb29
560fa91d2da3
82bccf849215
fe0cdd41acb8
/dev /dev/
c7c11279eacb
7f842c077598
c22873137bfe
2ad7c661e2c0
8a703f6f40f9
0546bb29d745
88df372ab2af
8100522a45d4
/dev /dev
d4725a57676b
9940f74ff548
63a170016303
6d978a2fc69a
0b7824b15fe9
[root@undercloud-0 ~]# podman ps | grep -P "fe0cdd41acb8|8100522a45d4"
fe0cdd41acb8 undercloud-0.ctlplane.home.arpa:8787/tripleo_centos9/openstack-iscsid:latest kolla_start 2 days ago Up 2 days ago (healthy) iscsid
8100522a45d4 undercloud-0.ctlplane.home.arpa:8787/tripleo_centos9/openstack-ironic-conductor:latest kolla_start 2 days ago Up 2 days ago (unhealthy) ironic_conductor
~~~
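
The shell one-liner in [5] can also be written as a small Python filter over the JSON that podman inspect prints. This is a sketch only: the helper name and the sample document are invented for illustration, the Mounts/Source/Destination keys are the ones the jq expression above already relies on, and the Id key is an assumption about the inspect output.

```python
import json


def dev_bind_mounts(inspect_json):
    """Return (short_id, source, destination) for every mount whose source is /dev."""
    found = []
    for container in json.loads(inspect_json):
        # "Id" is assumed to be present; fall back to "?" if it is not.
        short_id = str(container.get("Id", "?"))[:12]
        for mount in container.get("Mounts") or []:
            source = mount.get("Source", "")
            destination = mount.get("Destination", "")
            if source.rstrip("/") == "/dev":
                found.append((short_id, source, destination))
    return found


# Hypothetical, hand-written sample in the shape of `podman inspect` output.
SAMPLE = json.dumps([
    {"Id": "a" * 64, "Mounts": [{"Source": "/dev", "Destination": "/dev/"}]},
    {"Id": "b" * 64, "Mounts": [{"Source": "/etc/hosts", "Destination": "/etc/hosts"}]},
    {"Id": "c" * 64, "Mounts": [{"Source": "/dev", "Destination": "/dev"}]},
])

for short_id, source, destination in dev_bind_mounts(SAMPLE):
    print(f"{short_id}\t{source}\t{destination}")
```

On a real host the input would come from something like `podman inspect $(podman ps -q)` instead of the sample string.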

Revision history for this message
chandan kumar (chkumar246) wrote :

I think we are also seeing a similar issue on CS9 multinode jobs:
https://logserver.rdoproject.org/26/34926/9/check/periodic-tripleo-ci-centos-9-containers-multinode-master/3bc334a/logs/undercloud/home/zuul-worker/overcloud_deploy.log.txt.gz

```
/ansible/tripleo-playbooks
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 INFO tripleoclient.utils.utils [-] Temporary directory [ /tmp/tripleo8h94vv6b ] cleaned up
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 INFO tripleoclient.utils.utils [-] Temporary directory [ /tmp/tripleo4c4oe5y1 ] cleaned up
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud [-] Exception occured while running the command: OSError: out of pty devices
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud Traceback (most recent call last):
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud File "/usr/lib/python3.9/site-packages/tripleoclient/command.py", line 34, in run
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud super(Command, self).run(parsed_args)
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud File "/usr/lib/python3.9/site-packages/osc_lib/command/command.py", line 39, in run
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud return super(Command, self).run(parsed_args)
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud File "/usr/lib/python3.9/site-packages/cliff/command.py", line 186, in run
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud return_code = self.take_action(parsed_args) or 0
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud File "/usr/lib/python3.9/site-packages/tripleoclient/v1/overcloud_deploy.py", line 1132, in take_action
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud created_env_files = self.create_env_files(
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud File "/usr/lib/python3.9/site-packages/tripleoclient/v1/overcloud_deploy.py", line 304, in create_env_files
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud self._provision_networks(parsed_args, new_tht_root,
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud File "/usr/lib/python3.9/site-packages/tripleoclient/v1/overcloud_deploy.py", line 555, in _provision_networks
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud utils.run_ansible_playbook(
2021-11-08 07:18:47 | 2021-11-08 07:18:47.814 100991 ERROR tripleoclient.v1.overcloud_deploy.DeployOvercloud File "/usr/lib/python3...

```

tags: added: alert promotion-blocker
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart-extras (master)
Revision history for this message
Cédric Jeanneret (cjeanner) wrote :

The fix consists in 3 patches:

- https://github.com/containers/ansible-podman-collections/pull/332
- one against tripleo-ansible in order to make container_puppet_config.py and tripleo_container_manage.py aware of that new "mounts" option
- one against tripleo-heat-templates in order to inject the "mounts" parameter where suited

I'm currently working on the 2 others, but they need the first one... So I can't push anything yet.

Changed in tripleo:
importance: Undecided → High
assignee: nobody → Cédric Jeanneret (cjeanner)
milestone: none → xena-3
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

@cjeanner or we could just start the containers that need /dev mounted without a tty (the podman module has tty and interactive parameters that map to the -t and -i options)

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

also, the containers that need /dev:rw mounted would need to be started with -v /dev/null:/dev/pts/ptmx:ro -v /dev/null:/dev/ptmx:ro

With :ro, a failure would be logged if something tries to use a pseudoterminal there (it really shouldn't)

Revision history for this message
Cédric Jeanneret (cjeanner) wrote :

Sooo. after some more poking around, it seems we need to act on SELinux as well.

Given the following container:
sudo podman run --volume /dev:/dev --mount type=devpts,destination=/dev/pts --rm -ti centos:8 bash

Given the following commands and output from within the container:
[root@6f64f582765e /]# ls -lZ /dev/pts/ptmx
crw-rw-rw-. 1 root root system_u:object_r:container_file_t:s0:c441,c485 5, 2 Nov 9 10:56 /dev/pts/ptmx

[root@6f64f582765e /]# ls -lZ /dev/ptmx
ls: cannot access '/dev/ptmx': Permission denied

[root@6f64f582765e /]# ls -lZ /dev/ | grep ptmx
[many read errors]
?????????? ? ? ? 0 ? ? ptmx

We can see the specific "--mount type=devpts,destination=/dev/pts" isn't working.

Switching the host to permissive, and checking the audit.log:

[root@6f64f582765e /]# ls -lZ /dev/ptmx
crw-rw-rw-. 1 root tty system_u:object_r:ptmx_t:s0 5, 2 Nov 9 10:58 /dev/ptmx

[root@tengu ~]# grep denied /var/log/audit/audit.log | grep permissive=1
type=AVC msg=audit(1636455530.402:896): avc: denied { getattr } for pid=17493 comm="ls" path="/dev/ptmx" dev="devtmpfs" ino=99 scontext=system_u:system_r:container_t:s0:c441,c485 tcontext=system_u:object_r:ptmx_t:s0 tclass=chr_file permissive=1

We therefore seem to need a new policy, allowing container_t on ptmx_t

I'm not really happy with that, I'm pretty sure there are reasons this isn't allowed by default, such as container evasion capabilities...
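
For anyone triaging similar denials, the interesting parts of an AVC record are the source type, the target type, the class and the permission. A rough sketch of pulling those out in Python follows; parse_avc is a hypothetical helper, the regular expressions only cover the record layout shown above, and this is no substitute for the regular audit tooling.

```python
import re

AVC_RE = re.compile(r"avc:\s+denied\s+\{(?P<perms>[^}]*)\}\s+for\s+(?P<rest>.*)")
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def _type_of(context):
    """An SELinux context is user:role:type[:level]; return the type part."""
    parts = (context or "").split(":")
    return parts[2] if len(parts) > 2 else None


def parse_avc(line):
    """Return the key facts of one AVC denial line, or None if it is not one."""
    match = AVC_RE.search(line)
    if match is None:
        return None
    fields = {k: v.strip('"') for k, v in FIELD_RE.findall(match.group("rest"))}
    return {
        "permissions": match.group("perms").split(),
        "source_type": _type_of(fields.get("scontext")),
        "target_type": _type_of(fields.get("tcontext")),
        "class": fields.get("tclass"),
        "path": fields.get("path"),
    }


# The denial quoted in this comment, used as sample input.
SAMPLE = (
    'type=AVC msg=audit(1636455530.402:896): avc: denied { getattr } for '
    'pid=17493 comm="ls" path="/dev/ptmx" dev="devtmpfs" ino=99 '
    'scontext=system_u:system_r:container_t:s0:c441,c485 '
    'tcontext=system_u:object_r:ptmx_t:s0 tclass=chr_file permissive=1'
)
print(parse_avc(SAMPLE))
```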

Revision history for this message
Cédric Jeanneret (cjeanner) wrote :

After some discussions with Bogdan, we might want to explore this:

- set "tty: False" in the tripleo_container_manage.py
- for a selected set of containers, override /dev/ptmx with /dev/null, such as "-v /dev/null:/dev/ptmx:rw"

Maybe using the --mount type=devpts,destination=/dev/pts would still be a good thing in order to get an actual shell in the container for some debugging. Not 100% sure though.

Revision history for this message
Julie Pichon (jpichon) wrote :

I don't like these rules much either. If there are other options, that would be good.

https://github.com/containers/podman/issues/10530 suggests that forcing permission 666 on the host might work. I see that was suggested in comment #2, but based on the comments that follow, it doesn't seem like an appropriate approach?

https://github.com/containers/podman/issues/11343#issuecomment-910810104 also looks interesting, it seems like adding a few mount options in addition to the ones suggested in comment #6 could help?

Revision history for this message
Cédric Jeanneret (cjeanner) wrote :

@Julie: not sure setting the mode to 0666 will provide any help, since we'll see the denials at the SELinux level :/.

If we provide "--tmpfs=/dev:noexec,nosuid,strictatime,mode=755,size=65536k", I think we won't get the actual /dev bind-mount as expected with the current "-v /dev:/dev".

Not really sure about the best thing to do.... Maybe we should bind-mount /dev blindly and just push the actually used content?

Changed in tripleo:
status: New → In Progress
Revision history for this message
Bogdan Dobrelya (bogdando) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by "Bogdan Dobrelya <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/817189

Revision history for this message
Bogdan Dobrelya (bogdando) wrote :

So, ironic-conductor does not really need /dev mounted, and the iscsid container is no longer used except in Wallaby. So we only have to fix the iscsid container for W.

We cannot "nullify" its /dev/ptmx bind mount, since that would break things in unpredictable ways.
We can neither change SELinux nor allow access to ptmx_t for the host /dev/ptmx, since that would violate its design purpose (UNIX98 backward compat for the Single Unix Specification). It should have nothing to do with containers; for that there is /dev/pts/ptmx.

Given that, running that iscsid container in its "native" (host) context really could be an option.

Revision history for this message
chandan kumar (chkumar246) wrote :

Another similar issue on CS9 multinode job on subnode 1
https://logserver.rdoproject.org/26/34926/13/check/periodic-tripleo-ci-centos-9-containers-multinode-master/b11a222/logs/subnode-1/home/zuul-worker/dlrn.log.txt.gz

```
2021-11-11 10:17:56 | 2021-11-11 10:17:56,267 INFO:dlrn:Using file /tmp/tmp9pq1wvvw for temporary db
2021-11-11 10:17:56 | 2021-11-11 10:17:56,556 INFO:dlrn-repositories:Getting https://github.com/rdo-packages/tripleo-ansible-distgit.git to ./data/tripleo-ansible_distro (rpm-master)
2021-11-11 10:17:56 | 2021-11-11 10:17:56,558 ERROR:dlrn-repositories:Error cloning https://github.com/rdo-packages/tripleo-ansible-distgit.git into ./data/tripleo-ansible_distro: out of pty devices
2021-11-11 10:17:56 | Traceback (most recent call last):
2021-11-11 10:17:56 | File "/home/zuul-worker/dlrn-venv/bin/dlrn", line 8, in <module>
2021-11-11 10:17:56 | sys.exit(main())
2021-11-11 10:17:56 | File "/home/zuul-worker/dlrn-venv/lib64/python3.9/site-packages/dlrn/shell.py", line 325, in main
2021-11-11 10:17:56 | project_toprocess, _, skipped = getinfo(
2021-11-11 10:17:56 | File "/home/zuul-worker/dlrn-venv/lib64/python3.9/site-packages/dlrn/shell.py", line 856, in getinfo
2021-11-11 10:17:56 | project_toprocess, skipped = pkginfo.getinfo(
2021-11-11 10:17:56 | File "/home/zuul-worker/dlrn-venv/lib64/python3.9/site-packages/dlrn/drivers/rdoinfo.py", line 149, in getinfo
2021-11-11 10:17:56 | refreshrepo(distro, distro_dir, distro_branch, local=local,
2021-11-11 10:17:56 | File "/home/zuul-worker/dlrn-venv/lib64/python3.9/site-packages/dlrn/repositories.py", line 30, in refreshrepo
2021-11-11 10:17:56 | sh.git.clone(url, path)
2021-11-11 10:17:56 | File "/home/zuul-worker/dlrn-venv/lib64/python3.9/site-packages/sh.py", line 1566, in __call__
2021-11-11 10:17:56 | return RunningCommand(cmd, call_args, stdin, stdout, stderr)
2021-11-11 10:17:56 | File "/home/zuul-worker/dlrn-venv/lib64/python3.9/site-packages/sh.py", line 805, in __init__
2021-11-11 10:17:56 | self.process = OProc(
2021-11-11 10:17:56 | File "/home/zuul-worker/dlrn-venv/lib64/python3.9/site-packages/sh.py", line 1943, in __init__
2021-11-11 10:17:56 | self._stdout_parent_fd, self._stdout_child_fd = pty.openpty()
2021-11-11 10:17:56 | File "/usr/lib64/python3.9/pty.py", line 30, in openpty
2021-11-11 10:17:56 | master_fd, slave_name = _open_terminal()
2021-11-11 10:17:56 | File "/usr/lib64/python3.9/pty.py", line 60, in _open_terminal
2021-11-11 10:17:56 | raise OSError('out of pty devices')
2021-11-11 10:17:56 | OSError: out of pty devices
```
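
A note on the error text: in the Python 3.9 pty module, pty.openpty() first tries os.openpty() and silently ignores an OSError from it, then falls back to probing legacy BSD-style pty names and raises "out of pty devices" when none can be opened. That fallback hides the original errno. A minimal sketch for surfacing it, calling os.openpty() directly (probe_pty is a made-up name):

```python
import errno
import os


def probe_pty():
    """Try to allocate a pty pair directly and report the underlying error."""
    try:
        master, slave = os.openpty()
    except OSError as exc:
        code = errno.errorcode.get(exc.errno, str(exc.errno))
        return False, f"os.openpty() failed: {code} ({exc.strerror})"
    # Allocation worked; release both ends again.
    os.close(master)
    os.close(slave)
    return True, "os.openpty() succeeded"


ok, detail = probe_pty()
print(detail)
```

On an affected host this would be expected to report the same EACCES that the strace in the bug description shows, rather than the generic message.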

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-quickstart-extras (master)

Change abandoned by "chandan kumar <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/817141

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-ci (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ci/+/817700

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-quickstart-extras (master)

Change abandoned by "chandan kumar <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/817141
Reason: in favor of https://review.opendev.org/c/openstack/tripleo-ci/+/817700

Revision history for this message
chandan kumar (chkumar246) wrote :

https://logserver.rdoproject.org/13/36713/1/check/periodic-tripleo-ci-centos-9-ovb-1ctlr_1comp-featureset002-master/a1b45c4/logs/undercloud/home/zuul-worker/overcloud_introspect.log.txt.gz

/usr/lib/python3.9/site-packages/ansible/_vendor/__init__.py:42: UserWarning: One or more Python packages bundled by this ansible-core distribution were already loaded (pyparsing). This may result in undefined behavior.
  warnings.warn('One or more Python packages bundled by this ansible-core distribution were already '
/usr/lib64/python3.9/site-packages/_yaml/__init__.py:18: DeprecationWarning: The _yaml extension module is now located at yaml._yaml and its location is subject to change. To use the LibYAML-based parser and emitter, import from `yaml`: `from yaml import CLoader as Loader, CDumper as Dumper`.
  warnings.warn(
Exception occured while running the command
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/tripleoclient/command.py", line 34, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.9/site-packages/osc_lib/command/command.py", line 39, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.9/site-packages/cliff/command.py", line 186, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.9/site-packages/tripleoclient/v2/overcloud_node.py", line 188, in take_action
    baremetal.introspect_manageable_nodes(
  File "/usr/lib/python3.9/site-packages/tripleoclient/workflows/baremetal.py", line 220, in introspect_manageable_nodes
    introspect(
  File "/usr/lib/python3.9/site-packages/tripleoclient/workflows/baremetal.py", line 173, in introspect
    utils.run_ansible_playbook(
  File "/usr/lib/python3.9/site-packages/tripleoclient/utils.py", line 685, in run_ansible_playbook
    status, rc = runner.run()
  File "/usr/lib/python3.9/site-packages/ansible_runner/runner.py", line 191, in run
    child = pexpect.spawn(
  File "/usr/lib/python3.9/site-packages/pexpect/pty_spawn.py", line 205, in __init__
    self._spawn(command, args, preexec_fn, dimensions)
  File "/usr/lib/python3.9/site-packages/pexpect/pty_spawn.py", line 303, in _spawn
    self.ptyproc = self._spawnpty(self.args, env=self.env,
  File "/usr/lib/python3.9/site-packages/pexpect/pty_spawn.py", line 315, in _spawnpty
    return ptyprocess.PtyProcess.spawn(args, **kwargs)
  File "/usr/lib/python3.9/site-packages/ptyprocess/ptyprocess.py", line 226, in spawn
    pid, fd = pty.fork()
  File "/usr/lib64/python3.9/pty.py", line 97, in fork
    master_fd, slave_fd = openpty()
  File "/usr/lib64/python3.9/pty.py", line 30, in openpty
    master_fd, slave_name = _open_terminal()
  File "/usr/lib64/python3.9/pty.py", line 60, in _open_terminal
    raise OSError('out of pty devices')
OSError: out of pty devices
out of pty devices

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-ci (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ci/+/817700
Committed: https://opendev.org/openstack/tripleo-ci/commit/72d7519942b3666658939bf99bc7129936297b0d
Submitter: "Zuul (22348)"
Branch: master

commit 72d7519942b3666658939bf99bc7129936297b0d
Author: Chandan Kumar (raukadah) <email address hidden>
Date: Fri Nov 12 11:44:19 2021 +0530

    [workaround]Set permission for /dev/pts/ptmx to 666

    When forking a pty as regular user, we get a failure to
    read/write on /dev/ptmx since it's just a symlink to /dev/pts/ptmx.

    It happens only on CS9 on all the nodes and gives error
    OSError: out of pty devices.

    Adding the workaround by setting the permission for /dev/pts/ptmx to
    666 temprorly fixes the issue in CI.

    Proper fix will come later.

    Related-Bug: #1950176

    Signed-off-by: Chandan Kumar (raukadah) <email address hidden>
    Change-Id: I1a0327b09f53beee1668a6ab4bb8380cc6b46446

Revision history for this message
yatin (yatinkarel) wrote :

It seems the issue is specific to the "crun" runtime, which is used in CentOS 9-stream; with the "runc" runtime in CentOS 8-stream the issue is not seen. Just to confirm this, I tried running a container on 9-stream with runc + cgroupv1 (as runc doesn't support cgroupv2) + /dev/ mounts, and the issue was not seen.

I see that in runc a similar issue [1] was fixed with [2][3]. On the same note, wouldn't it be better to fix it in crun itself? Not sure if that has already been evaluated or not.

[1] https://github.com/opencontainers/runc/issues/80
[2] https://github.com/opencontainers/runc/pull/96
[3] https://github.com/opencontainers/runc/pull/742

Revision history for this message
Cédric Jeanneret (cjeanner) wrote :

wondering if ensuring we're mounting /dev:/dev (instead of /dev/:/dev/) might do the trick. That was a thing in the runc patches, apparently. Running a test right now.

At least, THIS is *not* triggering the issue:

sudo podman run --runtime=crun -it -v /dev:/dev --rm --privileged --name foo quay.io/centos/centos:stream9 python3 -c 'import os; _,_ = os.openpty()'

I forgot to test with -v /dev/:/dev/ though.

Revision history for this message
Cédric Jeanneret (cjeanner) wrote :

bingo.

-v /dev:/dev : working fine
-v /dev/:/dev/ : /dev/ptmx is replaced by a symlink
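
Given that result, one defensive option is to strip trailing slashes from volume definitions before they reach podman. A tiny sketch of such a normalization, assuming the usual "source:destination[:options]" form; normalize_volume is a hypothetical helper, not something that exists in tripleo-ansible:

```python
def normalize_volume(spec):
    """Strip trailing slashes from the source and destination of a volume spec."""
    parts = spec.split(":")
    for index in (0, 1):
        # Leave a bare "/" alone; only trim longer paths.
        if index < len(parts) and len(parts[index]) > 1:
            parts[index] = parts[index].rstrip("/") or "/"
    return ":".join(parts)


for spec in ("/dev/:/dev/", "/dev:/dev", "/dev/:/dev/:ro"):
    print(f"{spec} -> {normalize_volume(spec)}")
```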

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by "Cedric Jeanneret <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/819646
Reason: zuul failed, restoring in a few.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "Cedric Jeanneret <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/819646
Reason: gate failed, restoring in a few.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/819899

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/819646
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/1f868ba5307adb6962d89c99b8570c2c09536695
Submitter: "Zuul (22348)"
Branch: master

commit 1f868ba5307adb6962d89c99b8570c2c09536695
Author: Cédric Jeanneret <email address hidden>
Date: Mon Nov 29 14:56:14 2021 +0100

    Ensure we bind-mount /dev instead of /dev/

    With the move to crun instead of runc for the container engine, we seem
    to hit a known issue that was corrected back in 2015[1] for runc. There
    was then a regression, fixed with [2] a bit later.

    There's a good chance crun has a partial fix only, matching only /dev
    and not /dev/, leading to the change of /dev/ptmx from an actual node to
    a symlink pointing to /dev/pts/ptmx.

    Another fix might be ensuring we don't have any trailing "/" in the
    volume paths passed to the tripleo-ansible/tripleo_container_manage
    module/role.

    [1] https://github.com/opencontainers/runc/pull/96
    [2] https://github.com/opencontainers/runc/pull/742/files

    Closes-Bug: #1950176

    Change-Id: I094120f7f2f6bfcfc0cc5843aa1b23629cd90a23

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/819836

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/819899
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/7a99ae23e3804fc24ece379018c68d275dd5d55d
Submitter: "Zuul (22348)"
Branch: master

commit 7a99ae23e3804fc24ece379018c68d275dd5d55d
Author: Cédric Jeanneret <email address hidden>
Date: Tue Nov 30 17:00:31 2021 +0100

    Introduce a new linter for yaml-validate, and correct issues

    This new linter ensures we don't have any trailing "/" in the container
    volume definitions.

    Those trailing "/" may create issues with the containers, for instance
    for specific mounts such as "/dev"[1].

    This patch also takes the opportunity to fix those trailing "/" for the
    affected files, in order to start on a clean basis.

    [1] https://launchpad.net/bugs/1950176

    Change-Id: If951f9643d67574c1225301aab7c9e4b0d316b7f
    Related-Bug: #1950176
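
For readers who want to run a similar check on their own templates, the idea behind such a linter can be sketched as a walk over already-parsed YAML. This is not the code from the yaml-validate change above; the function names and the sample structure are invented for illustration, and it simply flags any string found below a "volumes" key.

```python
def _has_trailing_slash(spec):
    """True if the source or destination part of a volume spec ends with '/'."""
    return any(len(part) > 1 and part.endswith("/") for part in spec.split(":")[:2])


def find_trailing_slashes(node, where="$", in_volumes=False):
    """Yield (location, value) for volume strings with a trailing '/'."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_trailing_slashes(
                value, f"{where}.{key}", in_volumes or key == "volumes")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_trailing_slashes(item, f"{where}[{index}]", in_volumes)
    elif in_volumes and isinstance(node, str) and _has_trailing_slash(node):
        yield where, node


# Hypothetical template fragment, as it might look after yaml.safe_load().
SAMPLE = {
    "outputs": {"docker_config": {"step_3": {"iscsid": {"volumes": {
        "list_concat": [
            ["/etc/hosts:/etc/hosts:ro"],
            ["/dev/:/dev/", "/run:/run"],
        ],
    }}}}},
}

for location, value in find_trailing_slashes(SAMPLE):
    print(f"{location}: {value}")
```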

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/820111

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/819836
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/0261ea22aeddbbd696754a6b58b880dcf57b42d1
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 0261ea22aeddbbd696754a6b58b880dcf57b42d1
Author: Cédric Jeanneret <email address hidden>
Date: Mon Nov 29 14:56:14 2021 +0100

    Ensure we bind-mount /dev instead of /dev/

    With the move to crun instead of runc for the container engine, we seem
    to hit a known issue that was corrected back in 2015[1] for runc. There
    was then a regression, fixed with [2] a bit later.

    There's a good chance crun has a partial fix only, matching only /dev
    and not /dev/, leading to the change of /dev/ptmx from an actual node to
    a symlink pointing to /dev/pts/ptmx.

    Another fix might be ensuring we don't have any trailing "/" in the
    volume paths passed to the tripleo-ansible/tripleo_container_manage
    module/role.

    [1] https://github.com/opencontainers/runc/pull/96
    [2] https://github.com/opencontainers/runc/pull/742/files

    Closes-Bug: #1950176

    Change-Id: I094120f7f2f6bfcfc0cc5843aa1b23629cd90a23
    (cherry picked from commit 1f868ba5307adb6962d89c99b8570c2c09536695)

tags: added: in-stable-wallaby
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/820111
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/d2bc890f30ec330dc44f194585d03262e78d19c0
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit d2bc890f30ec330dc44f194585d03262e78d19c0
Author: Cédric Jeanneret <email address hidden>
Date: Tue Nov 30 17:00:31 2021 +0100

    Introduce a new linter for yaml-validate, and correct issues

    This new linter ensures we don't have any trailing "/" in the container
    volume definitions.

    Those trailing "/" may create issues with the containers, for instance
    for specific mounts such as "/dev"[1].

    This patch also takes the opportunity to fix those trailing "/" for the
    affected files, in order to start on a clean basis.

    [1] https://launchpad.net/bugs/1950176

    Note: the backport is NOT clean:
    - a service was removed in master and needs some cleanup in wallaby:
      liquidio-compute-config-container-puppet.yaml
    - two files had some weird list with empty strings:
      - neutron-agents-ib-config-container-puppet.yam
      - neutron-mlnx-agent-container-puppet.yaml
      Those empty strings have been removed from master apparently.

    Change-Id: If951f9643d67574c1225301aab7c9e4b0d316b7f
    Related-Bug: #1950176
    (cherry picked from commit 7a99ae23e3804fc24ece379018c68d275dd5d55d)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 16.0.0

This issue was fixed in the openstack/tripleo-heat-templates 16.0.0 release.
