We are seeing a slowdown in jobs running in vexxhost since 03/03: image builds, container access/modification, and tempest timeouts.
For example, the standalone deploy is failing with:
FATAL | Capture the update repos and installed rpms | localhost | error={"changed": true, "cmd": "buildah run 192.168.24.1-working-container-3 yum list installed > /var/log/container_info.log\n", "delta": "0:00:30.079656", "end": "2022-03-04 15:57:03.241852", "msg": "non-zero return code", "rc": 1, "start": "2022-03-04 15:56:33.162196", "stderr": "time=\"2022-03-04T15:56:56Z\" level=error msg=\"did not get container create message from subprocess: read |0: i/o timeout\"\nerror running container: write containercreatepipe: broken pipe\nerror while running runtime: exit status 1", "stderr_lines": ["time=\"2022-03-04T15:56:56Z\" level=error msg=\"did not get container create message from subprocess: read |0: i/o timeout\"", "error running container: write containercreatepipe: broken pipe", "error while running runtime: exit status 1"], "stdout": "", "stdout_lines": []}
logs:
https://logserver.rdoproject.org/82/40082/2/check/periodic-tripleo-ci-centos-9-standalone-full-tempest-scenario-master/79eae19/logs/undercloud/var/log/tripleo-container-image-prepare.log.txt.gz
This shows up in some of the c9 standalone jobs, c8 standalone jobs, and component jobs. Examples:
periodic-tripleo-ci-centos-9-standalone-full-tempest-scenario-master
periodic-tripleo-ci-centos-9-scenario003-standalone-common-master
periodic-tripleo-ci-centos-9-standalone-tempest-master
In centos9 jobs we see a diff in podman versions, not sure if that is relevant:
Podman version in a passing log:
podman-3.4.5-0.7.el9.x86_64
podman-catatonit-3.4.5-0.7.el9.x86_64
Podman version in failing log:
podman-4.0.0-6.el9.x86_64
podman-catatonit-4.0.0-6.el9.x86_64
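To check whether anything besides podman changed between a passing and a failing run, the installed rpm lists from the two logs could be compared with a rough sketch like the one below. It assumes the package strings read as podman-3.4.5-0.7.el9.x86_64 / podman-catatonit-3.4.5-0.7.el9.x86_64 (passing) and podman-4.0.0-6.el9.x86_64 / podman-catatonit-4.0.0-6.el9.x86_64 (failing), and that every entry follows the usual name-version-release.arch layout. The helper names (parse_nvra, changed_packages) are made up for illustration and are not part of any CI tooling.

```python
import re

# Package strings as read from the passing and failing logs (assumed values).
PASSING = [
    "podman-3.4.5-0.7.el9.x86_64",
    "podman-catatonit-3.4.5-0.7.el9.x86_64",
]
FAILING = [
    "podman-4.0.0-6.el9.x86_64",
    "podman-catatonit-4.0.0-6.el9.x86_64",
]

# name-version-release.arch; the name itself may contain dashes.
NVRA = re.compile(
    r"^(?P<name>.+)-(?P<version>[^-]+)-(?P<release>[^-]+)\.(?P<arch>[^.]+)$"
)


def parse_nvra(package):
    """Split an rpm package string into name, version, release and arch."""
    match = NVRA.match(package)
    if match is None:
        raise ValueError(f"not a name-version-release.arch string: {package!r}")
    return match.groupdict()


def changed_packages(passing, failing):
    """Return {name: (old, new)} for packages whose version-release differs."""
    old = {p["name"]: p for p in map(parse_nvra, passing)}
    new = {p["name"]: p for p in map(parse_nvra, failing)}
    changes = {}
    # Only packages present in both lists are compared.
    for name in sorted(old.keys() & new.keys()):
        before = f'{old[name]["version"]}-{old[name]["release"]}'
        after = f'{new[name]["version"]}-{new[name]["release"]}'
        if before != after:
            changes[name] = (before, after)
    return changes


if __name__ == "__main__":
    for name, (before, after) in changed_packages(PASSING, FAILING).items():
        print(f"{name}: {before} -> {after}")
```

Feeding it the full rpm lists from both job logs instead of the two hard-coded entries would show whether the podman bump is the only difference between the runs.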