Trying to checkpoint a container (docker/podman) on Ubuntu 18.04 fails starting with linux-image-5.0.0-35-generic. We (CRIU upstream) have been seeing this in Travis for a few weeks. Testing manually on a local machine shows that linux-image-5.0.0-32-generic still works and linux-image-5.0.0-35-generic no longer works. It seems to be overlayfs related, or at least that is our current belief. The CRIU error message we see is:
(00.170944) Error (criu/files-reg.c:1277): Can't lookup mount=410 for fd=-3 path=/bin/busybox
(00.170987) Error (criu/cr-dump.c:1246): Collect mappings (pid: 1637) failed with -1
We have seen this not only in Travis; multiple CRIU users have also reported the bug already. Currently we have to tell them to downgrade the kernel.
I am also able to reproduce it with linux-image-5.3.0-24-generic. Staying on the 4.18.0 kernel series does not show this error. 4.18.0-25-generic works without problems.
One possible explanation from our side is:
"Looks like we have the same as for st_dev now with mnt_id, that is bad, because we can't find on which mount to open the file if kernel hides these information from us."
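To make the explanation above more concrete, here is a minimal sketch of the kind of lookup that appears to fail: read the mnt_id the kernel reports for an open file in /proc/self/fdinfo/<fd>, then check whether that ID is listed in /proc/self/mountinfo. This is not CRIU's code. The helper names (parse_fdinfo_mnt_id, parse_mountinfo_ids, mount_is_visible) are hypothetical, and the sketch only assumes the documented layout of those two procfs files.

```python
import os


def parse_fdinfo_mnt_id(fdinfo_text):
    """Return the mnt_id value from /proc/<pid>/fdinfo/<fd> content, or None."""
    for line in fdinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "mnt_id":
            return int(value.strip())
    return None


def parse_mountinfo_ids(mountinfo_text):
    """Return the set of mount IDs (first field) from /proc/<pid>/mountinfo content."""
    ids = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if fields:
            ids.add(int(fields[0]))
    return ids


def mount_is_visible(path):
    """Open path and return (mnt_id, True if mnt_id is listed in our mountinfo).

    Hypothetical helper for illustration only; Linux-specific.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        with open(f"/proc/self/fdinfo/{fd}") as f:
            mnt_id = parse_fdinfo_mnt_id(f.read())
        with open("/proc/self/mountinfo") as f:
            known_ids = parse_mountinfo_ids(f.read())
    finally:
        os.close(fd)
    return mnt_id, mnt_id in known_ids
```

Calling mount_is_visible("/bin/busybox") inside the container's mount namespace would return the reported mnt_id and whether it can be matched to a mount. We have not verified that this reproduces the failure on the affected kernels; it only illustrates why a mnt_id that cannot be found in mountinfo leaves us unable to pick the mount on which to open the file.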
Running on the upstream 5.5.0-rc1 kernel does not show this error.
See also https://github.com/checkpoint-restore/criu/issues/860