I brought this up among my colleagues in the security team and we kicked around a few ideas.
It seems likely that the host /etc/group doesn't match the rootfs /etc/group file. tar writes both the numeric uid/gid and the user/group names to tarballs, and the extraction process will use the names where it can and fall back to the numbers where it can't.
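To see the two sets of owner fields side by side, here is a minimal sketch that uses Python's tarfile module as a stand-in for GNU tar. The member name, the owner names and the numeric IDs are all made up for illustration; nothing here comes from the actual rootfs.

```python
import io
import tarfile


def build_archive() -> bytes:
    """Write a one-member ustar archive carrying both names and numeric IDs."""
    payload = b"hello\n"
    info = tarfile.TarInfo(name="etc/example.conf")  # hypothetical member
    info.size = len(payload)
    info.uid, info.gid = 1000, 117                   # hypothetical numeric IDs
    info.uname, info.gname = "alice", "ssl-cert"     # hypothetical names
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_owners(data: bytes):
    """Return (name, uid, gid, uname, gname) for every member in the archive."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r") as tar:
        return [(m.name, m.uid, m.gid, m.uname, m.gname) for m in tar.getmembers()]


archive = build_archive()
owners = read_owners(archive)
for row in owners:
    print(row)
```

Both the numbers and the names come back from the header, so an extractor that prefers names will look "ssl-cert" up in whatever /etc/group it happens to be running against.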
https://www.gnu.org/software/tar/manual/html_node/Standard.html
"The magic field indicates that this archive was output in the P1003 archive format. If this field contains TMAGIC, the uname and gname fields will contain the ASCII representation of the owner and group of the file respectively. If found, the user and group IDs are used rather than the values in the uid and gid fields."
Our guesses at ways to handle this:
- use --numeric-owner when creating the tarballs to skip the text names
- use --group-map when creating the tarballs to map the rootfs names/numbers to host names/numbers
- set up a new user namespace and a new mount namespace, and bind-mount the rootfs /etc/passwd and /etc/group into the new namespace, so the tar process's getpwent, getgrent calls will reflect the new root filesystem.
- run the tar from a chroot within the rootfs
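As a rough illustration of what the first option changes in the output, the sketch below again uses Python's tarfile module rather than GNU tar. It blanks the text names while packing, which is roughly analogous to creating the archive with --numeric-owner. The helper names strip_names and pack_numeric_only and the throwaway directory are hypothetical, not part of any existing tooling.

```python
import io
import os
import tarfile
import tempfile


def strip_names(info: tarfile.TarInfo) -> tarfile.TarInfo:
    """Blank the text owner/group names so only numeric uid/gid remain."""
    info.uname = ""
    info.gname = ""
    return info


def pack_numeric_only(root: str) -> bytes:
    """Archive the tree under root, passing every member through strip_names."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(root, arcname=".", filter=strip_names)
    return buf.getvalue()


# Build a tiny throwaway tree standing in for the rootfs.
with tempfile.TemporaryDirectory() as rootfs:
    os.makedirs(os.path.join(rootfs, "etc"))
    with open(os.path.join(rootfs, "etc", "group"), "w") as fh:
        fh.write("root:x:0:\n")
    data = pack_numeric_only(rootfs)

# Read the archive back and collect any member that still carries a name.
with tarfile.open(fileobj=io.BytesIO(data)) as tar:
    members = tar.getmembers()
    names_left = [(m.name, m.uname, m.gname) for m in members if m.uname or m.gname]
print(names_left)
```

With the names gone, whoever extracts the archive has nothing to look up and must use the numeric IDs, so the host's /etc/group can no longer influence the result.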
I think these are in rough preference order, but there might be huge consequences to omitting the text names from tarballs that I am completely unaware of. That's the largest change in what we produce. The --group-map and bind-mount ideas are minimal changes to the tools and to what's produced, and the last one is probably too reliant on the image actually having a suitable tar command.
Thanks