Description
File capabilities set from within a user namespace apparently include a user ID (that of the namespace's root user, stored alongside the capability sets in the security.capability xattr) and are then valid only if that user ID is in the current user namespace, see https://elixir.bootlin.com/linux/v6.1.42/source/security/commoncap.c#L455.
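To make the stored user ID visible, here is a minimal sketch that decodes a raw security.capability value. It assumes the layout from the kernel's `struct vfs_cap_data` / `struct vfs_ns_cap_data` (little-endian 32-bit fields, revision 3 appending a `rootid`). The function name `decode_file_caps` and the example root ID 100000 are made up for illustration.

```python
import struct

# Constants as defined in the kernel's include/uapi/linux/capability.h.
VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_REVISION_2 = 0x02000000
VFS_CAP_REVISION_3 = 0x03000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001


def decode_file_caps(blob: bytes) -> dict:
    """Decode a raw security.capability xattr value (revision 2 or 3)."""
    if len(blob) < 4:
        raise ValueError("xattr value too short")
    (magic_etc,) = struct.unpack_from("<I", blob, 0)
    revision = magic_etc & VFS_CAP_REVISION_MASK
    if revision == VFS_CAP_REVISION_2 and len(blob) == 20:
        rootid = None  # revision 2 carries no namespace root ID
    elif revision == VFS_CAP_REVISION_3 and len(blob) == 24:
        (rootid,) = struct.unpack_from("<I", blob, 20)
    else:
        raise ValueError("unsupported revision or size")
    # Two {permitted, inheritable} pairs: low 32 bits first, then high 32 bits.
    p_lo, i_lo, p_hi, i_hi = struct.unpack_from("<4I", blob, 4)
    return {
        "version": revision >> 24,
        "effective": bool(magic_etc & VFS_CAP_FLAGS_EFFECTIVE),
        "permitted": p_lo | (p_hi << 32),
        "inheritable": i_lo | (i_hi << 32),
        "rootid": rootid,
    }


# Hypothetical value: cap_setuid=ep (bit 7) written from a namespace whose
# root user is host UID 100000.
example = struct.pack("<6I", VFS_CAP_REVISION_3 | 1, 1 << 7, 0, 0, 0, 100000)
print(decode_file_caps(example))
```

On the host, the raw value could be obtained with `os.getxattr(path, "security.capability")`. From inside a user namespace the kernel may translate the value or refuse to return it, so the output there can differ.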
This is not an issue for file capabilities set from inside a container, but it is a problem for capabilities stored in container images. Container images are built in a different user namespace than the containers created from them, which makes the file capabilities invalid. Unfortunately, file capabilities are used in practice, e.g. instead of the suid bit on Fedora/CentOS:
[nix-shell:~]# getcap -r /dozer/ct/instance-077d2aad/private/
/dozer/ct/instance-077d2aad/private/usr/bin/newuidmap cap_setuid=ep
/dozer/ct/instance-077d2aad/private/usr/bin/clockdiff cap_net_raw=p
/dozer/ct/instance-077d2aad/private/usr/bin/newgidmap cap_setgid=ep
/dozer/ct/instance-077d2aad/private/usr/bin/arping cap_net_raw=p
Reading those capabilities from the container fails:
[CT instance-077d2aad] root@instance-077d2aad:/# strace getcap /usr/bin/newuidmap
[...]
getxattr("/usr/bin/newuidmap", "security.capability", NULL, 0) = -1 EOVERFLOW (Value too large for defined data type)
[...]
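As far as I can tell from the linked kernel code, the EOVERFLOW is returned when the stored root ID does not map to UID 0 in the reader's user namespace (or one of its ancestors). The sketch below models that check for a single namespace level only, using a uid_map in the "inside outside count" format of /proc/<pid>/uid_map. The function names and the map values are illustrative assumptions, not kernel or osctl code.

```python
def parse_uid_map(text):
    """Parse uid_map text into (inside, outside, count) tuples."""
    entries = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, count = (int(field) for field in line.split())
            entries.append((inside, outside, count))
    return entries


def host_to_ns_uid(host_uid, uid_map):
    """Translate a host UID into the namespace, or None if it is unmapped."""
    for inside, outside, count in uid_map:
        if outside <= host_uid < outside + count:
            return inside + (host_uid - outside)
    return None


def rootid_is_ns_root(rootid, uid_map):
    """Simplified model: the capability is usable only if rootid is UID 0 here."""
    return host_to_ns_uid(rootid, uid_map) == 0


# Hypothetical container namespace: container UID 0 is host UID 200000.
container_map = parse_uid_map("0 200000 65536\n")

# A capability written while building the image under host UID 100000 as root
# does not belong to this namespace; one written under 200000 does.
print(rootid_is_ns_root(100000, container_map))
print(rootid_is_ns_root(200000, container_map))
```

The real check also walks up through parent namespaces, which this model leaves out.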
The same issue will arise when a container is chowned into a different user namespace: all existing file capabilities will no longer be valid.
It's unclear how we could solve this. We could create a list of files with capabilities when images are built and then restore those capabilities when containers are created from the images. ct chown would still break the capabilities though. Walking through all files on existing containers to find all capabilities and preserve them is highly impractical as there can be millions of files.
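A rough sketch of the list-and-restore idea, under a few assumptions: the list is built on the host while the image is prepared, the xattr layout is the revision 2/3 format from the kernel headers, and restoring means rewriting the stored root ID to the host UID that is root in the new container. `collect_file_caps` and `rebase_rootid` are hypothetical helper names, not existing osctl functions.

```python
import errno
import os
import struct

XATTR_NAME = "security.capability"
VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_REVISION_2 = 0x02000000
VFS_CAP_REVISION_3 = 0x03000000


def collect_file_caps(root, read_xattr=os.getxattr):
    """Walk `root` and return {relative path: hex-encoded xattr value}."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # only regular files are of interest here
            try:
                value = read_xattr(path, XATTR_NAME)
            except OSError as exc:
                # No capability set, or the filesystem has no xattr support.
                if exc.errno in (errno.ENODATA, errno.ENOTSUP):
                    continue
                raise
            manifest[os.path.relpath(path, root)] = value.hex()
    return manifest


def rebase_rootid(blob, new_rootid):
    """Return a revision 3 value with the same capability sets and a new root ID."""
    if len(blob) not in (20, 24):
        raise ValueError("unsupported xattr size")
    (magic_etc,) = struct.unpack_from("<I", blob, 0)
    revision = magic_etc & VFS_CAP_REVISION_MASK
    if (revision, len(blob)) not in ((VFS_CAP_REVISION_2, 20),
                                     (VFS_CAP_REVISION_3, 24)):
        raise ValueError("unsupported revision")
    flags = magic_etc & ~VFS_CAP_REVISION_MASK  # keep e.g. the effective bit
    return (struct.pack("<I", VFS_CAP_REVISION_3 | flags)
            + blob[4:20]  # permitted/inheritable sets, unchanged
            + struct.pack("<I", new_rootid))
```

When a container is created, each manifest entry could then be written back with `os.setxattr(path, XATTR_NAME, rebase_rootid(bytes.fromhex(value), new_rootid))`, assuming the caller has enough privilege on the host for the kernel to accept the value. The walk happens once per image, so it does not address the cost of scanning existing containers on ct chown.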