-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Fix user namespace validation for containers in pods #27413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: 0xDVC The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
6059c1f to
5a7b73a
Compare
|
[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore. |
2 similar comments
|
[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore. |
|
[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore. |
88b5145 to
d5700bc
Compare
libpod/container_validate.go
Outdated
| // Linux requires containers sharing network or IPC namespaces (like in a pod) to use the same | ||
| // user namespace. Trying to use different ones will fail at the kernel level. | ||
| // Most of this is already checked in pkg/specgen/generate/namespaces.go before pod containers drop their ID mappings. | ||
| func (c *Container) validateUserNSInPod() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not my area of expertise, but what am I missing here? All conditions including the default return nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, totally missed removing that. Earlier in development, I was focusing on libpod/container_validate.go as suggested when I started work. I realized that by the time that call runs, the ID mappings had already been dropped (in namespaces.go lines 265-269), so I would have missed catching the error. namespaces.go is where the actual validation is happening.
removing it now.
181f6ca to
502a145
Compare
|
Tests are unhappy, I'm not sure what's what with them. I'm going to try a rerun of the failed ones. |
Literally😂. For most, quite a learning experience. The rest? Got me at a choke hold. |
1af8d52 to
aba6c69
Compare
pkg/specgen/generate/namespaces.go
Outdated
| // Linux requires containers sharing network or IPC namespaces (like in a pod) | ||
| // to use the same user namespace. Trying to use different ones will fail at the kernel level. | ||
| func userNSConflictsWithPod(pod *libpod.Pod, mode specgen.NamespaceMode) error { | ||
| if pod != nil && pod.HasInfraContainer() && (pod.SharesIPC() || pod.SharesNet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are the network and ipc namespaces special cased here? I don't think the comment is right, it doesn't fail it is just not resulting in a proper setup as the namespace ownerships isn't right wich can cause permission problems later.
I see no reason to special case namespace types here, this should be blocked for all namespaces.
cc @mheon @giuseppe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should work fine with crun, there is some code to handle joining namespaces not owned by the target user namespace. So it is mostly a decision whether we want to allow it or not (I think we should).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@giuseppe Are you saying we should allow mixing userns in containers that are part of the same pod?
AFAICT we added this limitation from the beginning #10589 just not properly enforced.
While I agree it can start doesn't it totally break the capabilities as the ownership is wrong, i.e. similar to described here https://blog.podman.io/2023/12/interaction-between-user-namespaces-and-capabilities/
Additionally files in /proc/sys will have the wrong ownership:
$ podman pod create --pod-id-file /tmp/pod1 --userns keep-id
55dff018031a5f8282983a7657e815633756b2adc47049cd6055631429f20d9f
$ podman run --pod-id-file /tmp/pod1 --rm quay.io/libpod/testimage:20241011 ls -l /proc/sys/net/
total 0
dr-xr-xr-x 1 root root 0 Nov 4 11:22 core
dr-xr-xr-x 1 root root 0 Nov 4 11:22 ipv4
dr-xr-xr-x 1 root root 0 Nov 4 11:22 ipv6
dr-xr-xr-x 1 root root 0 Nov 4 11:22 mptcp
dr-xr-xr-x 1 root root 0 Nov 4 11:22 netfilter
dr-xr-xr-x 1 root root 0 Nov 4 11:22 unix
$ podman run --pod-id-file /tmp/pod1 --rm --userns host quay.io/libpod/testimage:20241011 ls -l /proc/sys/net/
total 0
dr-xr-xr-x 1 bin bin 0 Nov 4 11:22 core
dr-xr-xr-x 1 bin bin 0 Nov 4 11:22 ipv4
dr-xr-xr-x 1 bin bin 0 Nov 4 11:22 ipv6
dr-xr-xr-x 1 bin bin 0 Nov 4 11:22 mptcp
dr-xr-xr-x 1 bin bin 0 Nov 4 11:22 netfilter
dr-xr-xr-x 1 bin bin 0 Nov 4 11:22 unix
So this seems like a major problem as most users won't be aware of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when we first added the feature, it was not possible to mix namespaces (it requires the new mount API in the kernel), so that was the root reason for not allowing it.
Sure you get something weird, like files ownership, but that might be useful in some cases for power users? IIRC Kubernetes doesn't allow it, and they are discussing whether allow netns=host with user namespaces.
pkg/specgen/generate/namespaces.go
Outdated
| case specgen.Private: | ||
| if err := userNSConflictsWithPod(pod, s.UserNS.NSMode); err != nil { | ||
| return nil, err | ||
| } | ||
| case specgen.Auto: | ||
| if err := userNSConflictsWithPod(pod, s.UserNS.NSMode); err != nil { | ||
| return nil, err | ||
| } | ||
| case specgen.NoMap: | ||
| if err := userNSConflictsWithPod(pod, s.UserNS.NSMode); err != nil { | ||
| return nil, err | ||
| } | ||
| case specgen.Path: | ||
| if err := userNSConflictsWithPod(pod, s.UserNS.NSMode); err != nil { | ||
| return nil, err | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a bit inelegant and does not scale well. IT could be made much simpler by moving this out of the switch case.
since you already check the userns mode != FromPod in userNSConflictsWithPod() all you need to do is call userNSConflictsWithPod(pod,s.UserNS.NSMode) once before or after this switch case block.
test/e2e/create_test.go
Outdated
| cleanup := podmanTest.Podman([]string{"pod", "rm", "-f", "testing123"}) | ||
| cleanup.WaitWithDefaultTimeout() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is missing a Expect(session).Should(Exit(0)) to make sure the command worked.
New test code however should just use PodmanExitCleanly which combines this all into a one liner
podmanTest.PodmanExitCleanly("pod", "rm", "-f", "testing123")
test/e2e/create_test.go
Outdated
| cleanup = podmanTest.Podman([]string{"pod", "rm", "-f", "testing1234"}) | ||
| cleanup.WaitWithDefaultTimeout() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need to do final cleanup in e2e tests so this can be omitted
| done | ||
|
|
||
| # --userns and --pod have a different error message format | ||
| run_podman pod create --name userns-pod-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't hard code names in system tests, this test runs in parallel with other so names can conflict
use podname=p-$(safename) and then use this var as name to create/remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved
pkg/specgen/generate/namespaces.go
Outdated
| if pod != nil && pod.HasInfraContainer() && (pod.SharesIPC() || pod.SharesNet()) { | ||
| // FromPod mode is allowed - container inherits from pod | ||
| if mode != specgen.FromPod { | ||
| return fmt.Errorf("cannot set user namespace mappings that differ from pod: %w", define.ErrInvalidArg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking at the error mappings might be confusing, cannot set user namespace mode when joining pod with infra container is more clear I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the feedback. will resolve all at its earliest
21962d1 to
36709ee
Compare
Honny1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM. I have only one non-blocking style nit.
- remove old CLI validation that only checked --pod flag - add validation in namespaces.go to catch all paths (cli, quadlet, api) - block userns mixing for all pods with infra, not just ipc/net - update error message to be clearer - fix test cleanup to use PodmanExitCleanly() - use dynamic pod names in system tests to avoid conflicts fixes containers#26848 Signed-off-by: 0xdvc <neilohene@gmail.com>
2c5aaa2 to
afc6464
Compare
Remove incomplete CLI validation that only checked --pod flag and missed --pod-id-file (used by quadlet). Move validation to libpod/container_validate.go to catch all cases where --userns is set with --pod.
The new validation checks if container's ID mappings differ from the pod's
infra container and returns a clearer error message:
'cannot set user namespace mappings that differ from pod'
Fixes: #26848
Checklist
Ensure you have completed the following checklist for your pull request to be reviewed:
commits. (
git commit -s). (If needed, usegit commit -s --amend). The author email must matchthe sign-off email address. See CONTRIBUTING.md
for more information.
Fixes: #00000in commit message (if applicable)make validatepr(format/lint checks)Noneif no user-facing changes)Does this PR introduce a user-facing change?