-
Notifications
You must be signed in to change notification settings - Fork 2.2k
fix rootfs propagation mode to shared / unbindable #4724
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
# Set rootfsPropagation to shared | ||
update_config ' .linux.rootfsPropagation = "shared" ' | ||
|
||
update_config ' .process.args = ["sh", "-c", "findmnt --noheadings -o PROPAGATION /"] ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to wrap findmnt
in sh
} | ||
|
||
@test "runc run [rootfsPropagation shared]" { | ||
# Set rootfsPropagation to shared |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is obvious from the code and does not need to be explained in a code comment line
# Run the container and capture the output | ||
runc run test_shared_rootfs | ||
[ "$status" -eq 0 ] | ||
[[ "$output" == *"shared"* ]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[[ "$output" == *"shared"* ]] | |
[ "$output" == "shared" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strictly speaking this should be
[ "$output" = "shared" ]
(i.e. single =
), since test
(aka [
) only support =
for string comparison (unlike [[
which supports ==
).
libcontainer/rootfs_linux.go
Outdated
@@ -232,6 +236,27 @@ func prepareRootfs(pipe *syncSocket, iConfig *initConfig) (err error) { | |||
return nil | |||
} | |||
|
|||
// adjustRootMountPropagation applies mount propagation flags such as MS_SHARED or MS_UNBINDABLE | |||
// to the root mount after the pivot_root or chroot has taken place. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a comment to explain why this cannot be done in prepareRoot
:
runc/libcontainer/rootfs_linux.go
Lines 1012 to 1027 in 3d8a278
func prepareRoot(config *configs.Config) error { | |
flag := unix.MS_SLAVE | unix.MS_REC | |
if config.RootPropagation != 0 { | |
flag = config.RootPropagation | |
} | |
if err := mount("", "/", "", uintptr(flag), ""); err != nil { | |
return err | |
} | |
if err := rootfsParentMountPrivate(config.Rootfs); err != nil { | |
return err | |
} | |
return mount(config.Rootfs, config.Rootfs, "bind", unix.MS_BIND|unix.MS_REC, "") | |
} | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a comment.
This issue has a lot of history, I wonder if @cyphar or @kolyshkin would like to take a look |
Please squash and sign off the commits |
24745a2
to
897ae54
Compare
897ae54
to
bb317a0
Compare
This sounds reasonable to me. Do we know what crun is doing? Also, @cyphar , do you want to take a look? |
The behavior of shared mount propagation, as it relates to both Runc$ podman info --format {{.Host.OCIRuntime.Version}}
runc version 1.3.0-rc.1+dev
commit: bb317a03
spec: 1.2.1
go: go1.23.0
libseccomp: 2.5.5
$ mkdir ~/hoge && mount -t tmpfs tmpfs ~/hoge
$ podman run -itd --privileged --name testcon --volume ~/hoge:/hoge:z,shared fedora-minimal:42 /bin/bash
babefd2d44a9d237a236482ddf593b578e8d56225c3f42220c582e484a779620
$ cat /var/lib/containers/storage/overlay-containers/babefd2d44a9d237a236482ddf593b578e8d56225c3f42220c582e484a779620/userdata/config.json | jq .linux.rootfsPropagation
"shared"
$ podman exec testcon mkdir /test
$ podman exec testcon mount -t tmpfs tmpfs /test
$ podman exec testcon findmnt -o "TARGET,PROPAGATION"
TARGET PROPAGATION
/ shared
|-/test shared
|-/hoge shared
|-/proc private
|-/dev private
| |-/dev/console private
| |-/dev/pts private
| |-/dev/mqueue private
| `-/dev/shm private
|-/sys private
| `-/sys/fs/cgroup private
|-/etc/resolv.conf private
|-/etc/hosts private
|-/run/.containerenv private
`-/etc/hostname private With crun$ podman info --format {{.Host.OCIRuntime.Version}}
crun version 1.14.1
commit: de537a7965bfbe9992e2cfae0baeb56a08128171
rundir: /run/crun
spec: 1.0.0
+SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +WASM:wasmedge +YAJL
$ podman run -itd --privileged --name testcon --volume ~/hoge:/hoge:z,shared fedora-minimal:42 /bin/bash
90a94c7a21000c8a83389ae085c4846a505da2e2891ab25f5b0e325f79de1f62
$ cat /var/lib/containers/storage/overlay-containers/90a94c7a21000c8a83389ae085c4846a505da2e2891ab25f5b0e325f79de1f62/userdata/config.json | jq .linux.rootfsPropagation
"shared"
$ podman exec testcon mkdir /test
$ podman exec testcon mount -t tmpfs tmpfs /test
$ podman exec testcon findmnt -o "TARGET,PROPAGATION"
TARGET PROPAGATION
/ shared
|-/test shared
|-/hoge shared
|-/proc private
|-/dev private
| |-/dev/console private
| |-/dev/pts private
| |-/dev/mqueue private
| `-/dev/shm private
|-/sys private
| `-/sys/fs/cgroup private
|-/run/.containerenv private
|-/etc/hostname private
|-/etc/resolv.conf private
`-/etc/hosts private As shown above, the result of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
libcontainer/rootfs_linux.go
Outdated
return nil | ||
} | ||
|
||
if err := unix.Mount("", "/", "", flag, ""); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use rootPropagation
directly?
libcontainer/rootfs_linux.go
Outdated
if err := adjustRootMountPropagation(uintptr(config.RootPropagation)); err != nil { | ||
return err | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that we can use config.RootPropagation directly, and not use a separate function.
Something like
// Apply root mount propagation flags.
if config.RootPropagation != 0 {
if err := mount("", "/", "", uintptr(flag), ""); err != nil {
return fmt.Errorf("unable to apply root propagation flags: %w", err )
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kolyshkin @lifubang
Thank you for the review. I've fixed.
bb317a0
to
1b28a8a
Compare
libcontainer/rootfs_linux.go
Outdated
// to the current root ("/"), and not to the old rootfs before it becomes "/". Applying the | ||
// flag in prepareRoot would affect the host mount namespace if the container's | ||
// root mount is shared. | ||
if config.RootPropagation != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we add a unix.MS_PRIVATE
comparison, it will reduce a mount syscall.
if config.RootPropagation != 0 && config.RootPropagation & unix.MS_PRIVATE == 0 {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
1b28a8a
to
ecec73d
Compare
Signed-off-by: Yusuke Sakurai <yusuke.sakurai@3-shake.com>
ecec73d
to
04be81b
Compare
This PR adds support for applying mount propagation settings (MS_SHARED or MS_UNBINDABLE) to the container root based on the value of config.RootPropagation.
We apply mount propagation after executing pivot_root and rootfsParentMountPrivate
Fixes #1755
Related:
#1815
youki-dev/youki#3141
Signed-off-by: Yusuke Sakurai yusuke.sakurai@3-shake.com