17 changes: 17 additions & 0 deletions changelog/20250924_fix_provide_more_secure_pod_and_container.md
@@ -0,0 +1,17 @@
---
kind: fix
date: 2025-09-24
---

* To follow the [Pod Security Standards](https://v1-32.docs.kubernetes.io/docs/concepts/security/pod-security-standards/), more secure default pod `securityContext` settings were added.
  Operator deployment `securityContext` settings that changed:
- `allowPrivilegeEscalation: false`
- `capabilities.drop: [ ALL ]`
- `seccompProfile.type: RuntimeDefault`

Other workloads:
- `capabilities.drop: [ ALL ]` - container level
- `seccompProfile.type: RuntimeDefault` - pod level

> **Note**: If you require less restrictive `securityContext` settings, use `template` or `podTemplate` overrides.
> Detailed information about overrides can be found in [Modify Ops Manager or MongoDB Kubernetes Resource Containers](https://www.mongodb.com/docs/kubernetes/current/tutorial/modify-resource-image/).
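The new defaults listed above can be sketched as a small check. This is an illustrative snippet, not part of the PR; the function name is hypothetical, and the dict keys follow the Kubernetes API field names for container- and pod-level security contexts:

```python
def meets_restricted_defaults(container_ctx: dict, pod_ctx: dict) -> bool:
    """Check the fields this change touches against the 'restricted'
    Pod Security Standard: no privilege escalation, all capabilities
    dropped, and a RuntimeDefault seccomp profile (container-level
    seccompProfile falls back to the pod-level one)."""
    caps = container_ctx.get("capabilities") or {}
    seccomp = (container_ctx.get("seccompProfile")
               or pod_ctx.get("seccompProfile")
               or {})
    return (
        container_ctx.get("allowPrivilegeEscalation") is False
        and caps.get("drop") == ["ALL"]
        and seccomp.get("type") == "RuntimeDefault"
    )

# The operator defaults described in this changelog entry:
operator_container = {
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"]},
}
operator_pod = {"seccompProfile": {"type": "RuntimeDefault"}}
print(meets_restricted_defaults(operator_container, operator_pod))  # True
```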
Collaborator (on lines +6 to +17):

LGTM!

@@ -45,6 +45,9 @@ func Test_buildDatabaseInitContainer(t *testing.T) {
SecurityContext: &corev1.SecurityContext{
ReadOnlyRootFilesystem: ptr.To(true),
AllowPrivilegeEscalation: ptr.To(false),
Capabilities: &corev1.Capabilities{
Contributor @lsierant (Sep 24, 2025):

Is there any potential that adding this default will break a customer's workload (or rather prevent the operator from deploying, or the workload StatefulSet from restarting) and require some manual intervention? Just thinking about our semver guarantees.

Collaborator (Author):

I think this is our own deployment, which we manage. If the customer wants a managedSecurityContext they are allowed to set one, but otherwise we should be able to modify the one we provide.

cc @mircea-cosbuc

Member:

These are our defaults. The only problem I see with this is that some customers will now need to set capabilities explicitly.

Collaborator (Author):

Maybe this should be considered a security fix? In that case we should be able to overwrite our defaults if they are not secure, even if this forces customers to explicitly specify custom capabilities. What do you think?

Contributor @lsierant (Oct 1, 2025):

Still, do we force customers (who don't care about this) to do any manual fix when upgrading? If yes, we need to bump the major version.

Collaborator (Author) @MaciejKaras (Oct 1, 2025):

I don't understand what we would break here. We are changing our default SecurityContext for the operator and the other pods we create. If a customer wants a dedicated SecurityContext or PodSecurityContext, they need to set the MANAGED_SECURITY_CONTEXT env var, and our defaults are then skipped entirely. If they don't set MANAGED_SECURITY_CONTEXT, every manual change they make to the SecurityContext is overwritten by our defaults.

Code that handles securityContext settings:

```go
func WithDefaultSecurityContextsModifications() (Modification, container.Modification) {
	managedSecurityContext := envvar.ReadBool(ManagedSecurityContextEnv) // nolint:forbidigo
	configureContainerSecurityContext := container.NOOP()
	configurePodSpecSecurityContext := NOOP()
	if !managedSecurityContext {
		configurePodSpecSecurityContext = WithSecurityContext(DefaultPodSecurityContext())
		configureContainerSecurityContext = container.WithSecurityContext(container.DefaultSecurityContext())
	}
	return configurePodSpecSecurityContext, configureContainerSecurityContext
}
```
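The gating behavior described in this comment can be mirrored in a short Python sketch. This is hypothetical illustration only (the function name and dict shapes are assumed, not operator code): when MANAGED_SECURITY_CONTEXT is set, no defaults are applied; otherwise the operator's defaults win:

```python
import os

def default_security_contexts(env=None):
    """Illustrative mirror of the Go gating logic: if the customer sets
    MANAGED_SECURITY_CONTEXT, the operator applies no security-context
    defaults (returns no-ops); otherwise it applies its own defaults."""
    env = os.environ if env is None else env
    managed = str(env.get("MANAGED_SECURITY_CONTEXT", "")).lower() == "true"
    if managed:
        return None, None  # customer fully owns the security contexts
    pod_ctx = {
        "runAsNonRoot": True,
        "runAsUser": 2000,
        "fsGroup": 2000,
        "seccompProfile": {"type": "RuntimeDefault"},
    }
    container_ctx = {
        "readOnlyRootFilesystem": True,
        "allowPrivilegeEscalation": False,
        "capabilities": {"drop": ["ALL"]},
    }
    return pod_ctx, container_ctx

print(default_security_contexts({"MANAGED_SECURITY_CONTEXT": "true"}))  # (None, None)
```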

Collaborator (Author):

We discussed with @lsierant that the stricter Capabilities change is applied only to the db/om containers, not the whole Pod, so it will not affect other containers in the Pod (e.g. security or Istio sidecars the customer may have). The only Pod-level change is adding `seccompProfile.type: RuntimeDefault`. We can do two things with it:

  • move `seccompProfile.type: RuntimeDefault` to the container level and don't specify it at the pod level; our containers get secure seccomp settings, but any sidecar the customer adds will have no seccomp profile applied
  • keep it as is and secure the entire Pod

@mircea-cosbuc looking for guidance here on how to proceed

Member:

I think it's best to set it at pod level. Per @lsierant's point, this needs clarity on what customers might need to change on upgrade (if anything), outlining those scenarios and deciding whether it's a breaking change.

Collaborator (Author) @MaciejKaras (Oct 1, 2025):

I've checked the consequences of using `seccompProfile.type: RuntimeDefault`: the effective profile depends on the container runtime in use. containerd and Docker, for example, have very similar default seccomp profiles -> https://docs.docker.com/engine/security/seccomp/#significant-syscalls-blocked-by-the-default-profile

From the official Kubernetes docs:

> These profiles may differ between runtimes like CRI-O or containerd. They also differ for its used hardware architectures. But generally speaking, those default profiles allow a common amount of syscalls while blocking the more dangerous ones, which are unlikely or unsafe to be used in a containerized application.

Additionally, on Red Hat OpenShift Container Platform, RuntimeDefault is often enforced by default via Security Context Constraints (SCCs).

To summarise: it is unlikely that users of our Operator need more syscall permissions in MongoDB workloads than the RuntimeDefault seccomp profile allows. Nevertheless, the changelog should note how to mitigate the securityContext defaults by using managedSecurityContext.

@lsierant @mircea-cosbuc let me know if that justifies approving the PR. I have already edited the changelog.
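The precedence being discussed (a container-level seccompProfile overrides the pod-level one, and the pod-level profile otherwise covers every container, including customer sidecars) can be sketched as follows. This is an illustrative helper with an assumed name, not code from the PR:

```python
def effective_seccomp(pod_spec: dict, container: dict):
    """Resolve the seccomp profile that applies to a container:
    a container-level seccompProfile wins; otherwise the pod-level
    profile (if any) applies, which is why setting RuntimeDefault at
    pod level also covers injected sidecars."""
    container_profile = (container.get("securityContext") or {}).get("seccompProfile")
    pod_profile = (pod_spec.get("securityContext") or {}).get("seccompProfile")
    return container_profile or pod_profile

pod = {"securityContext": {"seccompProfile": {"type": "RuntimeDefault"}}}
sidecar = {"name": "istio-proxy"}  # no container-level profile
print(effective_seccomp(pod, sidecar))  # {'type': 'RuntimeDefault'}
```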

Drop: []corev1.Capability{"ALL"},
},
},
}
assert.Equal(t, expectedContainer, container)
@@ -46,6 +46,9 @@ func Test_buildOpsManagerAndBackupInitContainer(t *testing.T) {
SecurityContext: &corev1.SecurityContext{
ReadOnlyRootFilesystem: ptr.To(true),
AllowPrivilegeEscalation: ptr.To(false),
Capabilities: &corev1.Capabilities{
Drop: []corev1.Capability{"ALL"},
},
},
}
assert.Equal(t, expectedContainer, containerObj)
5 changes: 5 additions & 0 deletions docker/mongodb-kubernetes-tests/kubetester/__init__.py
@@ -274,6 +274,11 @@ def delete_namespace(name: str):
c.delete_namespace(name, body=c.V1DeleteOptions())


def label_namespace(name: str, labels: dict):
body = {"metadata": {"labels": labels}}
client.CoreV1Api().patch_namespace(name, body)


def get_deployments(namespace: str):
return client.AppsV1Api().list_namespaced_deployment(namespace)

3 changes: 2 additions & 1 deletion docker/mongodb-kubernetes-tests/kubetester/helm.py
@@ -28,7 +28,8 @@ def helm_template(
args = ("helm", "template", *(command_args), _helm_chart_dir(helm_chart_path))
logger.info(" ".join(args))

yaml_file_name = "{}.yaml".format(str(uuid.uuid4()))
home = os.getenv("HOME")
yaml_file_name = os.path.join(home, "{}.yaml".format(str(uuid.uuid4())))
with open(yaml_file_name, "w") as output:
process_run_and_check(" ".join(args), stdout=output, check=True, shell=True)

@@ -5,6 +5,7 @@
from kubetester import (
create_or_update_secret,
get_default_storage_class,
label_namespace,
try_load,
wait_until,
)
@@ -42,15 +43,25 @@ def s3_bucket(aws_s3_client: AwsS3Client, namespace: str) -> str:
yield from create_s3_bucket(aws_s3_client, "test-bucket-sharded-")


@fixture(scope="module")
def enforced_pss_namespace(namespace: str) -> str:
# Change the pod-security mode from warn to enforce. This makes the test fail if the operator and deployments don't support enforce mode.
# This will not work in multi-cluster, because Istio injects a sidecar that breaks the restricted level.
if not is_multi_cluster():
label_namespace(namespace, {"pod-security.kubernetes.io/enforce": "restricted"})

return namespace


@fixture(scope="module")
def ops_manager(
namespace: str,
enforced_pss_namespace: str,
s3_bucket: str,
custom_version: Optional[str],
custom_appdb_version: str,
) -> MongoDBOpsManager:
resource: MongoDBOpsManager = MongoDBOpsManager.from_yaml(
yaml_fixture("om_ops_manager_backup.yaml"), namespace=namespace
yaml_fixture("om_ops_manager_backup.yaml"), namespace=enforced_pss_namespace
)

try_load(resource)
11 changes: 10 additions & 1 deletion helm_chart/templates/operator.yaml
@@ -36,8 +36,10 @@ spec:
securityContext:
runAsNonRoot: true
runAsUser: 2000
seccompProfile:
type: RuntimeDefault
{{- end }}
{{- if .Values.registry.imagePullSecrets}}
{{- if .Values.registry.imagePullSecrets }}
imagePullSecrets:
- name: {{ .Values.registry.imagePullSecrets }}
{{- end }}
@@ -74,6 +76,13 @@ spec:
requests:
cpu: {{ .Values.operator.resources.requests.cpu }}
memory: {{ .Values.operator.resources.requests.memory }}
{{- if not .Values.managedSecurityContext }}
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
{{- end }}
env:
- name: OPERATOR_ENV
value: {{ .Values.operator.env }}
@@ -206,5 +206,11 @@ func WithSecurityContext(context corev1.SecurityContext) Modification {
func DefaultSecurityContext() corev1.SecurityContext {
readOnlyRootFilesystem := true
allowPrivilegeEscalation := false
return corev1.SecurityContext{ReadOnlyRootFilesystem: &readOnlyRootFilesystem, AllowPrivilegeEscalation: &allowPrivilegeEscalation}
return corev1.SecurityContext{
ReadOnlyRootFilesystem: &readOnlyRootFilesystem,
AllowPrivilegeEscalation: &allowPrivilegeEscalation,
Capabilities: &corev1.Capabilities{
Drop: []corev1.Capability{"ALL"},
},
}
}
@@ -197,7 +197,13 @@ func DefaultPodSecurityContext() corev1.PodSecurityContext {
runAsNonRoot := true
runAsUser := int64(2000)
fsGroup := int64(2000)
return corev1.PodSecurityContext{RunAsUser: &runAsUser, RunAsNonRoot: &runAsNonRoot, FSGroup: &fsGroup}

return corev1.PodSecurityContext{
RunAsUser: &runAsUser,
RunAsNonRoot: &runAsNonRoot,
FSGroup: &fsGroup,
SeccompProfile: &corev1.SeccompProfile{Type: corev1.SeccompProfileTypeRuntimeDefault},
}
}

// WithImagePullSecrets adds an ImagePullSecrets local reference with the given name
4 changes: 3 additions & 1 deletion mongodb-community-operator/scripts/dev/e2e.py
@@ -45,7 +45,9 @@ def _prepare_test_environment(namespace) -> None:

print("Creating Namespace")
k8s_conditions.ignore_if_already_exists(
lambda: corev1.create_namespace(client.V1Namespace(metadata=dict(name=namespace)))
lambda: corev1.create_namespace(
client.V1Namespace(metadata=dict(name=namespace, labels={"pod-security.kubernetes.io/warn": "restricted"}))
)
)

print("Creating Cluster Role Binding and Service Account for test pod")
7 changes: 7 additions & 0 deletions public/mongodb-kubernetes-multi-cluster.yaml
@@ -329,6 +329,8 @@ spec:
securityContext:
runAsNonRoot: true
runAsUser: 2000
seccompProfile:
type: RuntimeDefault
containers:
- name: mongodb-kubernetes-operator-multi-cluster
image: "quay.io/mongodb/mongodb-kubernetes:1.4.0"
@@ -353,6 +355,11 @@
requests:
cpu: 500m
memory: 200Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
env:
- name: OPERATOR_ENV
value: prod
7 changes: 7 additions & 0 deletions public/mongodb-kubernetes.yaml
@@ -329,6 +329,8 @@ spec:
securityContext:
runAsNonRoot: true
runAsUser: 2000
seccompProfile:
type: RuntimeDefault
containers:
- name: mongodb-kubernetes-operator
image: "quay.io/mongodb/mongodb-kubernetes:1.4.0"
@@ -349,6 +351,11 @@
requests:
cpu: 500m
memory: 200Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
env:
- name: OPERATOR_ENV
value: prod
@@ -42,6 +42,8 @@ spec:
emptyDir: { }
- name: diagnostics
emptyDir: { }
- name: tests-home-dir
emptyDir: { }
{{ if .Values.multiCluster.memberClusters }}
- name: kube-config-volume
secret:
@@ -52,6 +54,12 @@
defaultMode: 420
secretName: test-pod-multi-cluster-config
{{ end }}
securityContext:
runAsNonRoot: true
runAsUser: 2000
fsGroup: 2000
seccompProfile:
type: RuntimeDefault
containers:
- image: public.ecr.aws/docker/library/busybox:1.37.0
name: keepalive
@@ -61,6 +69,11 @@
mountPath: /tmp/results
- name: diagnostics
mountPath: /tmp/diagnostics
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
- name: mongodb-enterprise-operator-tests
env:
# OTEL env vars can either be used to construct custom spans or are used by pytest opentelemetry dynamic instrumentation
@@ -190,6 +203,9 @@ spec:
value: {{ .Values.cognito_workload_url }}
- name: cognito_workload_user_id
value: {{ .Values.cognito_workload_user_id }}
# Used by helm to create .config and .cache directories. Also used by some tests that need to write files.
- name: HOME
value: /home/tests-home
image: {{ .Values.repo }}/mongodb-kubernetes-tests:{{ .Values.tag }}
# Options to pytest command should go in the pytest.ini file.
command: ["pytest"]
@@ -206,9 +222,16 @@
mountPath: /tmp/results
- name: diagnostics
mountPath: /tmp/diagnostics
- name: tests-home-dir
mountPath: /home/tests-home
{{ if .Values.multiCluster.memberClusters }}
- mountPath: /etc/config
name: kube-config-volume
- mountPath: /etc/multicluster
name: multi-cluster-config
{{ end }}
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
9 changes: 5 additions & 4 deletions scripts/funcs/kubernetes
@@ -17,6 +17,7 @@ metadata:
name: ${namespace}
labels:
evg: task
pod-security.kubernetes.io/warn: "restricted"
annotations:
evg/version: "https://evergreen.mongodb.com/version/${version_id:-'not-specified'}"
evg/task-name: ${TASK_NAME:-'not-specified'}
@@ -98,15 +99,15 @@ create_image_registries_secret() {
context=$1
namespace=$2
secret_name=$3

# Detect the correct config file path based on container runtime
local config_file
local temp_config_file=""
if command -v podman &> /dev/null && (podman info &> /dev/null || sudo podman info &> /dev/null); then
# For Podman, use root's auth.json since minikube uses sudo podman
config_file="/root/.config/containers/auth.json"
echo "Using Podman config: ${config_file}"

# Create a temporary copy that the current user can read
temp_config_file=$(mktemp)
sudo cp "${config_file}" "${temp_config_file}"
@@ -117,7 +118,7 @@
config_file="${HOME}/.docker/config.json"
echo "Using Docker config: ${config_file}"
fi

# shellcheck disable=SC2154
if kubectl --context "${context}" get namespace "${namespace}"; then
kubectl --context "${context}" -n "${namespace}" delete secret "${secret_name}" --ignore-not-found
@@ -127,7 +128,7 @@
else
echo "Skipping creating pull secret in ${context}/${namespace}. The namespace doesn't exist yet."
fi

# Clean up temporary file
if [[ -n "${temp_config_file}" ]] && [[ -f "${temp_config_file}" ]]; then
rm -f "${temp_config_file}"