Internal-communication security review of the platform as it would run in a cluster. Tracks hardening work; exploit specifics intentionally omitted from this public issue. If any item warrants coordinated disclosure, move that item to a private GitHub Security Advisory.
Static analysis only — items marked runtime check need cluster verification.
Critical
C1. Gateway can be bypassed; MCP servers trust any cluster pod. Upstream calls in services/mcp-proxy/main.go:166-167 use plain http.DefaultTransport; identity is forwarded as plain headers (services/mcp-proxy/main.go:765-784). Combined with C2, any pod can reach an MCP server directly with arbitrary identity headers. Fix: mTLS or HMAC-signed identity headers between gateway and upstream, plus NetworkPolicy.
C2. Zero NetworkPolicies in repo. `find . -name '*.yaml' | xargs grep -l 'kind: NetworkPolicy'` → 0 matches. Add default-deny + explicit allow per namespace (mcp-sentinel, mcp-runtime, mcp-servers, registry).
C3. Default header-mode auth has no verification. services/mcp-proxy/main.go:546-553 reads humanID/agentID/sessionID from headers without signature/MAC/JWT check. OAuth mode (`policy.Auth.Mode == "oauth"`, lines 657-688) does verify JWT. Recommend defaulting to OAuth in production policies and gating header mode behind an explicit dev flag.
C4. Only `tools/call` is policy-gated. services/mcp-proxy/main.go:827-829 calls policypkg.IsToolCallMethod, which matches only `tools/call` and `call_tool` (pkg/policy/helpers.go:9-16). `tools/list`, `prompts/`, `resources/`, `completion/complete` bypass grants entirely. At minimum extend gating to `tools/list` and the resources/prompts methods.
C5. Ingest fails open when both `API_KEYS` and `OIDC_JWKS_URL` are unset. services/ingest/main.go:247-250 returns the next handler with no auth in that case. Flip to fail-closed and refuse to start when no auth source is configured.
C6. Gateway ClusterRole grants `get/list/watch` on `secrets` cluster-wide. k8s/10-gateway.yaml:402. Traefik does not need this. Drop `secrets` from the resources list.
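One way the gate described in C4 could be widened, as a minimal sketch only. `isGatedMethod`, `gatedExact`, and `gatedPrefixes` are hypothetical names, not the repo's `policypkg.IsToolCallMethod`; the method names are the ones listed in C4, and `prompts/` and `resources/` are matched by prefix on the assumption that every method in those families should require a grant.

```go
package main

import (
	"fmt"
	"strings"
)

// gatedExact lists JSON-RPC methods that must pass a grant check.
// Hypothetical table; the names are taken from finding C4.
var gatedExact = map[string]bool{
	"tools/call":          true,
	"call_tool":           true,
	"tools/list":          true,
	"completion/complete": true,
}

// gatedPrefixes covers whole method families, e.g. anything under resources/.
var gatedPrefixes = []string{"prompts/", "resources/"}

// isGatedMethod reports whether a method should be checked against policy.
func isGatedMethod(method string) bool {
	if gatedExact[method] {
		return true
	}
	for _, p := range gatedPrefixes {
		if strings.HasPrefix(method, p) {
			return true
		}
	}
	return false
}

func main() {
	for _, m := range []string{"tools/call", "tools/list", "resources/read", "ping"} {
		fmt.Printf("%-16s gated=%v\n", m, isGatedMethod(m))
	}
}
```

A stricter variant would invert the table: allow a short list of known-harmless methods and gate everything else, so that newly added methods are gated by default.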
High
H1. API ServiceAccount has cluster-wide Deployment CRUD + user/group impersonate. k8s/08-api-rbac.yaml:23,38-39. Reduce deployment verbs to `["patch"]`; remove the `impersonate` rule unless it is actively used (and if so, scope it).
H2. Plain HTTP everywhere; Kafka `PLAINTEXT`. No mTLS / Istio / Linkerd / SPIFFE config in repo. Adopt a service mesh or, at minimum, TLS for Kafka and the analytics path.
H3. Bundled Docker registry has no authentication. config/registry/base/ingress.yaml. Any pod can push images. Add htpasswd/OAuth in front of the registry, or restrict via NetworkPolicy + auth proxy. Runtime check for any prod overlay that already adds auth.
H4. Prometheus and Grafana exposed via gateway with no auth. k8s/10-gateway.yaml:525-538. k8s/02-secrets.yaml.example:29 ships a `changeme` placeholder for Grafana admin. Gate both paths behind auth middleware and reject placeholder credentials at setup time.
H5. Shared API key model, no per-user isolation. `API_KEYS` is reused across UI/API/ingest/mcp-proxy; UI proxies upstream as a single shared key (services/ui/main.go:238-239). When `ADMIN_API_KEYS` is unset, all keys are admin (services/api/main.go:614). Move toward per-user/per-service credentials with attribution in audit logs.
H6. No replay/idempotency on tool calls. Add an `Idempotency-Key` (or JSON-RPC `id` dedup) for state-changing tool calls.
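The dedup idea in H6 can be sketched as a small TTL store keyed by the `Idempotency-Key` value (or by session plus JSON-RPC `id`). Everything here is an assumption for illustration: `dedupStore`, `firstUse`, the key format, and the 10-minute TTL are hypothetical, and an in-memory map only works for a single proxy replica.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// dedupStore remembers idempotency keys for a fixed TTL.
// In-memory only; a multi-replica proxy would need shared storage.
type dedupStore struct {
	mu   sync.Mutex
	ttl  time.Duration
	seen map[string]time.Time // key -> expiry time
}

func newDedupStore(ttl time.Duration) *dedupStore {
	return &dedupStore{ttl: ttl, seen: make(map[string]time.Time)}
}

// firstUse records key and returns true if it was not seen within the TTL.
// A false result means the call is a replay and should not run again.
func (s *dedupStore) firstUse(key string, now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	// Drop expired entries so the map does not grow without bound.
	for k, exp := range s.seen {
		if !now.Before(exp) {
			delete(s.seen, k)
		}
	}
	if _, ok := s.seen[key]; ok {
		return false
	}
	s.seen[key] = now.Add(s.ttl)
	return true
}

// demo sends the same key three times: first delivery, replay, after expiry.
func demo() []bool {
	store := newDedupStore(10 * time.Minute)
	now := time.Now()
	key := "session-1:req-42" // hypothetical key format
	return []bool{
		store.firstUse(key, now),
		store.firstUse(key, now.Add(time.Second)),
		store.firstUse(key, now.Add(11*time.Minute)),
	}
}

func main() {
	fmt.Println(demo()) // [true false true]
}
```

A production version would probably also cache the first response so that a replayed request gets the original result back rather than a bare rejection.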
Medium
M1. API can `delete`/`deletecollection` namespaces and NetworkPolicies cluster-wide — k8s/08-api-rbac.yaml:30-35. Make namespaces read-only for the API SA.
M2. Operator has cluster-wide `secrets:get` with no `resourceNames` filter — config/rbac/role.yaml:29-31. Restrict to the specific secrets it manages.
M3. `PLATFORM_ADMIN_PASSWORD` may persist in API env after bootstrap. Setup renders it (around internal/cli/setup.go:2137) but no code clears it post-bootstrap; CLAUDE.md documents a manual `kubectl patch`. Automate the cleanup.
M4. Policy reload is on a 5s timer (services/mcp-proxy/main.go:451), so revocations take up to 5s to apply. Consider a watch-based reload or a shorter interval for incident response.
M5. Traefik PII redactor preserves `Authorization`, `X-Internal-Auth`, `X-Api-Key` (services/traefik-plugins/pii-redactor/redactor.go:29). Per CLAUDE.md the redactor is dev-overlay only — confirm prod registry ingress does not reference `pii-redactor@file` (it would break the Docker API).
M6. Verify JWT `exp` is enforced in mcp-proxy OAuth path (services/mcp-proxy/main.go:657-683). Runtime / parser-options check.
M7. All services bind `0.0.0.0`. Acceptable with NetworkPolicy (C2); for sidecar-only services, prefer `127.0.0.1`.
M8. `/health` on API leaks `runtime_initialized` and `runtime_error` strings — services/api/main.go:207-220. Reduce to a static OK.
Low
L1. `/metrics` endpoints are unauthenticated on dedicated ports (acceptable only with NetworkPolicy).
L2. `x-forwarded-for` used as rate-limit key without trusted-proxy validation — services/ui/main.go:569-579.
L3. Example secrets file ships `change-me-now` / `changeme` placeholders; no setup-time guard rejects them.
L4. `automountServiceAccountToken: true` left default on Sentinel deployments.
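For L2, a rough sketch of trusted-proxy validation before `X-Forwarded-For` is used as a rate-limit key. The `trustedProxies` CIDR and the function names are hypothetical; the real value would be the gateway's pod or node range. The header is honored only when the TCP peer is itself a trusted proxy, and hops are read right to left so that a client-supplied leftmost entry cannot choose the key.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// trustedProxies lists networks allowed to set X-Forwarded-For.
// Hypothetical value; use the gateway's real pod or node CIDR.
var trustedProxies = []netip.Prefix{netip.MustParsePrefix("10.0.0.0/8")}

func isTrusted(addr netip.Addr) bool {
	for _, p := range trustedProxies {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

// rateLimitKey returns the address to rate-limit on. remoteAddr is the TCP
// peer ("ip:port"); xff is the raw X-Forwarded-For header value.
func rateLimitKey(remoteAddr, xff string) string {
	peer, err := netip.ParseAddrPort(remoteAddr)
	if err != nil {
		return remoteAddr // unparseable peer: fall back to the raw value
	}
	if !isTrusted(peer.Addr()) {
		return peer.Addr().String() // direct client: ignore the header
	}
	// Walk right to left; return the first hop that is not a trusted proxy.
	hops := strings.Split(xff, ",")
	for i := len(hops) - 1; i >= 0; i-- {
		addr, err := netip.ParseAddr(strings.TrimSpace(hops[i]))
		if err != nil {
			break // malformed entry: stop trusting the header
		}
		if !isTrusted(addr) {
			return addr.String()
		}
	}
	return peer.Addr().String()
}

func main() {
	// Untrusted peer: the header is ignored.
	fmt.Println(rateLimitKey("203.0.113.9:4431", "1.2.3.4"))
	// Trusted peer: the spoofed leftmost entry is skipped.
	fmt.Println(rateLimitKey("10.0.0.5:4431", "6.6.6.6, 198.51.100.7, 10.0.0.8"))
}
```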
Suggested fix order
1. Default-deny NetworkPolicy per namespace + explicit allow rules (removes most of the reachability that amplifies C1, H2, M7, L1).
2. Make ingest auth fail-closed (C5).
3. Extend the policy gate beyond `tools/call` (C4).
4. Tighten gateway and API RBAC (C6, H1, M1, M2).
5. Sign the gateway→upstream identity hop or move it to mTLS (C1).
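For the signing option in the last step, a minimal sketch of HMAC-signed identity headers under stated assumptions: the header names, the `sign`/`verify` helpers, and the 30-second freshness window are all hypothetical, and the shared key is a placeholder that would come from a Kubernetes Secret. The gateway signs the identity fields plus a timestamp; the MCP server recomputes the MAC and rejects mismatched or stale requests.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// Header names are hypothetical; the repo's identity headers may differ.
const (
	hdrHuman   = "X-Human-Id"
	hdrAgent   = "X-Agent-Id"
	hdrSession = "X-Session-Id"
	hdrTime    = "X-Identity-Timestamp"
	hdrSig     = "X-Identity-Signature"
)

// mac computes HMAC-SHA256 over the identity fields and the timestamp.
// Fields are joined with "\n", which cannot appear in a header value.
func mac(key []byte, h http.Header) []byte {
	m := hmac.New(sha256.New, key)
	fields := []string{h.Get(hdrHuman), h.Get(hdrAgent), h.Get(hdrSession), h.Get(hdrTime)}
	m.Write([]byte(strings.Join(fields, "\n")))
	return m.Sum(nil)
}

// sign is called by the gateway before forwarding upstream.
func sign(key []byte, h http.Header, now time.Time) {
	h.Set(hdrTime, strconv.FormatInt(now.Unix(), 10))
	h.Set(hdrSig, hex.EncodeToString(mac(key, h)))
}

// verify is called by the MCP server; it rejects bad or stale signatures.
func verify(key []byte, h http.Header, now time.Time, maxAge time.Duration) bool {
	got, err := hex.DecodeString(h.Get(hdrSig))
	if err != nil || !hmac.Equal(got, mac(key, h)) {
		return false
	}
	ts, err := strconv.ParseInt(h.Get(hdrTime), 10, 64)
	if err != nil {
		return false
	}
	age := now.Sub(time.Unix(ts, 0))
	return age >= -maxAge && age <= maxAge
}

// demo checks a valid request, a tampered one, and a stale one.
func demo() (valid, tampered, stale bool) {
	key := []byte("example-shared-secret") // placeholder; load from a Secret
	now := time.Unix(1700000000, 0)
	h := http.Header{}
	h.Set(hdrHuman, "alice")
	h.Set(hdrAgent, "agent-7")
	h.Set(hdrSession, "s-123")
	sign(key, h, now)
	valid = verify(key, h, now.Add(2*time.Second), 30*time.Second)
	stale = verify(key, h, now.Add(5*time.Minute), 30*time.Second)
	h.Set(hdrHuman, "mallory") // change identity after signing
	tampered = verify(key, h, now.Add(2*time.Second), 30*time.Second)
	return
}

func main() {
	fmt.Println(demo()) // true false false
}
```

Signing only proves the headers came from a holder of the key; it does not encrypt the hop, so it complements rather than replaces TLS (H2) and NetworkPolicy (C2).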
Notes
Static analysis on `main`. Findings reference exact file:line at time of audit; please re-verify against current HEAD when implementing.
Repo is marked alpha (per CLAUDE.md); this issue is sanitized for public tracking. If any item should be treated as a coordinated-disclosure vulnerability, open a private GitHub Security Advisory and link it here without exploit detail.