5 changes: 3 additions & 2 deletions AGENTS.md
@@ -131,7 +131,7 @@ kubectl patch secret mcp-sentinel-secrets -n mcp-sentinel --type merge -p '{"str
- `mcp.mcpruntime.org` (default ingress host for `MCPServer` when you use host-based routing)
- `platform.mcpruntime.org` (dashboard / admin UI — the primary user-facing entrypoint)
- **Expected public URLs (after DNS and TLS):**
- Dashboard UI: `https://platform.mcpruntime.org/` (also serves `/api`, `/grafana`, `/prometheus` under the same host so users do not need raw IPs / dev path-based routing)
- Dashboard UI: `https://platform.mcpruntime.org/` (also serves `/api` under the same host). `/grafana` and `/prometheus` are intentionally **not** exposed on the public platform host — those tools have no built-in auth in this stack. Reach them with `kubectl port-forward -n mcp-sentinel svc/grafana 3000:3000` / `svc/prometheus 9090:9090`, or front them with your own auth-aware ingress.
- Registry: `https://registry.mcpruntime.org` (or HTTP before TLS, depending on overlay)
- Each MCP server (default `IngressPath` is `/{metadata.name}/mcp`): e.g. `https://mcp.mcpruntime.org/demo-one/mcp` for a server named `demo-one` in the default shape
- **Let’s Encrypt and DNS:** the setup TLS flow requests `registry/registry-cert` for `registry.<domain>` and `mcp.<domain>` when those names are in env-derived config. `platform.<domain>` is separate: the `mcp-sentinel-platform-ui` Ingress in `mcp-sentinel` asks cert-manager to write `mcp-sentinel-platform-tls`. **All three** public DNS A/AAAA (or CNAME) records must exist and point to the **same** public ingress IP (or stable LB). A typo in DNS (e.g. `regsitry` instead of **registry**, or `platfrom` instead of **platform**) will break the matching hostname. Port **80** must hit Traefik for **HTTP-01** before certs are issued.
@@ -147,10 +147,11 @@ kubectl patch secret mcp-sentinel-secrets -n mcp-sentinel --type merge -p '{"str
- **Analytics 401:** use gateway/ingest URL and key, not the app’s random env. Example: `ANALYTICS_INGEST_URL=http://mcp-sentinel-ingest.mcp-sentinel.svc.cluster.local:8081/events` and `ANALYTICS_API_KEY` from `mcp-sentinel-secrets` (`API_KEYS` key).
- **Secret not found in workload namespace:** copy `mcp-sentinel-secrets` or use a shared secret reference.
- **Dashboard / API 401:** align `API_KEYS` and `UI_API_KEY` and roll the API deployment.
- **Dashboard 308 redirect loop in dev:** the UI service redirects HTTP→HTTPS for non-local hosts when it sees `X-Forwarded-Proto: http`. Override with the `UI_REQUIRE_HTTPS` env on the `mcp-sentinel-ui` deployment: `auto` (default — redirect public hosts only), `true` (always redirect on http), `false` (never redirect, use this when the UI is fronted by a non-TLS terminator on a real hostname).
- **Ingress / routes:** `kubectl get ingress -A` and confirm paths match the gateway and demo servers you expect.
- **Private / HTTP in-cluster registry / k3s:** Pull and push can fail with `https` vs `http` or `registry.local` DNS on nodes. See **k3s and HTTP registry (config files)** below, set **`MCP_REGISTRY_*`** before `pipeline generate` when you want `ClusterIP:port` in manifests, and raise **`MCP_DEPLOYMENT_TIMEOUT`** if setup rollouts time out on slow first pulls.
- **Prod DNS / ACME:** with `MCP_PLATFORM_DOMAIN=example.com`, setup derives `registry.example.com`, `mcp.example.com`, and `platform.example.com`. All three public DNS records must point at the ingress IP and port 80 must reach Traefik for HTTP-01. If cert-manager reports NXDOMAIN, verify from outside and inside the cluster: `getent hosts registry.example.com`, `getent hosts mcp.example.com`, `getent hosts platform.example.com`, and `kubectl run dns-check --rm -i --restart=Never --image=busybox:1.36 -- nslookup platform.example.com`.
- **Platform UI 404 / wrong host:** when `MCP_PLATFORM_DOMAIN` (or `MCP_PLATFORM_INGRESS_HOST`) is set, setup applies a host-based ingress `mcp-sentinel-platform-ui` in `mcp-sentinel`. Verify with `kubectl get ingress mcp-sentinel-platform-ui -n mcp-sentinel -o yaml`; the rule should be host=`platform.<domain>` routing `/` to `mcp-sentinel-ui:8082` (and `/api`, `/grafana`, `/prometheus` to those services). If the dashboard returns Traefik default 404, check that DNS resolves `platform.<domain>` to the cluster ingress, then `kubectl logs -n traefik deploy/traefik --tail=120` for routing errors. The dev path-based gateway (`mcp-sentinel-gateway`) keeps working when `MCP_PLATFORM_DOMAIN` is unset.
- **Platform UI 404 / wrong host:** when `MCP_PLATFORM_DOMAIN` (or `MCP_PLATFORM_INGRESS_HOST`) is set, setup applies a host-based ingress `mcp-sentinel-platform-ui` in `mcp-sentinel` (and, when TLS is enabled, a sibling `mcp-sentinel-platform-ui-http` for HTTP→HTTPS redirect). Verify with `kubectl get ingress mcp-sentinel-platform-ui -n mcp-sentinel -o yaml`; the rule should be host=`platform.<domain>` routing `/` to `mcp-sentinel-ui:8082` (and `/api` to the same service). `/grafana` and `/prometheus` are deliberately not on the public host. If the dashboard returns Traefik default 404, check that DNS resolves `platform.<domain>` to the cluster ingress, then `kubectl logs -n traefik deploy/traefik --tail=120` for routing errors. The dev path-based gateway (`mcp-sentinel-gateway`) keeps working when `MCP_PLATFORM_DOMAIN` is unset.
- **Prod registry 404 / image pulls say “not found”:** if `registry-cert` is Ready but pods fail to pull `registry.<domain>/<repo>:<tag>`, check the public registry route: `curl -k -i https://registry.<domain>/v2/`. Expected is HTTP 200 with `docker-distribution-api-version: registry/2.0`; Traefik `404 page not found` means the ingress/router is not active. Check `kubectl logs -n traefik deploy/traefik --tail=120` and `kubectl get ingress registry -n registry -o yaml`. In prod, the registry ingress must not reference the dev-only `pii-redactor@file` middleware.
- **Prod MCP server URLs:** prefer path-based public routing for clients: `https://mcp.<domain>/<server-name>/mcp`. Use `spec.publicPathPrefix: <server-name>` and set the server’s `MCP_PATH` to `/<server-name>/mcp`; avoid examples that require a custom `Host` header such as `go.example.local`.
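
As a rough sketch of the registry check in the list above, the following Go program mirrors `curl -k -i https://registry.<domain>/v2/` and classifies the result. The names `classifyRegistryProbe` and `probeRegistry` are hypothetical helpers, not part of this repository, and the verdict strings are illustrative only. Like `curl -k`, it skips certificate verification, so treat it as a diagnostic aid rather than a health check.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"os"
	"strings"
	"time"
)

// classifyRegistryProbe interprets the status code and the
// Docker-Distribution-Api-Version header of a GET /v2/ response.
// (Hypothetical helper; the verdict strings are illustrative.)
func classifyRegistryProbe(status int, apiVersion string) string {
	switch {
	case status == http.StatusOK &&
		strings.EqualFold(strings.TrimSpace(apiVersion), "registry/2.0"):
		return "ok"
	case status == http.StatusNotFound:
		// A 404 here usually means the ingress/router is not active.
		return "not-found"
	default:
		return "unexpected"
	}
}

// probeRegistry performs the request against baseURL + "/v2/".
func probeRegistry(baseURL string) (string, error) {
	client := &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			// Mirrors `curl -k`: skip certificate verification (diagnostics only).
			TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
		},
	}
	resp, err := client.Get(strings.TrimRight(baseURL, "/") + "/v2/")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return classifyRegistryProbe(
		resp.StatusCode,
		resp.Header.Get("Docker-Distribution-Api-Version"),
	), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: registryprobe https://registry.<domain>")
		return
	}
	verdict, err := probeRegistry(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "probe failed:", err)
		os.Exit(1)
	}
	fmt.Println(verdict)
}
```

A `not-found` verdict points at the Traefik router rather than the certificate, so the next step is the `kubectl logs` and `kubectl get ingress registry` checks from the same bullet.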

68 changes: 48 additions & 20 deletions internal/cli/platform_ingress.go
@@ -7,6 +7,7 @@ import (
)

const platformIngressName = "mcp-sentinel-platform-ui"
const platformHTTPRedirectIngressName = "mcp-sentinel-platform-ui-http"
const platformTLSSecretName = "mcp-sentinel-platform-tls"

// applyPlatformIngressIfConfigured applies a host-based ingress for the
@@ -27,12 +28,19 @@ func applyPlatformIngressIfConfigured(kubectl KubectlRunner) error {
}

// renderPlatformIngressManifest emits an Ingress that maps platform.<domain>
// to the dashboard UI, /api on the same UI service (which reverse-proxies to
// mcp-sentinel-api via API_UPSTREAM), and the in-cluster Grafana / Prometheus
// paths. When issuerName is set, a TLS section and cert-manager annotation are
// added so cert-manager's ingress-shim provisions a Certificate for
// platform.<domain> into the mcp-sentinel-platform-tls Secret in the same
// namespace as the Ingress.
// to the dashboard UI and /api on the same UI service (which reverse-proxies
// to mcp-sentinel-api via API_UPSTREAM). Grafana and Prometheus are
// intentionally NOT exposed on the public platform host: those tools have no
// built-in auth in this stack, so operators must reach them via port-forward
// or a private ingress instead.
//
// When issuerName is set, a TLS section and cert-manager annotation are added
// so cert-manager's ingress-shim provisions a Certificate for platform.<domain>
// into the mcp-sentinel-platform-tls Secret in the same namespace as the
// Ingress. A second Ingress on the `web` entrypoint is also emitted so HTTP
// requests to the same host hit the UI service, which redirects to HTTPS.
// (We can't rely on Traefik's entrypoint-level redirect because the prod
// overlay disables it to keep HTTP-01 ACME challenges working on first issue.)
func renderPlatformIngressManifest(host, issuerName string) string {
host = strings.TrimSpace(host)
issuerName = strings.TrimSpace(issuerName)
@@ -81,26 +89,46 @@ func renderPlatformIngressManifest(host, issuerName string) string {
b.WriteString(" name: mcp-sentinel-ui\n")
b.WriteString(" port:\n")
b.WriteString(" number: 8082\n")
b.WriteString(" - path: /grafana\n")
b.WriteString(" pathType: Prefix\n")
b.WriteString(" backend:\n")
b.WriteString(" service:\n")
b.WriteString(" name: grafana\n")
b.WriteString(" port:\n")
b.WriteString(" number: 3000\n")
b.WriteString(" - path: /prometheus\n")
b.WriteString(" pathType: Prefix\n")
b.WriteString(" backend:\n")
b.WriteString(" service:\n")
b.WriteString(" name: prometheus\n")
b.WriteString(" port:\n")
b.WriteString(" number: 9090\n")
b.WriteString(" - path: /\n")
b.WriteString(" pathType: Prefix\n")
b.WriteString(" backend:\n")
b.WriteString(" service:\n")
b.WriteString(" name: mcp-sentinel-ui\n")
b.WriteString(" port:\n")
b.WriteString(" number: 8082\n")

if issuerName != "" {
// HTTP-only ingress on the same host so plain `http://platform.<domain>/`
// hits the UI service (which 308s to HTTPS) instead of falling through to
// the host-less dev gateway ingress in k8s/10-gateway.yaml.
b.WriteString("---\n")
b.WriteString("apiVersion: networking.k8s.io/v1\n")
b.WriteString("kind: Ingress\n")
b.WriteString("metadata:\n")
b.WriteString(" name: ")
b.WriteString(platformHTTPRedirectIngressName)
b.WriteString("\n")
b.WriteString(" namespace: ")
b.WriteString(defaultAnalyticsNamespace)
b.WriteString("\n")
b.WriteString(" annotations:\n")
b.WriteString(" traefik.ingress.kubernetes.io/router.entrypoints: web\n")
b.WriteString("spec:\n")
b.WriteString(" ingressClassName: traefik\n")
b.WriteString(" rules:\n")
b.WriteString(" - host: ")
b.WriteString(strconv.Quote(host))
b.WriteString("\n")
b.WriteString(" http:\n")
b.WriteString(" paths:\n")
b.WriteString(" - path: /\n")
b.WriteString(" pathType: Prefix\n")
b.WriteString(" backend:\n")
b.WriteString(" service:\n")
b.WriteString(" name: mcp-sentinel-ui\n")
b.WriteString(" port:\n")
b.WriteString(" number: 8082\n")
}

return b.String()
}
66 changes: 49 additions & 17 deletions internal/cli/platform_ingress_test.go
@@ -13,19 +13,27 @@ func TestRenderPlatformIngressManifestNoTLS(t *testing.T) {
"traefik.ingress.kubernetes.io/router.entrypoints: web",
`- host: "platform.example.com"`,
"- path: /api\n",
"- path: /grafana\n",
"- path: /prometheus\n",
"- path: /\n",
"name: mcp-sentinel-ui",
"number: 8082",
"name: grafana",
"name: prometheus",
}
for _, want := range mustContain {
if !strings.Contains(got, want) {
t.Fatalf("missing %q in manifest:\n%s", want, got)
}
}
mustNotContain := []string{
"- path: /grafana",
"- path: /prometheus",
"name: grafana",
"name: prometheus",
"name: " + platformHTTPRedirectIngressName,
}
for _, unwanted := range mustNotContain {
if strings.Contains(got, unwanted) {
t.Fatalf("manifest must not contain %q (Grafana/Prometheus must not be exposed publicly, redirect ingress only emitted with TLS):\n%s", unwanted, got)
}
}
if strings.Contains(got, "tls:") {
t.Fatalf("did not expect a TLS block when issuer is empty:\n%s", got)
}
@@ -34,21 +42,17 @@ func TestRenderPlatformIngressManifestNoTLS(t *testing.T) {
}
}

func TestRenderPlatformIngressManifestApiBeforeGrafana(t *testing.T) {
func TestRenderPlatformIngressManifestApiBeforeRoot(t *testing.T) {
got := renderPlatformIngressManifest("platform.example.com", "")
apiIdx := strings.Index(got, "- path: /api")
grafanaIdx := strings.Index(got, "- path: /grafana")
rootIdx := strings.Index(got, "- path: /\n")
if apiIdx < 0 || grafanaIdx < 0 || rootIdx < 0 {
t.Fatalf("missing one of /api, /grafana, / paths:\n%s", got)
}
// Traefik matches longer/more-specific prefixes before /, so /api must
// appear in the manifest and be a sibling of /grafana, /prometheus.
if apiIdx > grafanaIdx {
t.Fatalf("/api must be listed before /grafana in the rule for readability:\n%s", got)
if apiIdx < 0 || rootIdx < 0 {
t.Fatalf("missing /api or / paths:\n%s", got)
}
if grafanaIdx > rootIdx {
t.Fatalf("/grafana must be listed before / catch-all:\n%s", got)
// Traefik matches longer/more-specific prefixes before /, so /api must be
// declared in the rule before the catch-all /.
if apiIdx > rootIdx {
t.Fatalf("/api must be listed before / catch-all:\n%s", got)
}
}

@@ -61,14 +65,42 @@ func TestRenderPlatformIngressManifestWithTLS(t *testing.T) {
`- "platform.mcpruntime.org"`,
"secretName: " + platformTLSSecretName,
`- host: "platform.mcpruntime.org"`,
"name: " + platformHTTPRedirectIngressName,
}
for _, want := range mustContain {
if !strings.Contains(got, want) {
t.Fatalf("missing %q in manifest:\n%s", want, got)
}
}
if strings.Contains(got, "\n traefik.ingress.kubernetes.io/router.entrypoints: web\n") {
t.Fatalf("did not expect plain web entrypoint when TLS issuer is set:\n%s", got)
if strings.Contains(got, "\n traefik.ingress.kubernetes.io/router.entrypoints: web\n ingressClassName") {
t.Fatalf("primary ingress should be on websecure when TLS issuer is set:\n%s", got)
}
}

func TestRenderPlatformIngressManifestHTTPRedirectShape(t *testing.T) {
got := renderPlatformIngressManifest("platform.mcpruntime.org", "letsencrypt-prod")
idx := strings.Index(got, "name: "+platformHTTPRedirectIngressName)
if idx < 0 {
t.Fatalf("expected HTTP redirect ingress when TLS configured:\n%s", got)
}
tail := got[idx:]
mustContain := []string{
"traefik.ingress.kubernetes.io/router.entrypoints: web",
`- host: "platform.mcpruntime.org"`,
"- path: /\n",
"name: mcp-sentinel-ui",
}
for _, want := range mustContain {
if !strings.Contains(tail, want) {
t.Fatalf("HTTP redirect ingress missing %q:\n%s", want, tail)
}
}
// The HTTP redirect ingress must NOT request its own cert / TLS block.
if strings.Contains(tail, "tls:") {
t.Fatalf("HTTP redirect ingress must not have a tls block:\n%s", tail)
}
if strings.Contains(tail, "cert-manager.io/cluster-issuer") {
t.Fatalf("HTTP redirect ingress must not request a certificate:\n%s", tail)
}
}

106 changes: 105 additions & 1 deletion services/ui/main.go
@@ -127,7 +127,9 @@ func main() {
}

log.Printf("mcp-sentinel-ui listening on :%s", port)
handler := otelhttp.NewHandler(logRequests(mux), "http.server")
httpsMode := envOr("UI_REQUIRE_HTTPS", "auto")
secured := securityHeadersMiddleware(httpsRedirectMiddleware(mux, httpsMode))
handler := otelhttp.NewHandler(logRequests(secured), "http.server")
httpServer := &http.Server{
Addr: ":" + port,
Handler: handler,
@@ -837,6 +839,108 @@ func writeJSON(w http.ResponseWriter, status int, payload any) {
}
}

// httpsRedirectMiddleware redirects HTTP requests to HTTPS based on the
// X-Forwarded-Proto header set by an upstream TLS-terminating proxy.
//
// mode controls behavior:
// - "false"/"off"/"0": never redirect (useful in dev or when fronted differently)
// - "true"/"on"/"1": always redirect on X-Forwarded-Proto: http
// - anything else (default "auto"): redirect only when Host looks public
// (not localhost / not a bare IP). This is safe for the bundled Kind dev
// stack where Host is `localhost:18080`.
func httpsRedirectMiddleware(next http.Handler, mode string) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if shouldRedirectToHTTPS(r, mode) {
target := "https://" + r.Host + r.URL.RequestURI()
http.Redirect(w, r, target, http.StatusPermanentRedirect)
return
}
next.ServeHTTP(w, r)
})
}

func shouldRedirectToHTTPS(r *http.Request, mode string) bool {
switch strings.ToLower(strings.TrimSpace(mode)) {
case "false", "off", "0", "no":
return false
case "true", "on", "1", "yes":
// Forced mode: Go cases do not fall through, so the switch ends here and
// the local-host exemption is skipped; the scheme checks below decide.
default:
if isLocalHost(r.Host) {
return false
}
}
if r.TLS != nil {
return false
}
proto := strings.ToLower(strings.TrimSpace(r.Header.Get("x-forwarded-proto")))
if proto == "https" {
return false
}
if proto == "http" {
return true
}
// No proxy header. Only redirect in forced mode for non-local hosts.
return strings.EqualFold(strings.TrimSpace(mode), "true") && !isLocalHost(r.Host)

P2: Treat on/1/yes as forced HTTPS mode

The forced-mode aliases declared in shouldRedirectToHTTPS are not handled consistently when X-Forwarded-Proto is missing: "on", "1", and "yes" enter the forced-mode branch in the switch, but the fallback path only checks mode == "true". In deployments where the upstream proxy does not set X-Forwarded-Proto, setting UI_REQUIRE_HTTPS=on (or 1/yes) silently disables the intended redirect and allows plain-HTTP responses for public hosts.
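
One possible fix, sketched under the assumption that the mode string is normalized once and the fallback then compares against the normalized value. The names `httpsMode`, `parseHTTPSMode`, and `decideRedirect` are hypothetical and do not appear in this PR; `isLocalHost` is copied from the diff, and `decideRedirect` takes plain values instead of `*http.Request` only so it can be exercised without a server.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// httpsMode is a hypothetical normalized form of UI_REQUIRE_HTTPS.
type httpsMode int

const (
	modeAuto httpsMode = iota
	modeOff
	modeForce
)

// parseHTTPSMode maps every accepted alias to one of three modes.
func parseHTTPSMode(raw string) httpsMode {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "false", "off", "0", "no":
		return modeOff
	case "true", "on", "1", "yes":
		return modeForce
	default:
		return modeAuto
	}
}

// isLocalHost is copied from the PR: localhost and bare IPs count as local.
func isLocalHost(host string) bool {
	if host == "" {
		return true
	}
	h, _, err := net.SplitHostPort(host)
	if err != nil {
		h = host
	}
	h = strings.ToLower(h)
	if h == "localhost" || h == "127.0.0.1" || h == "::1" {
		return true
	}
	return net.ParseIP(h) != nil
}

// decideRedirect follows the same decision order as shouldRedirectToHTTPS,
// but the no-header fallback checks the normalized mode, so "on", "1" and
// "yes" behave exactly like "true".
func decideRedirect(rawMode, host, forwardedProto string, directTLS bool) bool {
	mode := parseHTTPSMode(rawMode)
	if mode == modeOff {
		return false
	}
	if mode == modeAuto && isLocalHost(host) {
		return false
	}
	if directTLS {
		return false
	}
	switch strings.ToLower(strings.TrimSpace(forwardedProto)) {
	case "https":
		return false
	case "http":
		return true
	}
	// No usable proxy header: only forced mode redirects, and only for
	// non-local hosts.
	return mode == modeForce && !isLocalHost(host)
}

func main() {
	for _, raw := range []string{"true", "on", "1", "yes", "auto", "off"} {
		fmt.Printf("%-4s -> %v\n", raw,
			decideRedirect(raw, "platform.example.com", "", false))
	}
}
```

Normalizing once keeps the alias list in a single place, so a later alias cannot be added to the switch while being missed by the fallback.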


}

func isLocalHost(host string) bool {
if host == "" {
return true
}
h, _, err := net.SplitHostPort(host)
if err != nil {
h = host
}
h = strings.ToLower(h)
if h == "localhost" || h == "127.0.0.1" || h == "::1" {
return true
}
if ip := net.ParseIP(h); ip != nil {
return true
}
return false
}

// securityHeadersMiddleware adds baseline security headers on every response.
// HSTS is added only when the request was served over HTTPS so it never asks a
// browser to upgrade dev hostnames that have no certificate.
func securityHeadersMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
h := w.Header()
h.Set("X-Content-Type-Options", "nosniff")
h.Set("Referrer-Policy", "strict-origin-when-cross-origin")
h.Set("Permissions-Policy", "camera=(), microphone=(), geolocation=(), payment=(), usb=()")

security-medium medium

Consider adding interest-cohort=() to the Permissions-Policy header to explicitly opt-out of Google's FLoC (Federated Learning of Cohorts) tracking, which is a common privacy-hardening practice for modern web applications.

// Google Sign-In needs accounts.google.com for scripts/iframes/connect.
h.Set("Content-Security-Policy",
"default-src 'self'; "+
"script-src 'self' 'unsafe-inline' https://accounts.google.com https://apis.google.com; "+

security-medium medium

The use of 'unsafe-inline' in the script-src directive of the Content Security Policy (CSP) significantly weakens the protection against Cross-Site Scripting (XSS) attacks. While it may be necessary for some legacy or third-party scripts, it is highly recommended to refactor the frontend to use nonces or hashes for inline scripts, or move them to external files, to allow for a more restrictive policy.
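
A minimal sketch of the nonce approach this comment describes, assuming the frontend can stamp a per-request `nonce` attribute on its inline `<script>` tags. The names `newNonce`, `scriptSrc`, `buildCSP`, `cspNonceMiddleware`, and `nonceKey` are hypothetical and not part of this PR, and whether Google Sign-In works under a nonce-only `script-src` is untested here. The remaining directives are copied from the diff.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"net/http"
	"strings"
)

// nonceKey is the context key under which the per-request nonce is stored.
type nonceKey struct{}

// newNonce returns 16 random bytes, base64-encoded, for use in a CSP nonce.
func newNonce() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf), nil
}

// scriptSrc builds a script-src directive that allows only same-origin
// scripts, the Google hosts from the PR, and inline scripts carrying nonce.
func scriptSrc(nonce string) string {
	return "script-src 'self' 'nonce-" + nonce +
		"' https://accounts.google.com https://apis.google.com"
}

// buildCSP assembles the full policy; every directive except script-src is
// taken unchanged from the PR.
func buildCSP(nonce string) string {
	return strings.Join([]string{
		"default-src 'self'",
		scriptSrc(nonce),
		"style-src 'self' 'unsafe-inline'",
		"img-src 'self' data: https:",
		"font-src 'self' data:",
		"connect-src 'self' https://accounts.google.com",
		"frame-src https://accounts.google.com",
		"frame-ancestors 'none'",
		"base-uri 'self'",
		"form-action 'self'",
	}, "; ")
}

// cspNonceMiddleware sets the header and exposes the nonce to downstream
// handlers through the request context so templates can emit it.
func cspNonceMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		nonce, err := newNonce()
		if err != nil {
			http.Error(w, "internal error", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Security-Policy", buildCSP(nonce))
		ctx := context.WithValue(r.Context(), nonceKey{}, nonce)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

func main() {
	nonce, err := newNonce()
	if err != nil {
		panic(err)
	}
	fmt.Println(buildCSP(nonce))
}
```

The nonce must be fresh on every response. A handler would read it with `r.Context().Value(nonceKey{})` and render it into each inline script tag; inline scripts without the attribute are then blocked.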

"style-src 'self' 'unsafe-inline'; "+
"img-src 'self' data: https:; "+
"font-src 'self' data:; "+
"connect-src 'self' https://accounts.google.com; "+
"frame-src https://accounts.google.com; "+
"frame-ancestors 'none'; "+
"base-uri 'self'; "+
"form-action 'self'")
if isHTTPSRequest(r) {
h.Set("Strict-Transport-Security", "max-age=63072000; includeSubDomains")
}
path := r.URL.Path
if strings.HasPrefix(path, "/api") || strings.HasPrefix(path, "/auth/") {
h.Set("Cache-Control", "no-store")

security-medium medium

For maximum security on sensitive endpoints like /api and /auth/, consider expanding the Cache-Control header to include no-cache and must-revalidate (e.g., no-store, no-cache, must-revalidate). While no-store is the strongest directive, adding the others ensures that even if a browser or intermediary proxy doesn't fully respect no-store, the data is not served from cache without revalidation.

Suggested change
h.Set("Cache-Control", "no-store")
h.Set("Cache-Control", "no-store, no-cache, must-revalidate")

}
next.ServeHTTP(w, r)
})
}

func isHTTPSRequest(r *http.Request) bool {
if r.TLS != nil {
return true
}
return strings.EqualFold(strings.TrimSpace(r.Header.Get("x-forwarded-proto")), "https")
}

// logRequests is middleware that logs HTTP requests.
// It logs the HTTP method, URL path, response status, and duration.
func logRequests(next http.Handler) http.Handler {