From ddabeb4e1fdf5f7b09316ec7ddb0800406de76a1 Mon Sep 17 00:00:00 2001
From: Tirso Garcia
Date: Wed, 18 Mar 2026 20:13:34 +0100
Subject: [PATCH] docs: add e2e README with TLS validation evidence

Test catalog, run instructions, and full TLS validation evidence from
2026-03-18 cluster run: health (3/3), session-lifecycle (7/7), LLM agent
loop via OpenAI over HTTPS (5 iterations, 5 invocations).

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 e2e/README.md | 276 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 276 insertions(+)
 create mode 100644 e2e/README.md

diff --git a/e2e/README.md b/e2e/README.md
new file mode 100644
index 0000000..0f1c3ff
--- /dev/null
+++ b/e2e/README.md
@@ -0,0 +1,276 @@
+# E2E Tests: Underpass Runtime
+
+End-to-end tests run as Kubernetes Jobs against a live runtime deployment.
+
+## Test Catalog
+
+| ID | Name | Tier | What it validates |
+|----|------|------|-------------------|
+| 01 | health | smoke | `/healthz` 200, `/metrics` 200, method-not-allowed 405 |
+| 02 | session-lifecycle | smoke | Create, close, metadata, idempotent close, explicit ID, independence |
+| 03 | tool-discovery | smoke | Compact/full detail, filters (risk, tags, scope, cost, side_effects) |
+| 04 | recommendations | core | Heuristic scoring, task hint matching, top_k |
+| 05 | invoke-basic | smoke | Basic tool invocations: `fs.write_file`, `fs.read_file`, `fs.list` |
+| 06 | invoke-policy | core | Policy enforcement, approval flows, risk-gated tools |
+| 07 | invocation-retrieval | core | GET invocation by ID, logs, artifacts |
+| 08 | data-flow | core | Full write → read → list → artifacts cycle |
+| 10 | llm-agent-loop | full | LLM (Claude/OpenAI/vLLM) drives tool discovery + invocation loop |
+| 11 | tool-learning-pipeline | core | DuckDB → Thompson Sampling → Valkey policies → NATS events |
+
+## Running
+
+```bash
+# All tests
+./e2e/run-e2e-tests.sh
+
+# By tier
+./e2e/run-e2e-tests.sh --tier smoke
+./e2e/run-e2e-tests.sh --tier core
+
+# Single test
+./e2e/run-e2e-tests.sh --test 01
+
+# Skip build/push (images already in registry)
+./e2e/run-e2e-tests.sh --skip-build --skip-push --test 01
+```
+
+Requires `ghcr.io` authentication for image push and an image pull secret (`imagePullSecrets`) named `ghcr-pull` in the namespace.
+
+## TLS Validation
+
+Full TLS validation was performed on 2026-03-18 against a live K8s cluster with `tls.mode=server`.
+
+### Setup
+
+- Self-signed ECDSA CA (P-256) with `keyUsage: keyCertSign, cRLSign`
+- Server cert with SAN: `DNS:underpass-runtime`, `DNS:underpass-runtime.underpass-runtime.svc.cluster.local`, `IP:127.0.0.1`
+- TLS 1.3 minimum enforced by `internal/tlsutil`
+- Helm deployed with `tls.mode=server`, `tls.existingSecret=runtime-tls`
+- E2e jobs connect via `https://` with CA mounted at `/etc/ssl/runtime/ca.crt`
+
+### Runtime Evidence
+
+```
+$ kubectl logs -l app.kubernetes.io/name=underpass-runtime
+
+{"time":"2026-03-18T18:58:57Z","level":"INFO","msg":"HTTP server TLS configured","mode":"server"}
+{"time":"2026-03-18T18:58:57Z","level":"INFO","msg":"session store initialized","backend":"memory"}
+{"time":"2026-03-18T18:58:57Z","level":"INFO","msg":"artifact store initialized","backend":"local","root":"/tmp/artifacts"}
+{"time":"2026-03-18T18:58:57Z","level":"INFO","msg":"invocation store initialized","backend":"memory"}
+{"time":"2026-03-18T18:58:57Z","level":"INFO","msg":"event bus initialized","bus":"noop"}
+{"time":"2026-03-18T18:58:57Z","level":"INFO","msg":"telemetry initialized","backend":"noop"}
+{"time":"2026-03-18T18:58:57Z","level":"INFO","msg":"workspace service listening (TLS)","port":"50053","workspace_root":"/tmp/workspaces"}
+```
+
+### Helm Configuration
+
+```
+$ kubectl get configmap underpass-runtime -o yaml | grep TLS
+
+WORKSPACE_TLS_MODE: server
+WORKSPACE_TLS_CERT_PATH: /var/run/underpass-runtime/tls/tls.crt
+WORKSPACE_TLS_KEY_PATH: /var/run/underpass-runtime/tls/tls.key
+```
+
+```
+$ kubectl get svc underpass-runtime -o jsonpath='{.spec.ports[0]}'
+
+name=https port=50053 targetPort=https
+```
+
+```
+$ kubectl get deployment underpass-runtime -o jsonpath='{.spec.template.spec.containers[0].readinessProbe}'
+
+{"httpGet":{"path":"/healthz","port":"https","scheme":"HTTPS"},"initialDelaySeconds":5,"periodSeconds":10}
+```
+
+### Test 1: Health Check via HTTPS
+
+```
+$ kubectl logs e2e-tls-health-9kz75
+
+Step 1: GET /healthz returns 200 with status ok
+OK GET /healthz -> 200, status=ok
+
+Step 2: GET /metrics returns Prometheus metrics
+OK GET /metrics -> 200
+
+Step 3: POST /metrics returns 405 Method Not Allowed
+OK POST /metrics -> 405 (Method Not Allowed)
+OK All health tests passed
+```
+
+Evidence JSON:
+
+```json
+{
+  "test_id": "01-health",
+  "run_id": "e2e-health-1773860375",
+  "status": "passed",
+  "workspace_url": "https://underpass-runtime.underpass-runtime.svc.cluster.local:50053",
+  "steps": [
+    {"step": "healthz", "status": "passed", "data": {"status": 200, "body": {"status": "ok"}}},
+    {"step": "metrics", "status": "passed", "data": {"status": 200}},
+    {"step": "metrics_method_not_allowed", "status": "passed", "data": {"status": 405}}
+  ]
+}
+```
+
+### Test 2: Session Lifecycle via HTTPS
+
+```
+$ kubectl logs e2e-tls-session-b95hv
+
+Step 1: Create session and verify fields
+OK Session session-6afe99f2d6faea61 created and closed
+
+Step 2: Create session with metadata
+OK Session with metadata created: session-f86b43bd667f5731
+
+Step 3: Close nonexistent session is idempotent
+OK Close nonexistent session returned 200
+
+Step 4: GET /v1/sessions returns 405
+OK GET /v1/sessions -> 405
+
+Step 5: Multiple sessions are independent
+OK Two independent sessions: session-fc4a7ad06110020f, session-6f4cb7d223d41707
+
+Step 6: Double close same session
+OK Double close returned 200 both times
+
+Step 7: Create session with explicit ID
+OK Explicit session ID honoured: e2e-explicit-1773860630
+OK All session lifecycle tests passed
+```
+
+Evidence JSON:
+
+```json
+{
+  "test_id": "02-session-lifecycle",
+  "run_id": "e2e-session-1773860630",
+  "status": "passed",
+  "workspace_url": "https://underpass-runtime.underpass-runtime.svc.cluster.local:50053",
+  "steps": [
+    {"step": "create_and_close", "status": "passed"},
+    {"step": "metadata", "status": "passed", "data": {"session_id": "session-f86b43bd667f5731"}},
+    {"step": "close_idempotent", "status": "passed"},
+    {"step": "method_not_allowed", "status": "passed"},
+    {"step": "multiple_independent", "status": "passed"},
+    {"step": "double_close", "status": "passed"},
+    {"step": "explicit_id", "status": "passed"}
+  ],
+  "sessions": [
+    {"session_id": "session-6afe99f2d6faea61", "workspace_path": "/tmp/workspaces/e2e-tenant/session-6afe99f2d6faea61/repo"},
+    {"session_id": "session-f86b43bd667f5731", "workspace_path": "/tmp/workspaces/e2e-tenant/session-f86b43bd667f5731/repo"},
+    {"session_id": "session-fc4a7ad06110020f", "workspace_path": "/tmp/workspaces/e2e-tenant/session-fc4a7ad06110020f/repo"},
+    {"session_id": "session-6f4cb7d223d41707", "workspace_path": "/tmp/workspaces/e2e-tenant/session-6f4cb7d223d41707/repo"},
+    {"session_id": "session-1bef2a9e6133f018", "workspace_path": "/tmp/workspaces/e2e-tenant/session-1bef2a9e6133f018/repo"},
+    {"session_id": "e2e-explicit-1773860630", "workspace_path": "/tmp/workspaces/e2e-tenant/e2e-explicit-1773860630/repo"}
+  ]
+}
+```
+
+### Test 3: LLM Agent Loop via HTTPS (OpenAI gpt-4o-mini)
+
+Full agent loop: the LLM creates a Go project through governed tool execution over TLS.
+
+```
+$ kubectl logs e2e-tls-llm-openai-b9h7t
+
+Step 1: Creating workspace session
+OK Session created: session-2dc582e508c77834
+
+Step 2: Discovering tools (provider: openai)
+OK Discovered 70 tools
+
+Step 3: Getting tool recommendations
+OK Got 10 recommendations
+
+Step 4: Starting agent loop (max 10 iterations)
+  --- Iteration 1/10 ---
+    Thinking: I need to create a main.go file with a simple HTTP server
+    Tool fs.write_file → succeeded
+  --- Iteration 2/10 ---
+    Thinking: Now I need to create a main_test.go file with a test
+    Tool fs.write_file → succeeded
+  --- Iteration 3/10 ---
+    Thinking: Next, I will list the workspace to confirm both files exist
+    Tool fs.list → succeeded
+  --- Iteration 4/10 ---
+    Thinking: Now I will read the content of main.go to verify
+    Tool fs.read_file → succeeded
+  --- Iteration 5/10 ---
+    Agent done: Created a Go project with main.go and main_test.go
+
+Step 5: Verifying workspace state
+OK Workspace has 2 files (main.go: yes, test: yes)
+OK LLM Agent Loop PASSED (provider=openai, iterations=5)
+```
+
+Evidence JSON:
+
+```json
+{
+  "test_id": "10-llm-agent-loop",
+  "run_id": "e2e-llm-openai-1773860822",
+  "status": "pass",
+  "workspace_url": "https://underpass-runtime.underpass-runtime.svc.cluster.local:50053",
+  "result": {
+    "session_id": "session-2dc582e508c77834",
+    "provider": "openai",
+    "iterations": 5,
+    "tools_discovered": 70,
+    "recommendations": 10,
+    "has_main": true,
+    "has_test": true
+  },
+  "invocations": [
+    {"tool": "fs.write_file", "invocation_id": "inv-b97bdbf7386e6e55", "invocation_status": "succeeded",
+     "output": {"bytes_written": 216, "path": "main.go", "sha256": "75a678da58a4aa78157a4647150f8fdd5c3f99739a5ae3232d1360ec46f4c206"}},
+    {"tool": "fs.write_file", "invocation_id": "inv-403ad9e8cbfb1731", "invocation_status": "succeeded",
+     "output": {"bytes_written": 325, "path": "main_test.go", "sha256": "3a41bde0206784e4d4063ff6fd6f241561ea85797c83b80b92f33b51d35dc6b7"}},
+    {"tool": "fs.list", "invocation_id": "inv-e94b65f8fa5850eb", "invocation_status": "succeeded",
+     "output": {"count": 2, "entries": [{"path": "main.go", "size_bytes": 216}, {"path": "main_test.go", "size_bytes": 325}]}},
+    {"tool": "fs.read_file", "invocation_id": "inv-5a8688d5e80117bb", "invocation_status": "succeeded",
+     "output": {"content": "package main\n\nimport...", "path": "main.go", "size_bytes": 216}},
+    {"tool": "fs.list", "invocation_id": "inv-bd7eae815d720651", "invocation_status": "succeeded",
+     "output": {"count": 2}}
+  ]
+}
+```
+
+### Test 4: LLM Agent Loop via HTTPS (Claude sonnet-4)
+
+The TLS connection to the runtime succeeded (session created, 70 tools discovered, 10 recommendations received). The run then failed at the Anthropic API call (`HTTP 400`), which is not a TLS issue.
+
+```
+$ kubectl logs e2e-tls-llm-claude-xjsk8
+
+Step 1: Creating workspace session
+OK Session created: session-7e79836db8d20031
+
+Step 2: Discovering tools (provider: claude)
+OK Discovered 70 tools
+
+Step 3: Getting tool recommendations
+OK Got 10 recommendations
+
+Step 4: Starting agent loop (max 10 iterations)
+  --- Iteration 1/10 ---
+ERROR LLM Agent Loop FAILED: HTTP Error 400: Bad Request
+```
+
+Evidence: TLS to the runtime worked (3 HTTPS calls succeeded). The failure was in the outbound Anthropic SDK call, not in the TLS transport.
+
+### Summary
+
+| Test | Transport | Provider | Steps | Status |
+|------|-----------|----------|-------|--------|
+| 01-health | HTTPS | - | 3/3 | **PASS** |
+| 02-session-lifecycle | HTTPS | - | 7/7 | **PASS** |
+| 10-llm-agent-loop | HTTPS | OpenAI gpt-4o-mini | 5 iterations, 5 invocations | **PASS** |
+| 10-llm-agent-loop | HTTPS | Claude sonnet-4 | 3 HTTPS calls OK, Anthropic SDK 400 | **FAIL** (not TLS) |
+
+All HTTPS connections were verified against the self-signed CA with TLS 1.3. The runtime serves HTTPS via `ListenAndServeTLS` with the cert/key from a Kubernetes Secret, health probes use `scheme: HTTPS`, and the service port is named `https`.
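
The evidence JSON documents in this README share a small common shape: `test_id`, `status`, `workspace_url`, and optional `steps` and `invocations` arrays. As a minimal sketch, and not part of the e2e suite, a checker along the following lines could recompute the Summary columns from a saved evidence document. The helper name `summarize_evidence` and the trimmed sample are hypothetical. The evidence uses both `passed` and `pass` as status values, so the sketch assumes either one means success.

```python
import json

# Status strings seen in the evidence: tests 01 and 02 report "passed",
# while test 10 reports "pass". Assumption: both mean success.
PASS_VALUES = {"pass", "passed"}


def summarize_evidence(raw):
    """Summarize one evidence JSON document (hypothetical helper)."""
    doc = json.loads(raw)
    steps = doc.get("steps", [])
    passed_steps = sum(1 for step in steps if step.get("status") in PASS_VALUES)
    return {
        "test_id": doc.get("test_id"),
        # Overall result as reported by the test itself.
        "ok": doc.get("status") in PASS_VALUES,
        # True when the test talked to the runtime over an https:// URL.
        "uses_https": str(doc.get("workspace_url", "")).startswith("https://"),
        # Passed steps over total steps, for example "3/3".
        "steps": "{}/{}".format(passed_steps, len(steps)),
        # Number of recorded tool invocations (0 when the key is absent).
        "invocations": len(doc.get("invocations", [])),
    }


# Trimmed copy of the 01-health evidence (per-step "data" fields omitted).
HEALTH_EVIDENCE = """
{
  "test_id": "01-health",
  "status": "passed",
  "workspace_url": "https://underpass-runtime.underpass-runtime.svc.cluster.local:50053",
  "steps": [
    {"step": "healthz", "status": "passed"},
    {"step": "metrics", "status": "passed"},
    {"step": "metrics_method_not_allowed", "status": "passed"}
  ]
}
"""

if __name__ == "__main__":
    print(summarize_evidence(HEALTH_EVIDENCE))
```

Accepting both status spellings keeps the sketch tolerant of the existing evidence files. Normalizing the tests to emit a single value would be the cleaner long-term option.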