Skip to content

feat(runner,manifests): add gRPC transport, credential system, SSE enhancements, and Kustomize overlays from alpha#1421

Closed
markturansky wants to merge 12 commits intomainfrom
chore/alpha-migration-pr6-runners
Closed

feat(runner,manifests): add gRPC transport, credential system, SSE enhancements, and Kustomize overlays from alpha#1421
markturansky wants to merge 12 commits intomainfrom
chore/alpha-migration-pr6-runners

Conversation

@markturansky
Copy link
Copy Markdown
Contributor

@markturansky markturansky commented Apr 22, 2026

Summary

Alpha migration PR 6/7 — Runner infrastructure and deployment manifests.

  • gRPC transport: Bidirectional streaming transport for Claude bridge, with credential push and session message APIs
  • Credential system: Shared session credentials, Google credentials preservation across turns, runtime credential clearing
  • SSE enhancements: Between-run event listener, background task proxies, event replay compaction
  • Kustomize overlays: MPP OpenShift overlay, control plane RBAC, tenant networking, API server patches
  • Gerrit integration: Auth connector, frontend connection card, backend handlers
  • Security fix: All websocket runner HTTP clients now inject X-Ambient-Session-Token via NewRunnerTransport, fixing HTTP 401 errors from runners with AGUI_TOKEN middleware enabled (was using plain http.Transport)

Auth token fix details

The websocket package's agui_proxy.go used plain http.Transport / ad-hoc http.Client instances that bypassed the runnerTransport session-token layer. Every runner call from the websocket handlers (connectToRunner, interrupt, feedback, capabilities, mcp-status, task stop/output/list, between-run listener) lacked the X-Ambient-Session-Token header, causing HTTP 401s.

Fix:

  • Export NewRunnerTransport from handlers package
  • Wrap runnerHTTPClient with handlers.NewRunnerTransport
  • Add runnerShortClient (10s) and runnerMediumClient (30s) with token transport
  • Replace all 7 ad-hoc http.Client{} calls with the named clients

TDD tests added:

  • Transport type assertion (catches regression if someone reverts to plain transport)
  • End-to-end token injection via fake K8s secret + httptest server
  • Missing-secret graceful degradation (no token sent, no error)

Test plan

  • go test ./websocket/ -v — all tests pass including 3 new session token tests
  • go vet ./... — clean
  • E2E tests in CI
  • Deploy to staging and verify no more "Runner error: HTTP 401" in UI

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Refactor

    • Consolidated HTTP client configuration for improved consistency and reliability across request handlers.
    • Enhanced session token management for secure runner communication.
  • Tests

    • Added comprehensive test coverage validating session token injection and authentication header handling across request scenarios.

user and others added 7 commits April 21, 2026 15:59
…hancements, and Kustomize overlays from alpha

PR 6 of alpha-to-main migration (combined with PR 7 — manifests).

Runners:
- gRPC transport for session message streaming
- gRPC client for control-plane token endpoint
- Inbox and session messages APIs with delta buffer
- Credential system: fetch/populate/clear, gh CLI wrapper
- SSE flush-per-chunk, unbounded tap queue
- CP OIDC token for backend credential fetches (RSA keypair auth)
- New deps: cryptography, grpcio, protobuf
- Tests: grpc_client, grpc_transport, grpc_writer, events_endpoint,
  app_initial_prompt, expanded bridge_claude and shared_session_credentials

Manifests:
- mpp-openshift overlay: NetworkPolicy, gRPC Route, CP token Service,
  RBAC, MCP sidecar, RoleBinding namespace fixes
- production overlay updates
- openshift-dev overlay
- Removed deprecated cluster-reader overlay
- All overlays pass kustomize build

Migration plan updated: PRs 1-5 marked merged, PR 6+7 combined.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
TestGhWrapper._cleanup() used stale top-level imports of _GH_WRAPPER_PATH
(always ""), causing Path(".").unlink() → IsADirectoryError. Now reads from
the module object with empty-string guards. Also fixes e2e workflow Docker
build context from components/runners to components/runners/ambient-runner
so pyproject.toml is found during image build.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
clear_runtime_credentials was deleting the Google Workspace credentials
file between turns, causing workspace-mcp to fall back to an inaccessible
localhost OAuth flow. The file is now intentionally preserved.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Tests were refactored to remove CREDENTIAL_IDS but _fetch_credential
still requires it to look up the credential_id and build the API URL.
Without it, the function returns {} immediately without hitting the
test HTTP server. Also restores credential_id-based response matching
in the lifecycle test.

713 passed, 11 skipped locally.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
…hot tests

The session-list screenshot test failed because setTheme clicked the
toggle button before it was visible in the DOM. Add explicit visibility
wait with 10s timeout to setTheme, matching the pattern already used
in the waitForThemeToggle setup step.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
…requests

The websocket package's agui_proxy.go used plain http.Transport / ad-hoc
http.Client instances that bypassed the runnerTransport session-token layer.
Every runner call from the websocket handlers (connectToRunner, interrupt,
feedback, capabilities, mcp-status, task stop/output/list, between-run
listener) lacked the X-Ambient-Session-Token header, causing HTTP 401s
from runners with AGUI_TOKEN middleware enabled.

- Export NewRunnerTransport from handlers package
- Wrap runnerHTTPClient with handlers.NewRunnerTransport
- Add runnerShortClient (10s) and runnerMediumClient (30s) with token transport
- Replace all 7 ad-hoc http.Client{} calls with the named clients
- Add TDD tests: transport type assertion, end-to-end token injection,
  missing-secret graceful degradation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 22, 2026

Deploy Preview for cheerful-kitten-f556a0 canceled.

Name Link
🔨 Latest commit b56893d
🔍 Latest deploy log https://app.netlify.com/projects/cheerful-kitten-f556a0/deploys/69e8d982f4fdbd0008e1ed2b

@github-actions
Copy link
Copy Markdown
Contributor

⚠️ SDD Preflight — Managed Paths Modified

This PR modifies files in SDD-managed component(s). These components are migrating to Spec-Driven Development.

File Component Mode
components/runners/ambient-runner/.mcp.json runner warn
components/runners/ambient-runner/architecture.md runner warn
components/runners/ambient-runner/pyproject.toml runner warn

No action required — these components are in warn mode. Consider using the component's agent workflow for future changes.

📖 Specs: Runner Spec · Runner Constitution

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 22, 2026

📝 Walkthrough

Walkthrough

Refactored HTTP client construction across session and WebSocket handlers to centralize transport initialization and ensure consistent per-request session-token header injection to runner pods via a new exported NewRunnerTransport wrapper.

Changes

Cohort / File(s) Summary
Transport Abstraction
components/backend/handlers/sessions.go
Introduced exported NewRunnerTransport constructor wrapping base http.RoundTripper with nil-to-default fallback, centralizing transport initialization logic.
Client Consolidation
components/backend/websocket/agui_proxy.go
Replaced ad-hoc per-handler &http.Client{Timeout: ...} construction with shared runnerShortClient (10s) and runnerMediumClient (30s) clients; updated runnerHTTPClient transport to use handlers.NewRunnerTransport for consistent session-token header injection across all runner requests.
Test Coverage
components/backend/websocket/agui_proxy_test.go
Added test suite validating session-token injection behavior: verifies runnerHTTPClient transport wrapping, asserts X-Ambient-Session-Token header presence/absence based on Kubernetes Secret state, uses fake K8s client and local test server for header capture.
🚥 Pre-merge checks | ✅ 6 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Title check ⚠️ Warning The title follows Conventional Commits format with type(scope) but doesn't clearly match the actual changeset, which focuses on session-token injection and transport wrapping. Align the title with the primary changes shown in the diff: refactor(websocket): inject X-Ambient-Session-Token via NewRunnerTransport and shared clients, or provide a title that accurately reflects the actual code changes in this PR commit.
Kubernetes Resource Safety ⚠️ Warning PostgreSQL container in ambient-api-server-db.yaml lacks resource requests and limits, creating resource starvation risk. Add resource requests/limits to PostgreSQL container: requests {cpu: 100m, memory: 256Mi}, limits {cpu: 500m, memory: 512Mi}.
✅ Passed checks (6 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 84.62% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Performance And Algorithmic Complexity ✅ Passed HTTP client refactoring consolidates per-request clients into reusable shared clients with connection pooling. Pagination uses 500-item limits with proper continuation loops. Session state in sync.Maps is bounded by 1-hour TTL with 10-minute cleanup sweeps. Activity updates are debounced and concurrency-limited to 50 goroutines. No O(n²) algorithms, N+1 patterns, unbounded growth, or expensive work in loops detected.
Security And Secret Handling ✅ Passed Token injection via NewRunnerTransport is properly implemented with secure handling, auth/authz guards on all handlers, no injection vulnerabilities, and proper secret management.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch chore/alpha-migration-pr6-runners
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch chore/alpha-migration-pr6-runners

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread components/backend/websocket/agui_proxy.go Fixed
Comment thread components/backend/websocket/agui_proxy.go Fixed
Comment thread components/backend/websocket/agui_proxy.go Fixed
Comment thread components/backend/websocket/agui_proxy.go Fixed
Comment thread components/backend/websocket/agui_proxy.go Fixed
Comment thread components/backend/websocket/agui_proxy.go Fixed
Comment thread components/backend/websocket/agui_proxy.go Fixed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
components/backend/websocket/agui_proxy_test.go (1)

52-54: Avoid mutating package globals without test isolation.

These cases overwrite handlers.K8sClientMw and runnerHTTPClient, both of which are package-level singletons. That makes the new assertions vulnerable to cross-test interference if anything in this package runs concurrently. A shared test mutex or injectable client dependency would keep them deterministic.

Also applies to: 65-72, 88-90, 101-108

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/backend/websocket/agui_proxy_test.go` around lines 52 - 54, The
tests mutate package globals handlers.K8sClientMw and runnerHTTPClient which can
cause cross-test interference; instead, protect those mutations with a shared
test mutex or make the clients injectable. Concretely, add a package-level
sync.Mutex (e.g., testMu) and acquire it before replacing handlers.K8sClientMw
or runnerHTTPClient and release it after restoring the original (use t.Cleanup
to restore oldClient/oldRunner), or refactor the code under test to accept a
client argument so tests can pass fakeClient without touching package globals;
apply this change for the spots touching handlers.K8sClientMw and
runnerHTTPClient (including the occurrences around fakeClient and oldClient).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@components/backend/websocket/agui_proxy_test.go`:
- Around line 52-54: The tests mutate package globals handlers.K8sClientMw and
runnerHTTPClient which can cause cross-test interference; instead, protect those
mutations with a shared test mutex or make the clients injectable. Concretely,
add a package-level sync.Mutex (e.g., testMu) and acquire it before replacing
handlers.K8sClientMw or runnerHTTPClient and release it after restoring the
original (use t.Cleanup to restore oldClient/oldRunner), or refactor the code
under test to accept a client argument so tests can pass fakeClient without
touching package globals; apply this change for the spots touching
handlers.K8sClientMw and runnerHTTPClient (including the occurrences around
fakeClient and oldClient).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 923020e3-88fe-474c-b723-097ae4f1ad1e

📥 Commits

Reviewing files that changed from the base of the PR and between d39cb88 and 9a1fa74.

📒 Files selected for processing (3)
  • components/backend/handlers/sessions.go
  • components/backend/websocket/agui_proxy.go
  • components/backend/websocket/agui_proxy_test.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants