58 changes: 58 additions & 0 deletions docs/internal/design/control-plane.spec.md
@@ -100,6 +100,63 @@ The CP creates a Pod (not a Job) for each session. Key pod attributes:

Each section is joined with `\n\n`. Empty sections are omitted. If all four are empty, `INITIAL_PROMPT` is not set and the runner waits for a user message via gRPC.

### Workspace Initialization (Repo Cloning)

When a session specifies repositories to clone (`session.RepoURL` or `session.Repos`), the CP adds an **init container** to the runner pod that clones the repositories into `/workspace/repos/` before the runner starts.

#### Trigger

The init container is added when either:
- `session.RepoURL` is a non-empty string (single repo shorthand)
- `session.Repos` is a non-empty JSON string (array of `{"url": "...", "branch": "..."}` objects)

If neither is set, no init container is created and `/workspace` starts empty.

#### Init Container Behavior

The CP reuses the existing **state-sync** image (`quay.io/ambient_code/vteam_state_sync`) and its `hydrate.sh` script, which together form the same init container the operator uses. No new images or scripts are needed.

```
Name: init-hydrate
Image: quay.io/ambient_code/vteam_state_sync (same as operator)
Command: /usr/local/bin/hydrate.sh
Env: REPOS_JSON, SESSION_NAME, NAMESPACE, PROJECT_NAME, BACKEND_API_URL
Mount: /workspace (shared with runner container via emptyDir volume)
```
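The attributes above can be sketched as pod-spec fragments. This is a minimal illustration in Python using plain dicts shaped like a Kubernetes manifest, not the CP's actual code. The volume name `workspace`, the function name `build_init_hydrate`, and all example values are assumptions made for illustration.

```python
import json

STATE_SYNC_IMAGE = "quay.io/ambient_code/vteam_state_sync"


def build_init_hydrate(repos_json, session_name, namespace, project_name, backend_api_url):
    """Return (init_container, volume) manifest fragments as plain dicts."""
    env = {
        "REPOS_JSON": repos_json,
        "SESSION_NAME": session_name,
        "NAMESPACE": namespace,
        "PROJECT_NAME": project_name,
        "BACKEND_API_URL": backend_api_url,
    }
    container = {
        "name": "init-hydrate",
        "image": STATE_SYNC_IMAGE,
        "command": ["/usr/local/bin/hydrate.sh"],
        "env": [{"name": k, "value": v} for k, v in env.items()],
        # Assumption: the shared emptyDir volume is named "workspace".
        "volumeMounts": [{"name": "workspace", "mountPath": "/workspace"}],
    }
    # The runner container would mount the same volume at /workspace.
    volume = {"name": "workspace", "emptyDir": {}}
    return container, volume


# Hypothetical example values.
container, volume = build_init_hydrate(
    '[{"url": "https://github.com/example/repo"}]',
    "session-1",
    "ns-1",
    "project-1",
    "http://backend.example/api",
)
pod_spec_fragment = {"initContainers": [container], "volumes": [volume]}
print(json.dumps(pod_spec_fragment, indent=2))
```

Because the init container and the runner mount the same `emptyDir` volume, anything `hydrate.sh` writes under `/workspace` is visible to the runner once the init container exits.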

`hydrate.sh` handles the full workspace initialization lifecycle:
1. Creates workspace directory structure (`/workspace/repos/`, `/workspace/artifacts/`, etc.)
2. Restores session state from S3 (if configured — skipped when S3 is not available)
3. Installs a git credential helper that reads `GITHUB_TOKEN`/`GITLAB_TOKEN` from env
4. Fetches git credentials from the backend API (if `BACKEND_API_URL` and `BOT_TOKEN` are set)
5. Parses `REPOS_JSON` and clones each repo into `/workspace/repos/<repo-name>`
6. Clones workflow repos (if `ACTIVE_WORKFLOW_GIT_URL` is set)
7. Restores git branch/patch state from S3 backup (if available)
8. Sets ownership and permissions for the runner user (UID 1001)
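Step 5 can be illustrated with a small sketch that turns `REPOS_JSON` into `git clone` invocations. This is not the `hydrate.sh` implementation (which is a shell script). In particular, deriving `<repo-name>` from the last URL path segment with a trailing `.git` stripped is an assumption, and the function names are hypothetical. The commands are only built here, not executed.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlparse

REPOS_ROOT = PurePosixPath("/workspace/repos")


def repo_name(url):
    # Assumption: the repo name is the last path segment, minus a ".git" suffix.
    path = urlparse(url).path.rstrip("/")
    name = PurePosixPath(path).name
    return name[:-4] if name.endswith(".git") else name


def clone_plan(repos_json):
    """Map each REPOS_JSON entry to a git clone argv (nothing is executed)."""
    plan = []
    for entry in json.loads(repos_json):
        dest = str(REPOS_ROOT / repo_name(entry["url"]))
        argv = ["git", "clone"]
        if entry.get("branch"):
            # "branch" is optional in each REPOS_JSON object.
            argv += ["--branch", entry["branch"]]
        argv += [entry["url"], dest]
        plan.append(argv)
    return plan


plan = clone_plan(
    '[{"url": "https://github.com/example/app.git", "branch": "main"},'
    ' {"url": "https://github.com/example/lib"}]'
)
for argv in plan:
    print(" ".join(argv))
```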

#### Repo URL Normalization

- `session.RepoURL` (single string) is converted to `REPOS_JSON`: `[{"url": "<value>"}]`
- `session.Repos` (JSON string) is passed through as-is to `REPOS_JSON`
- If both are set, `Repos` takes precedence (it may contain `RepoURL` plus additional repos)
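The normalization and trigger rules can be sketched together as one function. This is an illustrative Python sketch, not the CP's code; `normalize_repos` is a hypothetical name, and returning `None` to signal "no init container" is a convention chosen for this example.

```python
import json


def normalize_repos(repo_url, repos):
    """Return the REPOS_JSON string, or None when neither field is set."""
    if repos:
        # session.Repos is passed through as-is and takes precedence.
        return repos
    if repo_url:
        # session.RepoURL is the single-repo shorthand.
        return json.dumps([{"url": repo_url}])
    # Neither is set: no init container, /workspace starts empty.
    return None


print(normalize_repos("https://github.com/example/app", ""))
```

A caller would add the init container only when the result is not `None`, which matches the trigger condition described earlier in this section.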

#### Credential Injection for Private Repos

When `CREDENTIAL_IDS` includes a `github` or `gitlab` credential, the init container receives the same credential environment so that `git clone` can authenticate to private repositories. The CP injects:

- `GITHUB_TOKEN` — for `https://github.com/` URLs
- `GITLAB_TOKEN` — for GitLab URLs

These tokens are fetched from the credential store at pod creation time. The fetch is the same as the runner's credential fetch, but the result is injected directly into the init container's env.

Public repos require no credentials and clone over HTTPS without authentication.
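The provider-to-variable mapping can be sketched as follows. This is a minimal Python illustration under stated assumptions: `fetch_token` is a hypothetical callable standing in for the credential store lookup, and the credential IDs and token values are made up.

```python
import json

# Provider key in CREDENTIAL_IDS -> env var name on the init container.
PROVIDER_ENV = {"github": "GITHUB_TOKEN", "gitlab": "GITLAB_TOKEN"}


def init_container_credential_env(credential_ids_json, fetch_token):
    """Build git credential env vars for the init container.

    fetch_token is a hypothetical callable: credential_id -> token string.
    """
    credential_ids = json.loads(credential_ids_json or "{}")
    env = {}
    for provider, env_name in PROVIDER_ENV.items():
        credential_id = credential_ids.get(provider)
        if credential_id:
            env[env_name] = fetch_token(credential_id)
    return env


# Stand-in for the credential store, for illustration only.
fake_store = {"cred-123": "example-github-token"}
env = init_container_credential_env(
    '{"github": "cred-123", "anthropic": "cred-999"}',
    fake_store.__getitem__,
)
print(env)
```

Providers other than `github` and `gitlab` are ignored, and an empty or missing `CREDENTIAL_IDS` yields no env vars, which corresponds to the public-repo case.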

#### Status: 🔲 not implemented

The CP currently sets `REPOS_JSON` as an env var on the runner container but does **not** create an init container. The runner creates the target directory (`/workspace/repos/<name>`) but does not clone into it, so each repo is present only as an empty directory.

The operator (`components/operator`) implements this correctly in `reconcileSpecReposWithPatch` using the `state-sync` image's `hydrate.sh` script. The CP implementation should add the same `init-hydrate` container to the pod spec in `ensurePod` — the image and script already exist and are deployed.

### Environment Variables Injected into Runner Pod

| Var | Value | Purpose |
@@ -116,6 +173,7 @@ Each section is joined with `\n\n`. Empty sections are omitted. If all four are
| `USE_VERTEX` / `ANTHROPIC_VERTEX_PROJECT_ID` / `CLOUD_ML_REGION` | CP config | Vertex AI config (when enabled) |
| `GOOGLE_APPLICATION_CREDENTIALS` | `/app/vertex/ambient-code-key.json` | Vertex service account path |
| `LLM_MODEL` / `LLM_TEMPERATURE` / `LLM_MAX_TOKENS` | session fields | Per-session model config |
| `REPOS_JSON` | JSON array of `{"url","branch"}` | Repos to clone into `/workspace/repos/` (set when `session.RepoURL` or `session.Repos` is non-empty) |
| `CREDENTIAL_IDS` | JSON map `{provider: credential_id}` | Resolved credentials for this session; runner calls `/credentials/{id}/token` per provider |

---