feat(scripts): remote devcontainer orchestration via just recipe (#70) #166

Open
gerchowl wants to merge 285 commits into dev from feature/70-remote-devc-orchestration

gerchowl commented Feb 23, 2026

Description

Implements remote devcontainer orchestration: a single command (just remote-devc <host> or devc-remote.sh) that provisions a devcontainer on a remote host and connects your IDE to it. This enables developers to spin up devcontainers on powerful remote machines (GPU servers, cloud VMs) from their local terminal without manual SSH and compose steps.

Key capabilities

  • Core orchestration — SSH preflight, container state detection (fresh/running/stopped), compose lifecycle, IDE launch
  • gh:org/repo[:branch] targets — Clone a GitHub repo on the remote host and start its devcontainer in one command
  • --bootstrap flag — One-time remote host setup (config file, GHCR auth, image build)
  • --force flag — Auto-push unpushed commits before deploying; guards against deploying stale code
  • --open ssh|cursor|code|none — IDE-agnostic connection modes with auto-detection
  • Tailscale SSH integration — Ephemeral auth key generation via OAuth API, TUN device injection, peer wait polling
  • Claude Code CLI injection — Subscription OAuth token forwarding for AI-assisted development in containers
  • Container lifecycle execution — Runs post-create/post-start scripts inside the container after compose up
  • Compose file parsing — Reads dockerComposeFile from devcontainer.json, builds correct -f flags

Implementation

| Component | Purpose |
| --- | --- |
| scripts/devc-remote.sh | Bash orchestrator: parse_args, check_ssh, remote_preflight, inject_tailscale_key, inject_claude_auth, remote_compose_up, run_container_lifecycle, open_editor |
| scripts/devc_remote_uri.py | Python helper for Cursor/VS Code nested authority URIs (hex-encoded devcontainer specs) |
| justfile.base | remote-devc recipe wrapping devc-remote.sh with local git state auto-detection |
| setup-tailscale.sh | Opt-in Tailscale SSH daemon (install/start subcommands, lifecycle hooks) |
| setup-claude.sh | Opt-in Claude Code CLI (install/start subcommands, lifecycle hooks) |

Type of Change

  • feat -- New feature

Issues

Closes #152, #153, #221, #230, #231, #232, #235, #236, #243
Refs: #70, #208, #246

Testing

Manual Integration Test Results (#243)

36/39 items verified on a real remote host. Remaining 3 edge cases (low disk, missing compose, missing runtime) covered by unit tests.

Full test matrix

1. Core orchestration

  • devc-remote.sh myserver:~/Projects/fd5 — SSH, preflight, compose up
  • Re-run with container already running — skips compose up, opens editor
  • Re-run with container stopped — restarts and opens
  • --open none — infra only, no IDE launch
  • --open ssh — waits for Tailscale, prints hostname
  • --open code — opens VS Code instead of Cursor
  • --yes flag — auto-accepts prompts (reuse running container)

2. Tailscale SSH integration (#208, #230)

  • With TS_CLIENT_ID + TS_CLIENT_SECRET set — generates ephemeral key, injects into remote compose
  • Container joins tailnet after compose up
  • --open ssh mode — polls tailscale status, prints hostname when ready
  • Without TS env vars — silently skips (no error)

3. Claude Code CLI (#70)

  • With CLAUDE_CODE_OAUTH_TOKEN set — injects token into remote compose
  • setup-claude.sh install inside container — installs CLI, creates claude user
  • claude wrapper auto-switches to non-root user when run as root
  • setup-claude.sh start — refreshes workspace permissions
  • Without token — silently skips (no error)

4. Container lifecycle

  • Fresh container — runs post-create.sh then post-start.sh inside container
  • Existing running container — runs only post-start.sh (skips post-create)
  • Lifecycle scripts not present — skips gracefully with log message
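
The lifecycle rules listed above can be illustrated with a minimal Python sketch. The function name `lifecycle_scripts` and the state labels are hypothetical; the actual logic lives in the Bash function `run_container_lifecycle`, whose internals are not shown in this PR description. Treating every non-fresh container as "post-start only" is an assumption.

```python
def lifecycle_scripts(state: str, available: set[str]) -> list[str]:
    """Return the hook scripts to run, in order, for a container state.

    state: "fresh" for a newly created container; any other value is
    treated as an already existing container (assumed labels).
    available: names of the lifecycle scripts present in the project.
    """
    if state == "fresh":
        wanted = ["post-create.sh", "post-start.sh"]
    else:
        wanted = ["post-start.sh"]
    # Missing scripts are skipped instead of being treated as errors.
    return [name for name in wanted if name in available]


both = {"post-create.sh", "post-start.sh"}
print(lifecycle_scripts("fresh", both))
print(lifecycle_scripts("running", both))
print(lifecycle_scripts("fresh", set()))
```

Keeping the decision in a pure function like this makes the three cases from the matrix easy to cover with unit tests, independent of SSH or a container runtime.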

5. --bootstrap (#235)

  • First run on clean host — prompts for projects_dir, creates config, forwards GHCR auth, clones devcontainer repo, builds image
  • --bootstrap --yes — uses defaults without prompting
  • Re-run — reads existing config, skips prompts, pulls latest, rebuilds
  • GHCR auth forwarding — podman credentials or GHCR_TOKEN copied to remote

6. gh:org/repo[:branch] (#236)

  • devc-remote.sh myserver gh:vig-os/fd5 — clones to ~/Projects/fd5, starts devcontainer
  • devc-remote.sh myserver gh:vig-os/fd5:feature/my-branch — clones and checks out branch
  • Re-run with repo already cloned — fetches, doesn't re-clone
  • devc-remote.sh myserver:~/custom/path gh:vig-os/fd5 — overrides clone location
  • Branch switch on existing clone — checks out new branch

7. Compose file parsing

  • read_compose_files() correctly reads dockerComposeFile array from devcontainer.json
  • compose_cmd_with_files() builds correct -f flags
  • Works with single file string and multi-file array
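
The behaviour checked in this section can be sketched in Python as follows. The function names mirror the Bash helpers `read_compose_files()` and `compose_cmd_with_files()`, but the bodies here are illustrative stand-ins, not the script's implementation. The `docker compose` base command is an assumption, and plain `json.loads` does not accept the comments that `devcontainer.json` files may contain.

```python
import json


def read_compose_files(devcontainer_json: str) -> list[str]:
    """Return dockerComposeFile as a list, whether it is a string or an array."""
    config = json.loads(devcontainer_json)
    value = config.get("dockerComposeFile", [])
    return [value] if isinstance(value, str) else list(value)


def compose_cmd_with_files(files: list[str], base=("docker", "compose")) -> list[str]:
    """Build a compose command with one -f flag per file, preserving order."""
    cmd = list(base)
    for path in files:
        cmd += ["-f", path]
    return cmd


single = '{"dockerComposeFile": "docker-compose.yml"}'
multi = '{"dockerComposeFile": ["docker-compose.yml", "docker-compose.local.yml"]}'
print(compose_cmd_with_files(read_compose_files(single)))
print(compose_cmd_with_files(read_compose_files(multi)))
```

Order matters when several files are passed, because later compose files override earlier ones.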

8. Edge cases

  • Low disk space warning (<2GB) — requires special host (covered by unit test)
  • Remote host without compose — requires special host (covered by unit test)
  • Remote host without container runtime — requires special host (covered by unit test)
  • SSH connection failure — clear error message
  • macOS remote host — not tested (covered by unit test)

Bugs found and fixed during testing

  • SSH drops empty args / expands ~ in remote_clone_project — fixed with sentinel values (17ca79f)
  • GHCR auth forwarding moved from bootstrap-only to every deploy (9280224)
  • Tailscale SSH required real TUN device, not userspace networking (c209f1d)
  • Tailscale key regenerated on every deploy to avoid expired ephemeral keys (0b0bcef)
  • ~/.local/bin added to PATH for SSH compose commands (15120fb)
  • Reverted unnecessary podman-compose preference logic (ea3af49)
  • Pre-flight check for stale local Tailscale daemon (49c7a4e)

Changelog Entry

See CHANGELOG.md ## Unreleased section — fully up to date.

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have updated the documentation accordingly
  • I have updated CHANGELOG.md in the [Unreleased] section
  • My changes generate no new warnings or errors
  • I have added tests that prove my feature works
  • New and existing unit tests pass locally with my changes
  • Manual integration tests pass (chore: manual integration tests for remote devcontainer features #243)

gerchowl self-assigned this Feb 23, 2026
gerchowl requested a review from c-vigo as a code owner February 24, 2026 15:15
… improvements

- Remote devcontainer orchestration: devc-remote.sh error handling, devc_remote_uri.py, justfile recipes
- Skills: deduplicate Delegation sections, add missing ones for ci_check, ci-fix, verify
- gh_issues.py: CI status column, hyperlinks, Refs: parsing, reviewer display
- Worktree: configurable agent model via _read_model helper
- Containerfile: fix vig-utils COPY order for uv workspace resolution
- Dependencies: ruff 0.15.0, bandit added to devcontainer group
- init-workspace.sh: exclude .venv from template sync, safe rename
- New setup-labels.sh for GitHub label provisioning
- Docs: updated README.md and CONTRIBUTE.md with new recipes

Refs: #70

c-vigo commented Feb 25, 2026

This PR requires an update to assets/workspace/.devcontainer/README.md explaining the feature to users.

c-vigo commented Feb 25, 2026

Opened follow-up bug issue #202 for the host-dependent BATS failure (detect_editor_cli neither-path case).

Root cause: test assumes /usr/bin:/bin has no code, which is false on some hosts.

Planned fix in #202: isolate PATH with an empty temp dir and invoke /bin/bash under env -i so detection is deterministic.
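
The idea behind the planned fix, searching only an empty directory so the result cannot depend on what the host has installed, can be shown with a small Python analogue. The real test is written in BATS and the real `detect_editor_cli` is a Bash function; the Python function below and its cursor-before-code preference order are assumptions for illustration.

```python
import os
import shutil
import tempfile
from typing import Optional


def detect_editor_cli(search_path: str) -> Optional[str]:
    """Return the first editor CLI found on search_path, or None."""
    for candidate in ("cursor", "code"):  # assumed preference order
        if shutil.which(candidate, path=search_path):
            return candidate
    return None


with tempfile.TemporaryDirectory() as empty_dir:
    # An empty directory guarantees that neither CLI can be found,
    # regardless of what lives in /usr/bin on the host.
    print(detect_editor_cli(empty_dir))

    # Dropping a fake executable into the directory flips the result.
    fake = os.path.join(empty_dir, "code")
    with open(fake, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.chmod(fake, 0o755)
    print(detect_editor_cli(empty_dir))
```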

c-vigo and others added 4 commits February 25, 2026 13:37
Pointing directly to the script file sometimes leads to execution problems.

Refs: #204
Update `post_create` and `post_attach` tests
Add missing `post_start` test

Refs: #204
## Description

Ensure devcontainer lifecycle hooks execute reliably in downstream
workspaces by invoking hook scripts through `/bin/bash` in
`devcontainer.json`. Add integration coverage for all three lifecycle
commands (`postCreate`, `postStart`, `postAttach`) so command format
regressions are caught by tests.

## Type of Change

- [ ] `feat` -- New feature
- [x] `fix` -- Bug fix
- [x] `docs` -- Documentation only
- [ ] `chore` -- Maintenance task (deps, config, etc.)
- [ ] `refactor` -- Code restructuring (no behavior change)
- [x] `test` -- Adding or updating tests
- [ ] `ci` -- CI/CD pipeline changes
- [ ] `build` -- Build system or dependency changes
- [ ] `revert` -- Reverts a previous commit
- [ ] `style` -- Code style (formatting, whitespace)

### Modifiers

- [ ] Breaking change (`!`) -- This change breaks backward compatibility

## Changes Made

- `assets/workspace/.devcontainer/devcontainer.json`
- Update `postCreateCommand`, `postStartCommand`, and
`postAttachCommand` to run scripts via `/bin/bash`.
- Keep existing script paths unchanged while making command execution
more robust.
- `tests/test_integration.py`
- Update expectations for `postAttachCommand` and `postCreateCommand` to
include `/bin/bash`.
- Add `test_devcontainer_json_post_start_command` to validate
`postStartCommand` uses the same bash-wrapped format.
- `CHANGELOG.md`
- Add an Unreleased `### Fixed` entry for issue `#204` describing the
lifecycle-command fix and user-visible error resolved.

## Changelog Entry

### Fixed
- **Devcontainer lifecycle commands fail in mock-up folders with crun
getcwd error**
([#204](#204))
- Run post-create, post-start, and post-attach commands via `/bin/bash`
in `devcontainer.json` for stable command resolution on attach
- Prevent attach-time failure where OCI runtime reports `getcwd: No such
file or directory`
  - Update tests in `tests/test_integration.py`

## Testing

- [ ] Tests pass locally (`just test`)
- [ ] Manual testing performed (describe below)

### Manual Testing Details

Added/updated integration assertions in `tests/test_integration.py`:
- `postAttachCommand` expected value includes `/bin/bash`
- `postCreateCommand` expected value includes `/bin/bash`
- new `postStartCommand` assertion

## Checklist

- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly (edit
`docs/templates/`, then run `just docs`)
- [x] I have updated `CHANGELOG.md` in the `[Unreleased]` section (and
pasted the entry above)
- [x] My changes generate no new warnings or errors
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] New and existing unit tests pass locally with my changes
- [x] Any dependent changes have been merged and published

## Additional Notes

Closes #204

Refs: #204

c-vigo commented Feb 25, 2026

Opened follow-up bug issue #202 for the host-dependent BATS failure (detect_editor_cli neither-path case).

Root cause: test assumes /usr/bin:/bin has no code, which is false on some hosts.

Planned fix in #202: isolate PATH with an empty temp dir and invoke /bin/bash under env -i so detection is deterministic.

Fixed in #205

c-vigo commented Feb 25, 2026

Running from the host directly:

carlosvigo@vigolaptop:~/Documents/vigOS/tmp$ just devc-remote ksb
bash scripts/devc-remote.sh ksb
ℹ  Detecting local editor CLI...
✓  Using cursor
ℹ  Checking SSH connectivity to ksb...
✓  SSH connection OK
ℹ  Running pre-flight checks on ksb...
✗  No .devcontainer/ found in ~. Is this a devcontainer-enabled project?
error: Recipe `devc-remote` failed on line 382 with exit code 1

Running from inside the devcontainer:

root@0491bc9d9819:/workspace/tmp# just devc-remote ksb
bash scripts/devc-remote.sh ksb
ℹ  Detecting local editor CLI...
✓  Using cursor
ℹ  Checking SSH connectivity to ksb...
✗  Cannot connect to ksb. Check your SSH config and network.
error: Recipe `devc-remote` failed on line 382 with exit code 1

I assume I am supposed to run from the devcontainer, but my SSH configuration (and maybe also the credentials) are not being forwarded.

…tion' into bugfix/202-deterministic-detect-editor-cli-test
## Description

Make the `detect_editor_cli` negative-path BATS test deterministic across host environments where `/usr/bin/code` may be present.

## Type of Change

- [ ] `feat` -- New feature
- [x] `fix` -- Bug fix
- [ ] `docs` -- Documentation only
- [ ] `chore` -- Maintenance task (deps, config, etc.)
- [ ] `refactor` -- Code restructuring (no behavior change)
- [ ] `test` -- Adding or updating tests
- [ ] `ci` -- CI/CD pipeline changes
- [ ] `build` -- Build system or dependency changes
- [ ] `revert` -- Reverts a previous commit
- [ ] `style` -- Code style (formatting, whitespace)

### Modifiers

- [ ] Breaking change (`!`) -- This change breaks backward compatibility

## Changes Made

- `tests/bats/devc-remote.bats`
  - Replace `PATH="/usr/bin:/bin"` assumption with a temp empty PATH directory.
  - Execute via `/bin/bash "$DEVC_REMOTE"` under `env -i` to avoid shebang/PATH lookup side effects.
  - Clean up temporary directory at test end.

## Changelog Entry

No changelog needed: this is an internal test determinism fix with no user-facing behavior change.

## Testing

- [x] Tests pass locally (`just test`)
- [ ] Manual testing performed (describe below)

### Manual Testing Details

N/A

## Checklist

- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly (edit `docs/templates/`, then run `just docs`)
- [ ] I have updated `CHANGELOG.md` in the `[Unreleased]` section (and pasted the entry above)
- [x] My changes generate no new warnings or errors
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published

## Additional Notes

Validated with:
- `npx bats tests/bats/devc-remote.bats -f "detect_editor_cli fails when neither cursor nor code in PATH"`
- `npx bats tests/bats/devc-remote.bats`

Refs: #202
Added an import statement for '.devcontainer/justfile.worktree' to the main justfile, allowing for improved workspace configuration management.
…remote-devc-orchestration

# Conflicts:
#	CHANGELOG.md
#	CONTRIBUTE.md
#	README.md
#	scripts/manifest.toml
… to devcontainer.json (#207)

## Description

When a user's Cursor/VS Code settings configure `terminal.integrated.defaultProfile.linux` to a shell not present in the container image (e.g. `zsh`), the Agent chat shell fails with `forkpty(3) failed` and the extension host times out. This adds a `"terminal.integrated.defaultProfile.linux": "bash"` setting to the devcontainer.json template so the container always overrides the user's host-side preference.

## Type of Change

- [x] `fix` -- Bug fix

### Modifiers

- [ ] Breaking change (`!`) -- This change breaks backward compatibility

## Changes Made

- Add `"terminal.integrated.defaultProfile.linux": "bash"` to `assets/workspace/.devcontainer/devcontainer.json` settings
- Add BATS test verifying the setting is present in the template
- Update CHANGELOG.md

## Changelog Entry

### Fixed

- **Cursor Agent shell fails with forkpty(3) when host sets zsh as default terminal profile** ([#206](#206))
  - Add `terminal.integrated.defaultProfile.linux: "bash"` to devcontainer.json template settings
  - Prevents user's host-side shell preference from leaking into the container

## Testing

- [x] Tests pass locally (`npx bats tests/bats/init-workspace.bats`)
- [x] Manual testing performed (describe below)

### Manual Testing Details

- Verified the devcontainer.json template is valid JSON after the change
- Confirmed the new BATS test fails before the fix (RED) and passes after (GREEN)

## Checklist

- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my code
- [x] I have updated `CHANGELOG.md` in the `[Unreleased]` section (and pasted the entry above)
- [x] My changes generate no new warnings or errors
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes

## Additional Notes

The container image ships `bash` but not `zsh`. Users with `"terminal.integrated.defaultProfile.linux": "zsh"` in their global Cursor settings hit this silently. The devcontainer.json `customizations.vscode.settings` override ensures the container always uses `bash` regardless of the host-side preference.
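
A check along the lines of the added test could look like the Python sketch below. The actual test in this change is written in BATS; the helper name `default_linux_profile` and the inline template fragment are hypothetical, and only the `customizations.vscode.settings` location and the setting key come from the description above.

```python
import json
from typing import Optional

# Minimal stand-in for the relevant part of the devcontainer.json template.
TEMPLATE = """
{
  "customizations": {
    "vscode": {
      "settings": {
        "terminal.integrated.defaultProfile.linux": "bash"
      }
    }
  }
}
"""


def default_linux_profile(config: dict) -> Optional[str]:
    """Return the container-level default terminal profile, if any is set."""
    settings = config.get("customizations", {}).get("vscode", {}).get("settings", {})
    return settings.get("terminal.integrated.defaultProfile.linux")


print(default_linux_profile(json.loads(TEMPLATE)))
```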

Refs: #206
Opt-in Tailscale SSH setup for devcontainers. Both subcommands are
silent no-ops when TAILSCALE_AUTHKEY is unset.

Refs: #208
c-vigo and others added 30 commits March 26, 2026 16:03
)

## Description

Smoke-test orchestration failed for `0.3.1-rc24` because the workspace
`release.yml` passed `needs.core.outputs.tag_already_exists` (a string
from job outputs) into `release-publish.yml`, which declares that input
as `type: boolean`. GitHub Actions rejects the reusable workflow call,
so the Publish job never spawns sub-jobs and the downstream Release run
fails. This PR coerces the value with `== 'true'` so the call receives a
real boolean.
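
The type mismatch can be mimicked in Python, purely as an analogy: job outputs in GitHub Actions are strings, while the reusable workflow input is declared as a boolean. The function `call_release_publish` below is a hypothetical stand-in for the typed `workflow_call` input, not part of any workflow.

```python
def call_release_publish(tag_already_exists: bool) -> str:
    """Stand-in for a reusable workflow with a `type: boolean` input."""
    if not isinstance(tag_already_exists, bool):
        raise TypeError("tag_already_exists must be a boolean")
    return f"tag_already_exists={tag_already_exists}"


job_output = "false"  # job outputs arrive as strings

try:
    call_release_publish(job_output)  # rejected: a string, not a boolean
except TypeError as exc:
    print(f"rejected: {exc}")

# Comparing against the literal 'true' yields a real boolean,
# which is what the `== 'true'` expression does in the workflow.
print(call_release_publish(job_output == "true"))
```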

## Type of Change

- [ ] `feat` -- New feature
- [x] `fix` -- Bug fix
- [ ] `docs` -- Documentation only
- [ ] `chore` -- Maintenance task (deps, config, etc.)
- [ ] `refactor` -- Code restructuring (no behavior change)
- [ ] `test` -- Adding or updating tests
- [ ] `ci` -- CI/CD pipeline changes
- [ ] `build` -- Build system or dependency changes
- [ ] `revert` -- Reverts a previous commit
- [ ] `style` -- Code style (formatting, whitespace)

### Modifiers

- [ ] Breaking change (`!`) -- This change breaks backward compatibility

## Changes Made

- `assets/workspace/.github/workflows/release.yml` — pass
`tag_already_exists: ${{ needs.core.outputs.tag_already_exists == 'true'
}}` into `release-publish.yml`.
- `CHANGELOG.md` — document the fix under `## [0.3.1] - TBD` →
**Fixed**.
- `assets/workspace/.devcontainer/CHANGELOG.md` — same entry (workspace
SSoT mirror).

## Changelog Entry

Pasted from `CHANGELOG.md` on this branch (under `## [0.3.1] - TBD` →
**Fixed**; this release branch uses the TBD section instead of `##
Unreleased`):

```markdown
- **Workspace release publish `tag_already_exists` input coercion** ([#451](#451))
  - Pass a boolean into `release-publish.yml` via `needs.core.outputs.tag_already_exists == 'true'` so `workflow_call` does not reject string `"true"`/`"false"` job outputs
```

## Testing

- [ ] Tests pass locally (`just test`)
- [ ] Manual testing performed (describe below)

### Manual Testing Details

N/A — workflow expression change only; behavior verified via RCA against
failed run logs. Follow-up validation is a new smoke-test dispatch after
template reaches the validation repo.

## Checklist

- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly (edit
`docs/templates/`, then run `just docs`)
- [ ] I have updated `CHANGELOG.md` in the `[Unreleased]` section (and
pasted the entry above)
- [x] My changes generate no new warnings or errors
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published

## Additional Notes

RCA draft for posting on the issue:
`docs/issues/issue-451-github-comment.md` (optional).

Refs: #451
# [Release 0.3.1](https://github.com/vig-os/devcontainer/releases/tag/0.3.1) - 2026-03-26

This PR prepares release 0.3.1 for merge to main.

## [0.3.1] - TBD

### Added

- **Split downstream release workflow with project-owned extension
hook** ([#326](#326))
- Add local `workflow_call` release phases (`release-core.yml`,
`release-publish.yml`) and a lightweight `release.yml` orchestrator in
`assets/workspace/.github/workflows/`
- Add `release_kind` support with candidate mode (`X.Y.Z-rcN`) and final
mode (`X.Y.Z`) in downstream release workflows
- Candidate mode now auto-computes the next RC tag, skips CHANGELOG
finalization/sync-issues, and publishes a GitHub pre-release
- Add project-owned `release-extension.yml` stub and preserve it during
`init-workspace.sh --force` upgrades
- Add `validate-contract` composite action for single-source contract
version validation
- Add downstream release contract documentation and GHCR extension
example in `docs/DOWNSTREAM_RELEASE.md`
- **`jq` in devcontainer image**
([#425](#425))
- Install the `jq` CLI in the GHCR image so containerized workflows
(e.g. `release-core` validate / downstream Release Core) can pipe JSON
through `jq`

### Changed

- **Dependabot dependency update batch**
([#302](#302),
[#303](#303),
[#305](#305),
[#306](#306),
[#307](#307),
[#308](#308),
[#309](#309))
- Bump `@devcontainers/cli` from `0.81.1` to `0.84.0` and `bats-assert`
from `v2.2.0` to `v2.2.4`
- Bump GitHub Actions: `actions/download-artifact` (`4.3.0` -> `8.0.1`),
`actions/github-script` (`7.1.0` -> `8.0.0`),
`actions/attest-build-provenance` (`3.0.0` -> `4.1.0`),
`actions/checkout` (`4.3.1` -> `6.0.2`)
- Bump release workflow action pins: `sigstore/cosign-installer`
(`4.0.0` -> `4.1.0`) and `anchore/sbom-action` (`0.22.2` -> `0.23.1`)
- **Dependabot dependency update batch**
([#314](#314),
[#315](#315),
[#316](#316),
[#317](#317))
- Bump GitHub Actions: `actions/attest-sbom` (`3.0.0` -> `4.0.0`),
`actions/upload-artifact` (`4.6.2` -> `7.0.0`),
`actions/create-github-app-token` (`2.2.1` -> `3.0.0`)
  - Bump `docker/login-action` from `3.7.0` to `4.0.0`
  - Bump `just` minor version from `1.46` to `1.47`
- **Node24-ready GitHub Actions pin refresh for shared composite
actions** ([#321](#321))
- Update Docker build path pins in `build-image`
(`docker/setup-buildx-action`, `docker/metadata-action`,
`docker/build-push-action`) to Node24-compatible releases
- Set `setup-env` default Node runtime to `24` and upgrade
`actions/setup-node`
- Align test composite actions with newer pins (`actions/checkout`,
`actions/cache`, `actions/upload-artifact`)
- **Smoke-test dispatch payload now carries source run traceability
metadata** ([#289](#289))
- Candidate release dispatches now include source repo/workflow/run/SHA
metadata plus a deterministic `correlation_id`
- Smoke-test dispatch receiver logs normalized source context, derives
source run URL when possible, and writes it to workflow summary output
- Release-cycle docs now define required vs optional dispatch payload
keys and the future callback contract path for `publish-candidate`
- **Smoke-test repository dispatch now runs for final releases too**
([#173](#173))
- `release.yml` now triggers the existing smoke-test dispatch contract
for both `candidate` and `final` release kinds
- Final release summaries and release-cycle documentation now reflect
dispatch behavior for both release modes
- **Workspace CI templates now use a single container-based workflow**
([#327](#327))
- Consolidate `assets/workspace/.github/workflows/ci.yml` as the
canonical CI workflow and remove the obsolete `ci-container.yml`
template
- Extract reusable `assets/workspace/.github/actions/resolve-image` and
run workspace release tests in the same containerized workflow model
- Update smoke-test and release-cycle documentation to reference the
single CI workflow contract
- **Final release now requires downstream RC pre-release gate**
([#331](#331))
- Add upstream final-release validation that requires a downstream
GitHub pre-release for the latest published RC tag
- Move smoke-test dispatch to a dedicated release job and include
`release_kind` in the dispatch payload
- Add downstream `repository-dispatch.yml` template that runs smoke
tests and creates pre-release/final release artifacts
- **Ship changelog into workspace payload and smoke-test deploy root**
([#333](#333))
- Sync canonical `CHANGELOG.md` into both workspace root and
`.devcontainer/` template paths
- Smoke-test dispatch now copies `.devcontainer/CHANGELOG.md` to
repository root so deploy output keeps a root changelog
- **Final release now publishes a GitHub Release with finalized notes**
([#310](#310))
- Add a final-only publish step in `.github/workflows/release.yml` that
creates a GitHub Release for `X.Y.Z`
- Source GitHub Release notes from the finalized `CHANGELOG.md` section
and fail the run if notes extraction or release publishing fails
- **Release dispatch and publish ordering hardened for 0.3.1**
([#336](#336))
- Make smoke-test dispatch fire-and-forget in
`.github/workflows/release.yml` and decouple rollback from downstream
completion timing
- Add bounded retries to the final-release downstream RC pre-release
gate API check
- Move final GitHub Release creation to the end of publish so artifact
publication/signing completes before release object creation
- Add concurrency control to
`assets/smoke-test/.github/workflows/repository-dispatch.yml` to prevent
overlapping dispatch races
- Handle smoke-test dispatch failures with a targeted issue while
avoiding destructive rollback after publish artifacts are already
released
- **Redesigned smoke-test dispatch release orchestration**
([#358](#358))
- Replace premature `publish-release` behavior with full downstream
orchestration: deploy-to-dev merge gate, `prepare-release.yml`, release
PR readiness/approval, and `release.yml` dispatch polling
- Add upstream failure issue reporting with job-phase results and
cleanup guidance when dispatch orchestration fails
- **Smoke-test release orchestration now runs as two phases**
([#402](#402))
- Keep `repository-dispatch.yml` focused on deploy/prepare/release-PR
readiness and move release dispatch to a dedicated merged-PR workflow
(`on-release-pr-merge.yml`)
- Add release-kind labeling and auto-merge enablement for release PRs,
and keep upstream failure notifications in both phases
- Remove release-branch upstream `CHANGELOG.md` sync from
`repository-dispatch.yml` (previously added in
[#358](#358))
- **Dependabot dependency update batch**
([#414](#414))
- Bump `github/codeql-action` from `4.32.6` to `4.34.1` and
`anchore/sbom-action` from `0.23.1` to `0.24.0`
- Bump `actions/cache` restore/save pins from `5.0.3` to `5.0.4` in
`sync-issues.yml`
- **Dependabot dependency update batch**
([#413](#413))
  - Bump `@devcontainers/cli` from `0.84.0` to `0.84.1`
- **cursor-agent install is now resilient to CDN failures**
([#434](#434))
  - Retries 3 times with backoff before giving up
  - Build succeeds without cursor-agent when Cursor's CDN is unavailable
- **Immutable GitHub releases, tag rulesets, and forward-fix policy**
([#446](#446))
- Final releases create a **draft** GitHub Release for human review
before publishing; rollback no longer deletes remote tags
- Release workflows skip redundant tag push when the tag already matches
the finalized commit; workspace `release-core` / `release-publish` and
smoke-test failure guidance updated accordingly
- Document tag rulesets, immutable releases, and recovery in
`docs/RELEASE_CYCLE.md`, `docs/DOWNSTREAM_RELEASE.md`, and
`docs/CROSS_REPO_RELEASE_GATE.md`
- **Container image tests expect current GitHub CLI minor line**
- Update `tests/test_image.py` `EXPECTED_VERSIONS["gh"]` to `2.89.` to
match the CLI shipped in the image

### Removed

- **PR Title Check GitHub Actions workflow**
([#444](#444))
- Remove `.github/workflows/pr-title-check.yml`; commit message rules
remain enforced via local hooks and `validate-commit-msg`
- Remove `--subject-only` from `validate-commit-msg` (it existed only
for PR title CI)

### Fixed

- **Smoke-test deploy restores workspace CHANGELOG for prepare-release**
([#417](#417))
- Add `prepare-changelog unprepare` to rename the top `## [semver] - …`
heading to `## Unreleased`
- `init-workspace.sh --smoke-test` copies `.devcontainer/CHANGELOG.md`
into workspace `CHANGELOG.md` and runs unprepare; remove duplicate remap
from smoke-test dispatch workflow
- **Release app permission docs now include downstream workflow dispatch
requirements**
([#397](#397))
- Update `docs/RELEASE_CYCLE.md` to require `Actions` read/write for
`RELEASE_APP` on the validation repository
- Clarify this is required so downstream `repository-dispatch.yml` can
trigger release orchestration workflows via `workflow_dispatch`
- **Smoke-test dispatch no longer fails on release PR self-approval**
([#402](#402))
- Remove bot self-approval from `repository-dispatch.yml` and replace
with release-kind labeling plus auto-merge enablement
- Remove in-job polling for release PR merge and downstream release
execution from phase 1 orchestration
- Phase 2 (`on-release-pr-merge.yml`) fails validation unless the merged
release PR has `release-kind:final` or `release-kind:candidate`
- **Sync-main-to-dev PRs now trigger CI reliably in downstream repos**
([#398](#398))
- Replace API-based sync branch creation with `git push` in
`assets/workspace/.github/workflows/sync-main-to-dev.yml`
- **Sync-main-to-dev no longer dispatches CI via workflow_dispatch**
([#405](#405))
- `workflow_dispatch` runs are omitted from the PR status check rollup,
so they do not satisfy branch protection on the sync PR
- Remove the post-PR `gh workflow run ci.yml` step and drop `actions:
write` from the sync job in `.github/workflows/sync-main-to-dev.yml` and
`assets/workspace/.github/workflows/sync-main-to-dev.yml`
- **Sync-main-to-dev conflict detection uses merge-tree**
([#410](#410))
- Replace working-tree trial merge with `git merge-tree --write-tree` so
clean merges are not mislabeled as conflicts
- Enable auto-merge when dev merges cleanly with main; print merge-tree
output on conflicts; fail the step on unexpected errors
- **Smoke-test release phase 2 branch-not-found failure**
([#419](#419))
- Merge phase 2 (`on-release-pr-merge.yml`) back into
`repository-dispatch.yml` so the release runs while `release/<version>`
still exists, matching the normal release flow
  - Remove `on-release-pr-merge.yml` from the smoke-test template
- **Pinned commit-action to v0.2.0**
([#354](#354))
- Updated workflow pins from `vig-os/commit-action@c0024cb` (v0.1.5) to
`1bc004353d08d9332a0cb54920b148256220c8e0` (v0.2.0) in release,
sync-issues, prepare-release, and smoke-test workflows
- Upstream v0.2.0 adds bounded retry with exponential backoff for
transient GitHub API failures (configurable `MAX_ATTEMPTS` and delay
bounds)
- Efficient multi-file commits via `createTree` inline content for text
files, binary blobs only when needed, and chunked tree creation for
large change sets
- Exports `isBinaryFile`, `getFileMode`, and `TREE_ENTRY_CHUNK_SIZE` for
library use; sequential binary blob creation to reduce secondary
rate-limit bursts

- **Release finalization now commits generated docs and refreshes PR
  content** ([#300](#300))
  - Final release automation regenerates docs before committing so
    pre-commit `generate-docs` does not fail CI with tracked file diffs
  - Release PR body is refreshed from finalized `CHANGELOG.md`
- **Release attestation warnings reduced by granting artifact metadata
  permission** ([#348](#348))
  - Add `artifact-metadata: write` to the release publish job so
    attestation steps can persist metadata storage records
  - Keep `actions/attest`-based SBOM attestation path and remove
    missing-permission warnings from publish runs
- **Smoke-test dispatch deploy now repairs workspace ownership before
  changelog copy**
  ([#352](#352))
  - Add a write probe and conditional `sudo chown -R` in
    `assets/smoke-test/.github/workflows/repository-dispatch.yml` after
    installer execution
  - Prevent `Permission denied` failures when copying
    `.devcontainer/CHANGELOG.md` to repository root in GitHub-hosted runner
    jobs
- **Smoke-test release lookup no longer treats missing tags as existing
  releases** ([#355](#355))
  - Change `assets/smoke-test/.github/workflows/repository-dispatch.yml`
    to branch on `gh api` exit status when querying `releases/tags/<tag>`
  - Ensure missing release tags follow the create path instead of failing
    with `prerelease=null` mismatch
- **Bounded retry added for network-dependent setup and prepare-release
  calls** ([#357](#357))
  - Replace shell-based retry helper with pure Python `retry` CLI in
    `vig-utils` (`packages/vig-utils/src/vig_utils/retry.py`)
  - Update this repository's CI workflows to call `uv run retry` after
    `setup-env` dependency sync
  - Update downstream workflow templates to call `retry` directly in
    devcontainer jobs and remove `source` lines
  - Ensure downstream containerized jobs resolve image tags from `.vig-os`
    instead of hardcoded `latest`
  - Bundle idempotency guards for branch/PR/tag/release creation paths to
    keep retried network calls safe on reruns
  - Remove synced `retry.sh` artifacts and BATS retry tests in favor of
    `vig-utils` pytest coverage
- **Release workflow no longer fails when retry tooling is unavailable**
  ([#365](#365))
  - Extend `.github/actions/setup-env/action.yml` with a reusable `retry`
    shell function exported via `BASH_ENV` as the retry single source of
    truth
  - Add `setup-env` input support for uv-only usage by allowing Python
    setup to be disabled when jobs only need retry tooling
  - Switch release workflow retry calls from `uv run retry` to shared
    `retry` and remove duplicated inline retry implementations
- **Upstream sync workflows no longer depend on pre-published GHCR image
  tags** ([#367](#367))
  - Remove upstream `.vig-os` files at repository root and
    `assets/smoke-test/` to eliminate downstream-only configuration from
    upstream CI
  - Refactor `.github/workflows/sync-issues.yml` and
    `.github/workflows/sync-main-to-dev.yml` to run natively on runners via
    `./.github/actions/setup-env` instead of `resolve-image` + `container`
- **Release test-image setup now recovers from uv sync crashes**
  ([#370](#370))
  - Harden `.github/actions/setup-env/action.yml` to retry `uv sync
    --frozen --all-extras` once after clearing uv cache and removing stale
    `.venv`
  - Prevent repeat release test failures when `setup-env` is executed
    multiple times in the same job
- **Release setup-env no longer self-sources retry helper via BASH_ENV**
  ([#374](#374))
  - Guard the retry-helper merge logic in
    `.github/actions/setup-env/action.yml` to skip merging when
    `PREV_BASH_ENV` already equals `RETRY_HELPER`
  - Prevent infinite `source` recursion and exit 139 crashes when
    `setup-env` is invoked multiple times in one job
- **Smoke-test dispatch now checks out repository before local setup
  action** ([#376](#376))
  - Add `actions/checkout` to the `smoke-test` job in
    `.github/workflows/release.yml` before invoking
    `./.github/actions/setup-env`
  - Prevent dispatch failures caused by missing local action metadata
    (`action.yml`) in a fresh job workspace
- **Workspace resolve-image jobs now checkout local action metadata**
  ([#380](#380))
  - Update `sparse-checkout` in workspace `resolve-image` jobs to include
    `.github/actions/resolve-image` in addition to `.vig-os`
  - Prevent CI failures in downstream deploy PRs where local composite
    actions were missing from sparse checkout
- **Smoke-test dispatch gh jobs now set explicit repo context**
  ([#386](#386))
  - Add job-level `GH_REPO: ${{ github.repository }}` to
    `cleanup-release`, `trigger-prepare-release`, `ready-release-pr`, and
    `trigger-release` in
    `assets/smoke-test/.github/workflows/repository-dispatch.yml`
  - Prevent `gh` CLI failures (`fatal: not a git repository`) in runner
    jobs that do not perform `actions/checkout`
- **Smoke-test release orchestration now validates workflow contract
  before dispatch**
  ([#389](#389))
  - Add a preflight check that verifies `prepare-release.yml` and
    `release.yml` are resolvable on dispatch ref `dev` before downstream
    orchestration starts
  - Dispatch and polling now use explicit ref/branch context (`--ref dev`
    / `--branch dev`) to avoid default-branch workflow registry drift and
    `404 workflow not found` failures
- **Smoke-test preflight now uses gh CLI ref-compatible workflow
  validation** ([#392](#392))
  - Update `assets/smoke-test/.github/workflows/repository-dispatch.yml`
    preflight checks to call `gh workflow view` with `--yaml` when `--ref`
    is set
  - Prevent false preflight failures caused by newer GitHub CLI argument
    validation before `prepare-release` dispatch
- **Downstream release workflow templates hardened for smoke-test
  orchestration**
  ([#394](#394))
  - Add missing `git config --global --add safe.directory
    "$GITHUB_WORKSPACE"` in containerized release and sync jobs that run git
    after checkout
  - Decouple `release.yml` rollback container startup from
    `needs.core.outputs.image_tag` by resolving the image in a dedicated
    `resolve-image` job
  - Add explicit release caller/reusable workflow permissions for
    `actions` and `pull-requests` operations, and update dispatch header
    comments to reference only current CI workflows
- **Workspace containerized workflows now pin bash for run steps**
  ([#395](#395))
  - Set `defaults.run.shell: bash` in containerized workspace release and
    prepare jobs so `set -euo pipefail` scripts do not execute under POSIX
    `sh`
  - Prevent downstream smoke-test failures caused by `set: Illegal option
    -o pipefail` in container jobs
- **Downstream release templates now require explicit app tokens for
  write paths**
  ([#400](#400))
  - Update `assets/workspace/.github/workflows/prepare-release.yml`,
    `release-core.yml`, `release-publish.yml`, `release.yml`, and
    `sync-issues.yml` to remove `github.token` fallback from protected write
    operations
  - Route protected branch/ref writes through Commit App tokens and
    release orchestration/issue operations through Release App tokens
  - Document downstream token requirements in `docs/DOWNSTREAM_RELEASE.md`
    and `docs/CROSS_REPO_RELEASE_GATE.md`
  - Use `github.token` specifically for Actions cache deletion in
    `sync-issues.yml` because that API path requires explicit `actions:
    write` job token scope
  - Use Commit App credentials for rollback checkout in `release.yml` so
    rollback branch/tag writes can still bypass protected refs
- **setup-env retries uv install on transient GitHub Releases download
  failures** ([#407](#407))
  - Add `continue-on-error` plus a delayed second attempt for
    `astral-sh/setup-uv` in `.github/actions/setup-env/action.yml`
  - Reduce flaky release publish failures when GitHub CDN returns
    transient HTTP errors for uv release assets
- **Smoke-test deploy keeps workspace scaffold as root CHANGELOG**
  ([#403](#403))
  - Stop overwriting `CHANGELOG.md` with a minimal stub in
    `assets/smoke-test/.github/workflows/repository-dispatch.yml`
  - Require the workspace `CHANGELOG.md` from `init-workspace` so
    downstream `prepare-release` validation matches shipped layout
  - When the first changelog section is `## [X.Y.Z] - …` (TBD or a release
    date), remap that top version header to `## Unreleased` so downstream
    `prepare-release` can run
- **Smoke-test dispatch release validate no longer runs docker inside
  devcontainer**
  ([#421](#421))
  - Remove redundant `docker manifest inspect` step from
    `release-core.yml` validate job (container image is already proof of
    accessibility; `resolve-image` validates on the runner)
  - Set `GH_REPO` for rollback `gh issue create` in workspace
    `release.yml` when git checkout is skipped
- **Container image tests expect current uv minor line**
  ([#423](#423))
  - Update `tests/test_image.py` `EXPECTED_VERSIONS["uv"]` to match uv
    0.11.x from the latest release install path in the image build
- **Container image tests expect current just minor line**
  ([#423](#423))
  - Update `tests/test_image.py` `EXPECTED_VERSIONS["just"]` to match just
    1.48.x from the latest release install path in the image build
- **Smoke-test dispatch approves release PR before downstream release**
  ([#430](#430))
  - Grant `pull-requests: write` on `ready-release-pr` and approve with
    `github.token` (`github-actions[bot]`)
  - Satisfy `release-core.yml` approval gate without the release app
    self-approving its own PR
- **commit-action retries enabled for transient git ref API failures**
  ([#436](#436))
  - Set `MAX_ATTEMPTS: "3"` on every `vig-os/commit-action` step so v0.2.0
    bounded retry actually runs (default was 1)
  - Covers smoke-test deploy, prepare-release, release finalization,
    sync-issues, and workspace templates
- **Release validation fails when bot approves PR**
  ([#438](#438))
  - Add fallback to individual PR review check when `reviewDecision` is
    empty (bot approvals not counted by branch protection)
- **Downstream candidate RC tag can match upstream dispatch**
  ([#441](#441))
  - Workspace `release.yml` / `release-core.yml` accept optional
    `rc-number` so candidate tags are not always recomputed from local tags
    only
  - Smoke-test `repository-dispatch.yml` exposes `base_version` and
    `rc_number` job outputs for orchestration that calls workspace
    `release.yml`
- **Release validate fails early when GitHub Release already exists**
  ([#443](#443))
  - Validate job in `.github/workflows/release.yml` queries `GET
    /repos/.../releases/tags/<PUBLISH_VERSION>` with retries and classifies
    errors like the downstream RC gate; only a documented not-found response
    is treated as “no release,” and ambiguous API failures fail closed
    before build/sign/publish
  - Publish job uses the same existence checks before and after `gh
    release create` instead of `gh release view` with discarded stderr
- **Release tag resolution and GitHub Release view retries**
  ([#446](#446))
  - Fall back to plain `refs/tags/<tag>` when the peeled ref is empty
    (lightweight remote tags) in `.github/workflows/release.yml`,
    `release-core.yml`, and `release-publish.yml`
  - Use one retried `gh release view` in workspace `release-publish.yml`
    so draft/prerelease skip paths parse JSON from the same successful
    response
- **Workspace release publish `tag_already_exists` input coercion**
  ([#451](#451))
  - Pass a boolean into `release-publish.yml` via
    `needs.core.outputs.tag_already_exists == 'true'` so `workflow_call`
    does not reject string `"true"`/`"false"` job outputs

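The peeled-ref fallback described in the #446 entry can be sketched in shell as follows. This is a minimal illustration, not the workflow code: `resolve_tag_commit` is a hypothetical helper name, and it assumes the input is the two-column `<sha> <ref>` output that `git ls-remote` prints. Annotated tags produce an extra `refs/tags/<tag>^{}` line carrying the commit; lightweight tags only have the plain ref.

```bash
# Sketch only: choose the commit SHA for a tag from `git ls-remote` output on stdin.
# resolve_tag_commit is a hypothetical name, not taken from the workflows.
resolve_tag_commit() {
    local tag="$1"
    awk -v peeled="refs/tags/${tag}^{}" -v plain="refs/tags/${tag}" '
        $2 == peeled { p = $1 }   # annotated tag: the peeled line carries the commit
        $2 == plain  { q = $1 }   # lightweight tag: the plain ref is the commit
        END {
            if (p != "") print p
            else if (q != "") print q
            else exit 1           # tag is not present in the input
        }
    '
}

# Example call (the remote name "origin" and the tag are assumptions):
#   git ls-remote --tags origin | resolve_tag_commit v1.2.3
```

Preferring the peeled line and only then falling back to the plain ref keeps annotated and lightweight tags on the same code path.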
### Security

- **Smoke-test dispatch workflow permissions now follow least
  privilege** ([#340](#340))
  - Reduce `assets/smoke-test/.github/workflows/repository-dispatch.yml`
    workflow token permissions from write to read by default
  - Grant `contents: write` only to `publish-release`, the single job that
    creates or edits GitHub Releases

SSH non-login shells don't source .bashrc/.profile, so tools
installed to ~/.local/bin (like uv) are not found during bootstrap.

Refs: #70
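
One way to work around the non-login-shell PATH problem described above is to prepend `~/.local/bin` explicitly before calling tools such as uv. The helper below is a sketch under that assumption; `prepend_path_once` is a hypothetical name and the commit's actual fix may differ.

```bash
# Sketch only: make sure a directory is on PATH exactly once.
# prepend_path_once is a hypothetical helper, not taken from devc-remote.sh.
prepend_path_once() {
    local dir="$1"
    case ":$PATH:" in
        *":$dir:"*) ;;                # already present, leave PATH unchanged
        *) PATH="$dir:$PATH" ;;       # otherwise put it first
    esac
    export PATH
}

# Non-login SSH shells skip .bashrc/.profile, so add the user-local bin directory by hand.
prepend_path_once "$HOME/.local/bin"
```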
…project files

setup-tailscale.sh: replace underscores with hyphens in auto-generated
hostname — DNS labels cannot contain underscores.

init-workspace.sh: add pyproject.toml, uv.lock, .python-version to
PRESERVE_FILES so --force upgrades don't overwrite project config.

Refs: #70

devc-remote.sh now reads CLAUDE_CODE_OAUTH_TOKEN from macOS keychain
(service: devc-remote) when not set as env var. First step toward
full secret resolution chain (#464).

Refs: #70

Move Tailscale OAuth credentials (TS_CLIENT_ID, TS_CLIENT_SECRET) to
the same keychain-fallback pattern as Claude OAuth token. All three
secrets now resolve at deploy time: env var → macOS keychain → skip.

No secrets need to be in shell profile env vars anymore.

Refs: #70
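
The resolution order named above (env var, then macOS keychain, then skip) might look like the sketch below. The keychain service name `devc-remote` comes from the commit message; using the variable name as the keychain account is an assumption, and `resolve_secret` is a hypothetical helper name.

```bash
# Sketch only: resolve a secret as env var -> macOS keychain -> skip.
resolve_secret() {
    local name="$1" value
    # 1. An exported environment variable wins.
    if value=$(printenv "$name") && [ -n "$value" ]; then
        printf '%s\n' "$value"
        return 0
    fi
    # 2. macOS keychain lookup. Service "devc-remote" is from the commit message;
    #    the account (-a) being the variable name is an assumption.
    if command -v security >/dev/null 2>&1 &&
       value=$(security find-generic-password -s devc-remote -a "$name" -w 2>/dev/null) &&
       [ -n "$value" ]; then
        printf '%s\n' "$value"
        return 0
    fi
    # 3. Nothing found: the caller skips the feature.
    return 1
}

for secret in CLAUDE_CODE_OAUTH_TOKEN TS_CLIENT_ID TS_CLIENT_SECRET; do
    if resolve_secret "$secret" >/dev/null; then
        echo "$secret: resolved"
    else
        echo "$secret: not set, skipping"
    fi
done
```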
Extract sanitize_dns_label() helper and use it in wait_for_tailscale()
so the pattern matches the sanitized hostname from setup-tailscale.sh.

Refs: #70
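
A minimal sketch of such a helper follows. Only the underscore-to-hyphen rule is taken from the commit messages; the function body, the example project name, and the `devc-` prefix are assumptions for illustration.

```bash
# Sketch only: DNS labels cannot contain underscores, so map them to hyphens.
sanitize_dns_label() {
    printf '%s\n' "$1" | tr '_' '-'
}

# Hypothetical usage: the hostname that gets registered and the pattern the wait
# loop searches for both go through the same helper, so they cannot drift apart.
project="my_project"                                   # example value
ts_hostname="devc-$(sanitize_dns_label "$project")"    # "devc-" prefix is an assumption
echo "$ts_hostname"                                    # prints devc-my-project
```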
Previously skipped compose up when container was already running,
causing injected secrets to never reach the container. Now always
runs compose up -d which is idempotent and auto-recreates only
when config (env vars, devices, etc.) has changed.

Refs: #70

uv sync failure (corrupt lockfile, version mismatch) was killing the
entire lifecycle script before Tailscale and Claude Code could install.
Sync is now non-fatal — a warning is printed but setup continues.

Refs: #70
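
The non-fatal behaviour described above can be sketched with a small wrapper. `run_nonfatal` is a hypothetical name; the lifecycle script may simply use an inline `||` instead.

```bash
# Sketch only: run a command, warn on failure, and always report success so that
# a caller running under `set -e` continues with the remaining setup steps.
run_nonfatal() {
    if ! "$@"; then
        printf 'warning: "%s" failed; continuing setup\n' "$*" >&2
    fi
    return 0
}

# Hypothetical usage inside a lifecycle script:
#   run_nonfatal uv sync
#   install_tailscale      # still runs even if the sync failed
```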
compose up -d silently recreates containers when config changes,
but CONTAINER_FRESH stayed 0 because it only checked the pre-up
state. Now compares container ID before and after compose up to
detect recreates and run post-create lifecycle accordingly.

Refs: #70

podman compose prefixes output with >>>> banner lines which polluted
the container ID comparison, making recreated containers look identical.

Refs: #70
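
The before/after comparison and the banner filtering from the two commits above can be sketched together. Function names and the commented usage are assumptions; only `CONTAINER_FRESH` and the `>>>>` banner prefix come from the commit messages.

```bash
# Sketch only: drop ">>>>" banner lines and blank lines, keep the last remaining line.
extract_container_id() {
    printf '%s\n' "$1" | sed -e '/^>>>>/d' -e '/^[[:space:]]*$/d' | tail -n 1
}

# Succeeds when the container ID after `compose up -d` differs from the one before.
# An empty "before" with a non-empty "after" also counts as a new container.
container_recreated() {
    local before after
    before=$(extract_container_id "$1")
    after=$(extract_container_id "$2")
    [ -n "$after" ] && [ "$before" != "$after" ]
}

# Hypothetical usage (the service name "devcontainer" is an assumption):
#   before=$(podman compose ps -q devcontainer 2>&1)
#   podman compose up -d
#   after=$(podman compose ps -q devcontainer 2>&1)
#   if container_recreated "$before" "$after"; then CONTAINER_FRESH=1; fi
```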
Containers often have clock skew causing apt-get update to fail with
"Release file not valid yet". Add fallback manual install with
Acquire::Check-Valid-Until=false. Also remove debug log line.

Refs: #70

…skew

apt-get update on all repos fails when container clock is skewed.
Install directly from Tailscale repo with Dir::Etc::sourcelist
to only update that single source, bypassing clock-skew on system repos.

Refs: #70

Same issue as Tailscale: container clock skew breaks apt-get update
on system repos. Install Node.js from nodesource repo only.

Refs: #70

The nodesource setup_lts.x script runs apt-get update on all repos,
failing with clock skew. Add the repo GPG key and source list manually,
then update only that repo.

Refs: #70

gpg refuses to overwrite an existing file without --yes, causing
the nodesource key import to fail on container re-creation.

Refs: #70

nodesource nodejs depends on system python3, so we need all repos
updated with Check-Valid-Until=false, not just the nodesource repo.

Refs: #70

apt returns 100 when any repo has clock issues even with
Check-Valid-Until=false. The repos we need still update successfully
so ignore the exit code.

Refs: #70
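
The clock-skew workaround from the last few commits can be sketched as a tolerant update wrapper. This version is slightly narrower than the commit describes: it ignores only exit code 100 rather than every failure. `apt_update_tolerant` and the `APT_GET` override are hypothetical names added so the function can be exercised without root.

```bash
# Sketch only: run `apt-get update` with the Valid-Until check disabled and
# tolerate exit code 100, which apt returns when some repos fail to refresh.
apt_update_tolerant() {
    local rc=0
    # APT_GET is a hypothetical override for testing; the default is the real apt-get.
    "${APT_GET:-apt-get}" update -o Acquire::Check-Valid-Until=false "$@" || rc=$?
    if [ "$rc" -eq 100 ]; then
        echo "warning: apt-get update exited 100 (possibly clock skew); continuing" >&2
        return 0
    fi
    return "$rc"
}

# Hypothetical usage in a setup script:
#   apt_update_tolerant
#   apt-get install -y nodejs
```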
…e user

Mirrors local dev environment aliases. cl=claude, cld=claude
--dangerously-skip-permissions. Added to both claude user .bashrc
and root .bashrc for consistent DX regardless of SSH user.

Refs: #70

Tailscale SSH sessions don't inherit compose env vars. The wrapper
now reads CLAUDE_CODE_OAUTH_TOKEN from /proc/1/environ when not
in the current environment, matching the claude user .bashrc pattern.

Refs: #70
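
Reading a variable from PID 1's environment, as the wrapper above is described as doing, can be sketched like this. `read_env_from_proc` is a hypothetical name, the optional second argument exists only to make the function testable, and values containing newlines are not handled.

```bash
# Sketch only: print the value of NAME from a NUL-separated environ file.
read_env_from_proc() {
    local name="$1" environ_file="${2:-/proc/1/environ}"
    [ -r "$environ_file" ] || return 1
    # Entries are NUL-separated KEY=VALUE pairs; turn them into lines and match the key.
    tr '\000' '\n' < "$environ_file" | sed -n "s/^${name}=//p" | head -n 1
}

# Fall back to PID 1's environment only when the variable is not already set.
if [ -z "${CLAUDE_CODE_OAUTH_TOKEN:-}" ]; then
    token=$(read_env_from_proc CLAUDE_CODE_OAUTH_TOKEN) &&
        [ -n "$token" ] &&
        export CLAUDE_CODE_OAUTH_TOKEN="$token"
fi
```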
runuser without --pty doesn't allocate a terminal, causing Claude
Code's interactive TUI to exit immediately. Add --pty flag.

Refs: #70

Copies ~/.claude/{CLAUDE.md,settings.json,commands/} into the
container's claude user home after lifecycle scripts run. Gives
the remote Claude Code the same global instructions, permissions,
and custom commands as the local dev environment.

Refs: #70

Pre-create .claude.json with hasCompletedOnboarding=true so the
interactive TUI doesn't show the login method selection screen
when CLAUDE_CODE_OAUTH_TOKEN is already set.

Refs: #70

hasCompletedOnboarding alone was not enough; Claude Code also
checks hasCompletedAuthFlow before skipping the interactive
login method selection screen.

Refs: #70

setup-claude.sh now creates settings.json with additionalDirectories
pointing to the workspace project and skipDangerousModePermissionPrompt
so Claude Code trusts the workspace without interactive prompts.

Refs: #70

Instead of pre-configuring additionalDirectories, pass the current
working directory via --add-dir so Claude Code always trusts wherever
it's launched from.

Refs: #70

… settings

Claude Code checks hasTrustDialogAccepted in two places: the global
.claude.json (projects dict keyed by path) and the per-project
settings.json. Set both to avoid the interactive trust prompt on
first run.

Refs: #70
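
Taken together, the last few commits describe a global `.claude.json` that carries the onboarding, auth-flow, and per-project trust flags. The sketch below writes such a file. The key names are the ones quoted in the commit messages; the exact nesting under `projects`, the function name, and the example paths are assumptions. The per-project settings.json is left out because the commits do not give its location. The sketch overwrites any existing file and assumes the project path contains no double quotes or backslashes.

```bash
# Sketch only: pre-seed the global Claude Code state file for a given home directory.
write_claude_state() {
    local home_dir="$1" project_dir="$2"
    # The heredoc expands $project_dir, which becomes the key in the projects dict.
    cat > "$home_dir/.claude.json" <<EOF
{
  "hasCompletedOnboarding": true,
  "hasCompletedAuthFlow": true,
  "projects": {
    "$project_dir": {
      "hasTrustDialogAccepted": true
    }
  }
}
EOF
}

# Hypothetical usage (both paths are example values):
#   write_claude_state /home/claude /workspace/my-project
```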