fix: CI paths filter and workflow_dispatch#12
Conversation
- Replace "dev-unknown" fallback with null to prevent false update notifications for dev agents when version string can't be fetched - Add missing Prisma migration for latestDevAgentRelease fields - Add comment clarifying VF_CHANNEL is documentation-only
The agent infers its channel from the version string prefix (dev- vs semver). Writing VF_CHANNEL to the env file served no purpose.
Previously, checksums fetched from GitHub were only held in memory. After a server restart within the 24h cache window, the version was loaded from the DB but checksums were empty, causing agent self-update checksum verification to fail. Store checksums as JSON in SystemSettings for both stable and dev channels, and load them on cache hits.
- Always update checkedAt when dev release fetch succeeds, even if dev-version.txt is missing, to prevent GitHub API rate limit exhaustion - Add concurrency group to agent-dev-binaries CI job to prevent race conditions on concurrent pushes to main - Guard stable tag construction against null version to avoid "vnull" URL
The paths filter on the push trigger prevented CI from running when the workflow file itself was modified in a merge commit (GitHub evaluates against the pre-merge workflow). Removing it ensures dev binary builds run on every push to main, matching the design intent. Also adds workflow_dispatch for manual triggering.
Greptile SummaryThis PR removes the Verified findings:
The fix is to add branch guards like Confidence Score: 3/5
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[CI Trigger] --> B{Event type?}
B -->|pull_request to main| C{paths filter match?}
B -->|push to main/tags| D[check job]
B -->|workflow_dispatch| D
C -->|yes| D
C -->|no| SKIP[CI skipped]
D[check: Lint & Type Check] --> E{Event + Ref?}
E -->|push or workflow_dispatch| F[server-image]
E -->|push or workflow_dispatch| G[agent-image]
E -->|push or workflow_dispatch + NOT v-tag| H[agent-dev-binaries]
E -->|v-tag ref| I[agent-binaries]
F --> J[Push Docker image to ghcr.io]
G --> K[Push agent image to ghcr.io]
H --> L["Delete + recreate 'dev' pre-release on GitHub"]
I --> M[Upload binary artifacts]
I --> N[release job]
M --> N
N --> O[Create versioned GitHub Release]
style SKIP fill:#f55,color:#fff
style L fill:#f90,color:#fff
style H fill:#f90,color:#fff
style F fill:#f90,color:#fff
style G fill:#f90,color:#fff
Last reviewed commit: 28184dd |
.github/workflows/ci.yml
Outdated
| agent-dev-binaries: | ||
| name: Agent Dev Binaries | ||
| needs: check | ||
| if: github.event_name == 'push' && !startsWith(github.ref, 'refs/tags/v') |
There was a problem hiding this comment.
workflow_dispatch only runs the check job
The agent-dev-binaries job (and both image jobs) guard on github.event_name == 'push', so a manual dispatch via gh workflow run ci.yml will only execute the lint/type-check step — no dev binaries are built or published.
If the intent of workflow_dispatch is just to manually re-run lint, this is fine but worth documenting. If the intent is to also be able to manually rebuild and publish dev binaries (e.g. after a transient CI failure or to test without a dummy push), the condition needs to include workflow_dispatch:
| if: github.event_name == 'push' && !startsWith(github.ref, 'refs/tags/v') | |
| if: (github.event_name == 'push' || github.event_name == 'workflow_dispatch') && !startsWith(github.ref, 'refs/tags/v') |
The same consideration applies to server-image and agent-image if those should also be triggerable manually.
Prompt To Fix With AI
This is a comment left during a code review.
Path: .github/workflows/ci.yml
Line: 145
Comment:
**`workflow_dispatch` only runs the `check` job**
The `agent-dev-binaries` job (and both image jobs) guard on `github.event_name == 'push'`, so a manual dispatch via `gh workflow run ci.yml` will only execute the lint/type-check step — no dev binaries are built or published.
If the intent of `workflow_dispatch` is just to manually re-run lint, this is fine but worth documenting. If the intent is to also be able to manually rebuild and publish dev binaries (e.g. after a transient CI failure or to test without a dummy push), the condition needs to include `workflow_dispatch`:
```suggestion
if: (github.event_name == 'push' || github.event_name == 'workflow_dispatch') && !startsWith(github.ref, 'refs/tags/v')
```
The same consideration applies to `server-image` and `agent-image` if those should also be triggerable manually.
How can I resolve this? If you propose a fix, please make it concise.| if (needsCheck) { | ||
| const release = await fetchDevRelease(); | ||
| if (release) { | ||
| checkedAt = new Date(); | ||
| const versionString = await fetchDevVersionString(release); | ||
| if (versionString) { | ||
| latestVersion = versionString; | ||
| checksums = await fetchChecksums(release); | ||
| } | ||
| await prisma.systemSettings.upsert({ | ||
| where: { id: "singleton" }, | ||
| update: { | ||
| latestDevAgentRelease: latestVersion, | ||
| latestDevAgentReleaseCheckedAt: checkedAt, | ||
| ...(versionString ? { latestDevAgentChecksums: JSON.stringify(checksums) } : {}), | ||
| }, | ||
| create: { | ||
| id: "singleton", | ||
| latestDevAgentRelease: latestVersion, | ||
| latestDevAgentReleaseCheckedAt: checkedAt, | ||
| ...(versionString ? { latestDevAgentChecksums: JSON.stringify(checksums) } : {}), | ||
| }, | ||
| }); | ||
| } |
There was a problem hiding this comment.
checkedAt stamped even when no version data was retrieved
When the dev GitHub release exists but dev-version.txt is missing (e.g. the asset upload step failed in a previous CI run), the function sets checkedAt = new Date() and writes it to the database on line 220, even though versionString is null and no valid version data was obtained.
This means the server will suppress further checks for 24 hours (CHECK_INTERVAL_MS) despite having stale/empty version data — effectively silencing the version-check for a full day any time the asset is missing.
checkedAt (and the corresponding DB timestamp) should only be updated when valid version data is actually retrieved:
if (versionString) {
latestVersion = versionString;
checksums = await fetchChecksums(release);
checkedAt = new Date(); // only stamp when we have real data
}
await prisma.systemSettings.upsert({
where: { id: "singleton" },
update: {
latestDevAgentRelease: latestVersion,
...(checkedAt ? { latestDevAgentReleaseCheckedAt: checkedAt } : {}),
...(versionString ? { latestDevAgentChecksums: JSON.stringify(checksums) } : {}),
},
create: {
id: "singleton",
latestDevAgentRelease: latestVersion,
...(checkedAt ? { latestDevAgentReleaseCheckedAt: checkedAt } : {}),
...(versionString ? { latestDevAgentChecksums: JSON.stringify(checksums) } : {}),
},
});Prompt To Fix With AI
This is a comment left during a code review.
Path: src/server/services/version-check.ts
Line: 217-240
Comment:
**`checkedAt` stamped even when no version data was retrieved**
When the `dev` GitHub release exists but `dev-version.txt` is missing (e.g. the asset upload step failed in a previous CI run), the function sets `checkedAt = new Date()` and writes it to the database on line 220, even though `versionString` is `null` and no valid version data was obtained.
This means the server will suppress further checks for 24 hours (`CHECK_INTERVAL_MS`) despite having stale/empty version data — effectively silencing the version-check for a full day any time the asset is missing.
`checkedAt` (and the corresponding DB timestamp) should only be updated when valid version data is actually retrieved:
```typescript
if (versionString) {
latestVersion = versionString;
checksums = await fetchChecksums(release);
checkedAt = new Date(); // only stamp when we have real data
}
await prisma.systemSettings.upsert({
where: { id: "singleton" },
update: {
latestDevAgentRelease: latestVersion,
...(checkedAt ? { latestDevAgentReleaseCheckedAt: checkedAt } : {}),
...(versionString ? { latestDevAgentChecksums: JSON.stringify(checksums) } : {}),
},
create: {
id: "singleton",
latestDevAgentRelease: latestVersion,
...(checkedAt ? { latestDevAgentReleaseCheckedAt: checkedAt } : {}),
...(versionString ? { latestDevAgentChecksums: JSON.stringify(checksums) } : {}),
},
});
```
How can I resolve this? If you propose a fix, please make it concise.| downloadUrl: `https://github.com/${AGENT_REPO}/releases/download/v${latestAgentVersion}/vf-agent-linux-amd64`, | ||
| checksum: `sha256:${agentChecksums["vf-agent-linux-amd64"] ?? ""}`, | ||
| targetVersion: latest.version!, | ||
| downloadUrl: `https://github.com/${AGENT_REPO}/releases/download/${latest.tag}/vf-agent-linux-amd64`, |
There was a problem hiding this comment.
Dev update button always targets amd64
The download URL is hardcoded to vf-agent-linux-amd64 for dev builds, the same as the pre-existing stable path. If the fleet ever includes arm64 agents on the dev channel, the update will push the wrong binary.
| downloadUrl: `https://github.com/${AGENT_REPO}/releases/download/${latest.tag}/vf-agent-linux-amd64`, | |
| downloadUrl: `https://github.com/${AGENT_REPO}/releases/download/${latest.tag}/vf-agent-linux-${node.arch ?? "amd64"}`, |
This was already present before this PR for the stable path, but worth fixing now that the dev path is being wired up. Consider adding an arch field to the node model or deriving it from the existing agentVersion string if the build embeds the architecture.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/(dashboard)/fleet/page.tsx
Line: 204
Comment:
**Dev update button always targets `amd64`**
The download URL is hardcoded to `vf-agent-linux-amd64` for dev builds, the same as the pre-existing stable path. If the fleet ever includes `arm64` agents on the dev channel, the update will push the wrong binary.
```suggestion
downloadUrl: `https://github.com/${AGENT_REPO}/releases/download/${latest.tag}/vf-agent-linux-${node.arch ?? "amd64"}`,
```
This was already present before this PR for the stable path, but worth fixing now that the dev path is being wired up. Consider adding an `arch` field to the node model or deriving it from the existing `agentVersion` string if the build embeds the architecture.
How can I resolve this? If you propose a fix, please make it concise.| // Dev-to-dev: different SHA means update available | ||
| if (currentIsDev && latestIsDev) { | ||
| return current !== latest; | ||
| } |
There was a problem hiding this comment.
Dev "update available" fires for nodes running a newer build than the published dev release
current !== latest treats any SHA difference as "node is behind", but a node could legitimately be running a build whose commit is newer than the rolling dev release (e.g. there's a gap between when a commit lands on main and when CI finishes publishing the new dev release). During that window the UI will show a spurious "Update available" badge and offer to "downgrade" the agent.
For rolling dev builds this may be an accepted trade-off, but it's worth a comment in the code to set expectations:
| // Dev-to-dev: different SHA means update available | |
| if (currentIsDev && latestIsDev) { | |
| return current !== latest; | |
| } | |
| // Dev-to-dev: any SHA difference shows "update available". | |
| // Note: this can produce a false positive for the brief window between | |
| // a push landing on main and the dev release being published by CI. | |
| if (currentIsDev && latestIsDev) { | |
| return current !== latest; | |
| } |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/version.ts
Line: 18-21
Comment:
**Dev "update available" fires for nodes running a newer build than the published dev release**
`current !== latest` treats any SHA difference as "node is behind", but a node could legitimately be running a build whose commit is *newer* than the rolling `dev` release (e.g. there's a gap between when a commit lands on `main` and when CI finishes publishing the new dev release). During that window the UI will show a spurious "Update available" badge and offer to "downgrade" the agent.
For rolling dev builds this may be an accepted trade-off, but it's worth a comment in the code to set expectations:
```suggestion
// Dev-to-dev: any SHA difference shows "update available".
// Note: this can produce a false positive for the brief window between
// a push landing on main and the dev release being published by CI.
if (currentIsDev && latestIsDev) {
return current !== latest;
}
```
How can I resolve this? If you propose a fix, please make it concise.The push-gated jobs (images, dev binaries) excluded workflow_dispatch events, so manual triggers only ran lint. Now all build jobs run on both push and workflow_dispatch.
| name: Server Image | ||
| needs: check | ||
| if: github.event_name == 'push' | ||
| if: github.event_name == 'push' || github.event_name == 'workflow_dispatch' |
There was a problem hiding this comment.
workflow_dispatch from any branch overwrites dev artifacts
The updated conditions at lines 55, 100, and 145 allow workflow_dispatch to publish Docker images and recreate the rolling dev GitHub pre-release from any branch — not just main. A developer can run:
gh workflow run ci.yml --ref feature/my-branchThis would:
- Push
dev-tagged Docker images from the feature branch to ghcr.io (server-imageat line 55,agent-imageat line 100) - Delete and recreate the rolling
devpre-release with binaries from the feature branch (agent-dev-binariesat line 145) - Cancel any in-progress
main-branch run via thecancel-in-progress: trueconcurrency setting onagent-dev-binaries
Fix: Restrict workflow_dispatch to main (and semver tags for the image jobs):
# server-image (line 55) and agent-image (line 100):
if: github.event_name == 'push' || (github.event_name == 'workflow_dispatch' && github.ref == 'refs/heads/main')
# agent-dev-binaries (line 145):
if: (github.event_name == 'push' || (github.event_name == 'workflow_dispatch' && github.ref == 'refs/heads/main')) && !startsWith(github.ref, 'refs/tags/v')This ensures manual dispatch can only republish from main.
Prompt To Fix With AI
This is a comment left during a code review.
Path: .github/workflows/ci.yml
Line: 55
Comment:
`workflow_dispatch` from any branch overwrites dev artifacts
The updated conditions at lines 55, 100, and 145 allow `workflow_dispatch` to publish Docker images and recreate the rolling `dev` GitHub pre-release from *any* branch — not just `main`. A developer can run:
```bash
gh workflow run ci.yml --ref feature/my-branch
```
This would:
1. Push `dev`-tagged Docker images from the feature branch to ghcr.io (`server-image` at line 55, `agent-image` at line 100)
2. Delete and recreate the rolling `dev` pre-release with binaries from the feature branch (`agent-dev-binaries` at line 145)
3. Cancel any in-progress `main`-branch run via the `cancel-in-progress: true` concurrency setting on `agent-dev-binaries`
**Fix**: Restrict `workflow_dispatch` to `main` (and semver tags for the image jobs):
```yaml
# server-image (line 55) and agent-image (line 100):
if: github.event_name == 'push' || (github.event_name == 'workflow_dispatch' && github.ref == 'refs/heads/main')
# agent-dev-binaries (line 145):
if: (github.event_name == 'push' || (github.event_name == 'workflow_dispatch' && github.ref == 'refs/heads/main')) && !startsWith(github.ref, 'refs/tags/v')
```
This ensures manual dispatch can only republish from `main`.
How can I resolve this? If you propose a fix, please make it concise.
Summary
pathsfilter frompushtrigger so dev binary builds run on every push to main (not just when specific files change)workflow_dispatchtrigger for manual CI runsContext
PR #11 merged but didn't trigger the new
agent-dev-binariesjob because GitHub evaluates workflow definitions from the pre-merge state — the job didn't exist in the workflow file onmainyet. Removing thepathsfilter also ensures docs-only or config-only commits still produce dev binaries, matching the design intent.Test plan
agent-dev-binariesjobgh workflow run ci.ymlworks for manual dispatch