[gardener] install-workflow template silently fails when tree repo flips from public to private #331

@serenakeyitan

Symptom

Every First-Tree Sync workflow run on this repo has been failing at the Clone tree repo step for ~24h / 100+ consecutive runs:

fatal: Authentication failed for 'https://github.com/agent-team-foundation/first-tree-context.git/'
remote: Invalid username or token. Password authentication is not supported for Git operations.

Root cause is not the repo's token config — it's a latent bug in the workflow template that gardener generates.

Root cause

The template in src/products/gardener/engine/install-workflow.ts (lines ~193–222) structures the Clone tree repo step as:

  1. Try an anonymous git clone first.
  2. Only if that fails, fall back to TREE_REPO_TOKEN.

When the tree repo is public (the common case at install time), the anonymous clone always succeeds, and TREE_REPO_TOKEN is never exercised in CI. The secret can be missing, stale, or wrong-scoped and nobody notices — the workflow stays green.

The moment the tree repo flips to private (which happened here: first-tree-context is now private), the anonymous clone starts failing and the fallback path is exercised for the first time. If the token was never a working PAT with contents:read on the tree repo, every run fails from that point on.

There's no signal during the public-to-private flip that anything is about to break. This is the "works until it silently doesn't" footgun.
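To make the failure mode concrete, here is a minimal model of the anonymous-first logic described above. It is a sketch, not the real template code: the names `anonymousFirstClone`, `Visibility`, and `TokenState` are hypothetical, and the function only simulates the outcome rather than running git.

```typescript
// Hypothetical model of the anonymous-first clone step; not the real template code.
type Visibility = "public" | "private";
type TokenState = "valid" | "missing" | "stale";

interface CloneOutcome {
  succeeded: boolean;
  tokenExercised: boolean;
}

function anonymousFirstClone(visibility: Visibility, token: TokenState): CloneOutcome {
  // Step 1: the anonymous clone works for any public repo, so the token is never read.
  if (visibility === "public") {
    return { succeeded: true, tokenExercised: false };
  }
  // Step 2: fallback path. Only here does the state of TREE_REPO_TOKEN matter.
  return { succeeded: token === "valid", tokenExercised: true };
}

// Compare each token state before and after the visibility flip.
const tokenStates: TokenState[] = ["valid", "missing", "stale"];
for (const token of tokenStates) {
  const before = anonymousFirstClone("public", token);
  const after = anonymousFirstClone("private", token);
  console.log(`${token}: public ok=${before.succeeded}, private ok=${after.succeeded}`);
}
```

In this model all three token states look identical while the repo is public, which is why a missing or stale secret goes unnoticed until the flip.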

Why this is a gardener concern, not a first-tree-repo concern

The workflow file header says:

Managed by first-tree gardener install-workflow. Regenerate with that command (re-run with --force) rather than hand-editing — the gardener skill may roll the template forward on upgrades.

So the template shape, its auth contract, and its failure mode are all gardener's. Every user who runs gardener install-workflow today inherits the same latent trap.

Proposed fix

Change the template in src/products/gardener/engine/install-workflow.ts so the Clone tree repo step:

  1. Always uses TREE_REPO_TOKEN (drop the anonymous-first attempt).
  2. Fails fast with a clear error if TREE_REPO_TOKEN is empty, naming the secret and the gh secret set command the user needs to run.

This mirrors how every other gh/git CI step in this repo authenticates, surfaces the failure at install time instead of at first-private-clone time, and makes the behavior identical for public and private tree repos.
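One possible shape for the regenerated step, as a sketch only. The helper name `renderCloneTreeRepoStep`, its parameters, the `--depth 1` flag, and the exact error wording are assumptions for illustration; the real template in install-workflow.ts may be structured differently. The emitted step reads the secret through `env`, fails fast when it is empty, and has no anonymous attempt.

```typescript
// Hypothetical helper; the real template in install-workflow.ts may be shaped differently.
function renderCloneTreeRepoStep(treeRepo: string, checkoutDir: string): string {
  // Reject anything that is not a plain "owner/name" slug before embedding it in shell.
  if (!/^[A-Za-z0-9_.-]+\/[A-Za-z0-9_.-]+$/.test(treeRepo)) {
    throw new Error(`Invalid tree repo slug: ${treeRepo}`);
  }
  return [
    "- name: Clone tree repo",
    "  env:",
    "    TREE_REPO_TOKEN: ${{ secrets.TREE_REPO_TOKEN }}",
    "  run: |",
    // Fail fast: name the secret and the command that sets it.
    '    if [ -z "$TREE_REPO_TOKEN" ]; then',
    '      echo "::error::TREE_REPO_TOKEN is empty. Set it with: gh secret set TREE_REPO_TOKEN" >&2',
    "      exit 1",
    "    fi",
    // Always authenticate, so public and private tree repos behave the same.
    `    git clone --depth 1 "https://x-access-token:\${TREE_REPO_TOKEN}@github.com/${treeRepo}.git" "${checkoutDir}"`,
    "",
  ].join("\n");
}

// Example with a placeholder repo slug and checkout directory.
console.log(renderCloneTreeRepoStep("example-org/example-tree", "tree"));
```

Because the token is used on every run, a missing or wrong-scoped secret shows up on the first run after install rather than on the first run after the repo goes private.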

Also worth considering as part of the same change:

  • Have gardener install-workflow print a post-install check: read the tree repo's visibility via gh repo view and tell the user whether TREE_REPO_TOKEN is required now and in the future.
  • Add a one-line note to the onboarding narrative's workflow-mode section so users who flip a tree repo private know to verify the secret.
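The post-install check could be sketched roughly as follows. This assumes the installer runs `gh repo view <owner/name> --json visibility` and passes the printed JSON to a pure function; `tokenAdvice` and `TokenAdvice` are hypothetical names, the messages are placeholders, and the sketch assumes the step still clones anonymously for public repos (under the always-token fix above, the secret would be required in every case).

```typescript
// Hypothetical helper: interprets the JSON printed by
//   gh repo view <owner/name> --json visibility
// Assumed input shape: {"visibility":"PUBLIC"} (or PRIVATE / INTERNAL).
interface TokenAdvice {
  requiredNow: boolean;
  message: string;
}

function tokenAdvice(ghJson: string): TokenAdvice {
  const parsed = JSON.parse(ghJson) as { visibility?: string };
  // Normalize case so the check does not depend on how gh capitalizes the value.
  const visibility = (parsed.visibility ?? "").toUpperCase();

  if (visibility === "PUBLIC") {
    return {
      requiredNow: false,
      message:
        "Tree repo is public. TREE_REPO_TOKEN is not needed to read it today, " +
        "but it becomes required if the repo is made private. Verify the secret now.",
    };
  }
  if (visibility === "PRIVATE" || visibility === "INTERNAL") {
    return {
      requiredNow: true,
      message:
        "Tree repo is not public. TREE_REPO_TOKEN must be a working PAT with " +
        "contents:read on the tree repo.",
    };
  }
  // Unknown or missing value: stay on the safe side and ask the user to verify.
  return {
    requiredNow: true,
    message: "Could not determine tree repo visibility. Verify TREE_REPO_TOKEN manually.",
  };
}

console.log(tokenAdvice('{"visibility":"PUBLIC"}').message);
```

Keeping the parsing separate from the `gh` invocation makes the check easy to unit test without network access.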

Scope

Template change only. No runtime CLI behavior changes. Workflow regenerates cleanly with first-tree gardener install-workflow --force on any consuming repo.

Repro

  1. Install the workflow against any public tree repo — CI passes.
  2. Flip the tree repo to private.
  3. Every subsequent First-Tree Sync run fails at Clone tree repo unless TREE_REPO_TOKEN happens to be a working PAT with contents:read.

Reproduced right now on agent-team-foundation/first-tree (consuming repo) with agent-team-foundation/first-tree-context (tree repo).

Metadata

Labels: bug