
0.2.1: fix evo get hiding inherited gates + version-drift checks #10

Merged: alokwhitewolf merged 9 commits into main from fix/evo-get-inherited-gates on Apr 15, 2026

@alokwhitewolf alokwhitewolf commented Apr 15, 2026

Fixes #2, where evo get <id> returned the node's raw gates array from graph.json (own gates only), while evo run and evo gate list walk ancestry. Inspection output and actual enforcement disagreed: an agent could read evo get, see "gates": [], and conclude nothing was enforced while gates inherited from root were firing.

cmd_get now resolves gates via collect_gates_from_path, same as cmd_run. gates is the effective set; own_gates holds the raw per-node array.
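The difference between the effective set and the raw per-node array can be sketched as follows. This is a minimal illustration, not the project's code: the graph shape (a `parent` pointer and a `gates` list of dicts with a `name` key), the helper names, and the rule that the gate nearest to the node wins on a name collision are all assumptions made for the example.

```python
from typing import Any

# Hypothetical graph shape: node id -> {"parent": id or None, "gates": [...]}.
Graph = dict[str, dict[str, Any]]

GRAPH: Graph = {
    "root": {"parent": None, "gates": [{"name": "lint", "command": "make lint"}]},
    "exp_a": {"parent": "root", "gates": []},
    "exp_b": {"parent": "root", "gates": [{"name": "perf", "command": "make perf"}]},
}


def path_to_root(graph: Graph, node_id: str) -> list[str]:
    """Return node ids ordered from the root down to node_id."""
    path = []
    current = node_id
    while current is not None:
        path.append(current)
        current = graph[current].get("parent")
    return list(reversed(path))


def collect_effective_gates(graph: Graph, node_id: str) -> list[dict[str, Any]]:
    """Merge gates along the ancestry path, deduplicating by gate name.

    Assumption: on a name collision, the gate closest to node_id wins.
    """
    by_name: dict[str, dict[str, Any]] = {}
    for ancestor in path_to_root(graph, node_id):
        for gate in graph[ancestor].get("gates", []):
            by_name[gate["name"]] = gate  # closer nodes overwrite ancestors
    return list(by_name.values())


def describe_node(graph: Graph, node_id: str) -> dict[str, Any]:
    """Build inspection output with both the effective and the raw gate lists."""
    return {
        "id": node_id,
        "gates": collect_effective_gates(graph, node_id),
        "own_gates": list(graph[node_id].get("gates", [])),
    }


if __name__ == "__main__":
    print(describe_node(GRAPH, "exp_a"))
```

In this sketch `exp_a` has no gates of its own, so `own_gates` is empty while `gates` still contains the gate inherited from `root`; the gate attached to the sibling `exp_b` does not leak into `exp_a`.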

Tests

tests/ previously contained only e2e.py, a hand-rolled script spawning subprocesses per test. Added:

  • tests/unit/test_core.py -- unit tests for collect_gates_from_path (inheritance, dedup on name collision, sibling isolation).
  • sdk/python/test/test_run.py -- Python SDK tests mirroring sdk/node/test/run.test.js. The Python SDK previously had no tests.

Lockstep versioning

All three published packages (evo-hq-cli, evo-hq-agent, @evo-hq/evo-agent) now share one version. Tag v<N> publishes all three; independent versions create foot-guns. SDK bumps without code changes are an acceptable cost for the simpler mental model.

Seven version strings asserted equal on every PR by scripts/check_versions.py:

  • plugins/evo/pyproject.toml (evo-hq-cli)
  • plugins/evo/src/evo/__init__.py (__version__, what evo --version prints)
  • plugins/evo/.claude-plugin/plugin.json
  • plugins/evo/.codex-plugin/plugin.json
  • sdk/python/pyproject.toml (evo-hq-agent)
  • sdk/python/src/evo_agent/__init__.py
  • sdk/node/package.json (@evo-hq/evo-agent)

All bump to 0.2.1.
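A lockstep check of this kind could look roughly like the sketch below. It is not the contents of scripts/check_versions.py: the function names and message wording are hypothetical, and the parsers only cover the `__version__` and JSON manifest cases (reading the pyproject.toml files is left out). The message uses `len(versions)` rather than a hardcoded count so it stays accurate as sources are added.

```python
import json
import re

# Matches a line such as: __version__ = "0.2.1"
DUNDER_VERSION_RE = re.compile(
    r'^__version__\s*=\s*["\']([^"\']+)["\']', re.MULTILINE
)


def parse_dunder_version(source: str) -> str:
    """Extract __version__ from the text of an __init__.py file."""
    match = DUNDER_VERSION_RE.search(source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def parse_json_version(source: str) -> str:
    """Extract the top-level "version" field from a JSON manifest."""
    return json.loads(source)["version"]


def check_versions(versions: dict[str, str]) -> tuple[bool, str]:
    """Return (ok, message) for a mapping of source path -> version string."""
    distinct = set(versions.values())
    if len(distinct) == 1:
        return True, f"all {len(versions)} sources match: {distinct.pop()}"
    lines = [f"version drift; bump all {len(versions)} sources together:"]
    lines += [f"  {source}: {version}" for source, version in versions.items()]
    return False, "\n".join(lines)


if __name__ == "__main__":
    found = {
        "plugins/evo/src/evo/__init__.py": parse_dunder_version('__version__ = "0.2.0"\n'),
        "plugins/evo/.claude-plugin/plugin.json": parse_json_version('{"version": "0.2.1"}'),
    }
    ok, message = check_versions(found)
    print(message)
    raise SystemExit(0 if ok else 1)
```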

Runtime check for host/CLI drift

bin/evo-version-check compares the installed CLI version against the plugin manifest. The discover skill's step 0 now runs it instead of a bare evo --version. This catches hosts that have refetched a new plugin while the globally installed CLI is still old.
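The comparison itself can be illustrated in Python. The real check is a shell script, so this is only a re-expression of the idea: the function name and messages are hypothetical, the regular expression mirrors the `evo-hq-cli <X.Y.Z>` pattern the script greps for, and the exit codes follow the ones mentioned in this PR (1 for a mismatch, 2 when no version can be parsed).

```python
import re

# Same shape as the script's grep pattern: "evo-hq-cli" followed by X.Y.Z.
CLI_VERSION_RE = re.compile(r"evo-hq-cli\s+(\d+\.\d+\.\d+)")


def compare_versions(cli_output: str, plugin_version: str) -> tuple[int, str]:
    """Return (exit_code, message) for the CLI/plugin drift check.

    0: versions agree, 1: versions differ, 2: CLI output not recognised.
    """
    match = CLI_VERSION_RE.search(cli_output)
    if match is None:
        # Covers an unrelated `evo` binary whose output lacks "evo-hq-cli".
        return 2, f"could not find an evo-hq-cli version in: {cli_output!r}"
    cli_version = match.group(1)
    if cli_version != plugin_version:
        return 1, (
            f"installed CLI is {cli_version}, "
            f"plugin manifest expects {plugin_version}"
        )
    return 0, f"CLI and plugin manifest both at {cli_version}"


if __name__ == "__main__":
    for output in ("evo-hq-cli 0.2.1", "evo-hq-cli 0.2.0", "evo 1.x"):
        print(compare_versions(output, "0.2.1"))
```

Returning a distinct code for unparseable output keeps "wrong binary" separate from "old CLI", so a caller can show a different remedy for each.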

CI

New .github/workflows/ci.yml runs scripts/check_versions.py on every PR and push.

CLI publish automation

publish.yml previously only handled the two SDKs; evo-hq-cli was published manually for 0.2.0. New publish-cli job mirrors publish-python against plugins/evo/: runs check_versions.py, builds, asserts tag matches pyproject version, twine check, twine upload. Reuses PYPI_API_TOKEN.

Tag conventions:

  • v* -> all three (both SDKs + CLI)
  • cli-v* -> CLI only (new)
  • py-v* / node-v* -> unchanged; emergency-republish paths under lockstep

Build tested locally with uv build + twine check on plugins/evo/ -- both sdist and wheel pass metadata validation.
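The tag conventions and the tag-versus-pyproject assertion above can be summarised in a small sketch. The actual logic lives in the publish.yml workflow; the table and function names here are hypothetical and only model the mapping described in this PR.

```python
# Tag prefix -> packages published, per the conventions listed above.
TAG_PREFIXES: list[tuple[str, tuple[str, ...]]] = [
    ("cli-v", ("evo-hq-cli",)),
    ("py-v", ("evo-hq-agent",)),
    ("node-v", ("@evo-hq/evo-agent",)),
    ("v", ("evo-hq-cli", "evo-hq-agent", "@evo-hq/evo-agent")),
]


def resolve_tag(tag: str) -> tuple[tuple[str, ...], str]:
    """Return (packages to publish, version) for a release tag."""
    for prefix, packages in TAG_PREFIXES:
        if tag.startswith(prefix):
            return packages, tag[len(prefix):]
    raise ValueError(f"unrecognised release tag: {tag!r}")


def assert_tag_matches(tag: str, declared_version: str) -> None:
    """Fail if the tag's version differs from the package's declared version."""
    _, tag_version = resolve_tag(tag)
    if tag_version != declared_version:
        raise SystemExit(
            f"tag {tag} implies {tag_version}, "
            f"but the package declares {declared_version}"
        )


if __name__ == "__main__":
    print(resolve_tag("v0.2.1"))
    print(resolve_tag("cli-v0.2.1"))
    assert_tag_matches("cli-v0.2.1", "0.2.1")
```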

Tested

  • python3 tests/e2e.py -- 5 flows green. test_gate_flow now also asserts evo get output.
  • python3 tests/unit/test_core.py -- 7/7.
  • python3 sdk/python/test/test_run.py -- 6/6.
  • scripts/check_versions.py -- all 7 sources match.
  • bin/evo-version-check against a real mismatch -- exits 1 with the right uv tool install --force command.
  • uv build && twine check dist/* on plugins/evo/ -- both artifacts pass.

cmd_get was printing the node's raw gates array from graph.json, which
only contains gates directly attached to that node. cmd_gate list and
cmd_run both walk ancestry via collect_gates_from_path, so gate
inheritance was correct at execution time -- only the inspection output
disagreed.

Now cmd_get resolves gates the same way: gates holds the effective
(ancestor-walked) set, own_gates holds the raw per-node array for
anyone who needs it.

Adds unit tests covering collect_gates_from_path directly (inheritance,
dedup on name collision, sibling isolation, root-only case) and extends
test_gate_flow in e2e.py to assert the new evo get output shape.

Fixes #2

The Node SDK has test/run.test.js but the Python SDK had no tests at
all. Mirror the same cases (trace file emission, score aggregation,
Gate.check score/passed semantics) plus a few Python-only additions
(explicit finish score, finish idempotence, ValueError on Gate.check
with neither score nor passed).

Stdlib only, zero new deps. Run with python3 sdk/python/test/test_run.py.

Patch release for the evo get gates-inheritance fix (#2). SDK packages
stay at 0.2.0 -- no SDK code changed.

Claude Code and Codex refetch the plugin (and its bundled CLI
reference) based on the version in .claude-plugin/plugin.json and
.codex-plugin/plugin.json. Bumping the CLI alone leaves installed
hosts stuck on 0.2.0.

…uard

Two mechanisms, belt and suspenders:

1. scripts/check_versions.py asserts all four version strings
   (pyproject.toml, src/evo/__init__.py, both plugin.json manifests)
   agree. Runs in a new CI workflow on every PR and push.

2. plugins/evo/bin/evo-version-check compares the plugin manifest
   version against the installed CLI's `--version` output. Hosts
   refetch the plugin on version bumps but do not reinstall the
   globally-installed CLI, so drift is a real failure mode. The
   discover skill now runs this check at step 0 instead of a bare
   `evo --version`.

Also bumps src/evo/__init__.py to 0.2.1 -- it was missed in the
release commit, and was the fourth version string that the check
now covers (and immediately caught).

devin-ai-integration[bot]

This comment was marked as resolved.

@alokwhitewolf alokwhitewolf changed the title from "0.2.1: fix evo get gates inheritance, catch version drift" to "0.2.1: fix evo get hiding inherited gates + version-drift checks" on Apr 15, 2026

The failure-path message hardcoded "Bump all three together" while
the list already has four entries. Use len(versions) for both the
success and failure prints.

Caught by Devin review on #10.

Previously publish.yml only handled the two SDKs (evo-hq-agent,
@evo-hq/evo-agent). The CLI was published manually for 0.2.0, leaving
no tag trace and no guarantee that the PyPI artifact matches the
committed source.

New job mirrors publish-python against plugins/evo/: runs
check_versions.py first, builds with `python -m build`, asserts tag
matches pyproject version, twine check, twine upload. Reuses
PYPI_API_TOKEN.

Tag conventions:
- v*      -> publishes both SDKs AND the CLI
- cli-v*  -> publishes only the CLI
- py-v*   -> unchanged (Python SDK only)
- node-v* -> unchanged (Node SDK only)

Build tested locally with `uv build` + `twine check` on plugins/evo:
both sdist and wheel pass metadata validation.

Convention: all three published packages (evo-hq-cli, evo-hq-agent,
@evo-hq/evo-agent) share one version. Tag v<N> publishes all three;
independent versions create foot-guns. SDK bumps without code changes
are an acceptable cost for the simpler mental model.

Extends scripts/check_versions.py to cover three more sources:
- sdk/python/pyproject.toml
- sdk/python/src/evo_agent/__init__.py
- sdk/node/package.json

Seven version strings total, all asserted equal on every PR.

@alokwhitewolf alokwhitewolf merged commit b04b823 into main Apr 15, 2026
1 of 2 checks passed
alokwhitewolf added a commit that referenced this pull request Apr 15, 2026
The failure-path message hardcoded "Bump all three together" while
the list already has four entries. Use len(versions) for both the
success and failure prints.

Caught by Devin review on #10.
@alokwhitewolf alokwhitewolf deleted the fix/evo-get-inherited-gates branch April 15, 2026 17:27

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.


Comment on lines +44 to +49
CLI_VERSION=$(
  printf '%s\n' "$CLI_OUTPUT" \
    | grep -oE 'evo-hq-cli[[:space:]]+[0-9]+\.[0-9]+\.[0-9]+' \
    | awk '{print $2}' \
    | head -1
)

🔴 set -euo pipefail causes premature exit in evo-version-check, bypassing diagnostic error handling

The CLI_VERSION (lines 44-49) and PLUGIN_VERSION (lines 28-32) command substitutions use grep in pipelines. When grep finds no match (exit 1), set -o pipefail propagates the non-zero exit code, and set -e causes the script to exit immediately, before reaching the [ -z "$CLI_VERSION" ] check (lines 50-53) that would print a diagnostic and exit 2.

This is most impactful for CLI_VERSION: when the user has the wrong evo binary (e.g., the unrelated SLAM package from PyPI), evo --version outputs something like evo 1.x, the grep -oE 'evo-hq-cli...' finds no match, and the script silently exits with code 1 — no stderr message at all. The SKILL.md (plugins/evo/skills/discover/SKILL.md:32) interprets exit 1 as "version mismatch" and instructs the agent to show stderr verbatim (which is empty), producing a confusing user experience.

Verified with bash 5.x

bash -c 'set -euo pipefail; X=$(echo "hello" | grep -o "world" | head -1); echo reached' exits with code 1 without printing "reached". Adding || true at the end of the pipeline fixes it.

Suggested change
- CLI_VERSION=$(
-   printf '%s\n' "$CLI_OUTPUT" \
-     | grep -oE 'evo-hq-cli[[:space:]]+[0-9]+\.[0-9]+\.[0-9]+' \
-     | awk '{print $2}' \
-     | head -1
- )
+ CLI_VERSION=$(
+   printf '%s\n' "$CLI_OUTPUT" \
+     | grep -oE 'evo-hq-cli[[:space:]]+[0-9]+\.[0-9]+\.[0-9]+' \
+     | awk '{print $2}' \
+     | head -1 \
+     || true
+ )
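The behaviour described in this finding can be reproduced from Python by running both variants through bash. This is a small reproduction harness rather than part of the fix; it assumes a `bash` executable is on PATH, and the `run_bash` helper is hypothetical.

```python
import subprocess

STRICT = "set -euo pipefail; "


def run_bash(script: str) -> subprocess.CompletedProcess:
    """Run a script with bash -c and capture its output as text."""
    return subprocess.run(["bash", "-c", script], capture_output=True, text=True)


# grep finds no match, pipefail makes the pipeline fail, and set -e aborts
# the script at the assignment, so the echo never runs.
without_guard = run_bash(
    STRICT + 'X=$(echo "hello" | grep -o "world" | head -1); echo reached'
)

# `|| true` forces the command substitution to succeed with an empty result,
# so execution continues to the line that would print a diagnostic.
with_guard = run_bash(
    STRICT + 'X=$(echo "hello" | grep -o "world" | head -1 || true); echo "reached: [$X]"'
)

if __name__ == "__main__":
    print("without guard:", without_guard.returncode, repr(without_guard.stdout))
    print("with guard:   ", with_guard.returncode, repr(with_guard.stdout))
```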

Development

Successfully merging this pull request may close these issues.

evo get <id>: gates field shows [] even when experiment inherits gates