0.2.1: fix evo get hiding inherited gates + version-drift checks #10
alokwhitewolf merged 9 commits into main
cmd_get was printing the node's raw gates array from graph.json, which only contains gates directly attached to that node. cmd_gate list and cmd_run both walk ancestry via collect_gates_from_path, so gate inheritance was correct at execution time -- only the inspection output disagreed.

Now cmd_get resolves gates the same way: gates holds the effective (ancestor-walked) set, own_gates holds the raw per-node array for anyone who needs it.

Adds unit tests covering collect_gates_from_path directly (inheritance, dedup on name collision, sibling isolation, root-only case) and extends test_gate_flow in e2e.py to assert the new evo get output shape.

Fixes #2
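To illustrate the difference between the effective and raw gate sets, here is a minimal sketch of an ancestor walk. It is not the project's collect_gates_from_path: the graph shape (a dict of nodes with `parent` and `gates` keys), the helper names, and the rule that the gate closest to the node wins on a name collision are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical graph shape: node id -> {"parent": id or None, "gates": [...]}.
Graph = Dict[str, dict]


def path_to_root(graph: Graph, node_id: str) -> List[str]:
    """Return node ids ordered from the root down to node_id."""
    path = []
    current = node_id
    while current is not None:
        path.append(current)
        current = graph[current].get("parent")
    return list(reversed(path))


def collect_effective_gates(graph: Graph, node_id: str) -> List[dict]:
    """Collect gates along the ancestry path, deduplicated by name.

    Assumption: on a name collision the gate nearest to the node replaces
    the inherited one. The real dedup rule may differ.
    """
    effective: Dict[str, dict] = {}
    for nid in path_to_root(graph, node_id):
        for gate in graph[nid].get("gates", []):
            effective[gate["name"]] = gate
    return list(effective.values())


def inspect_node(graph: Graph, node_id: str) -> dict:
    """Mimic the new output shape: effective gates plus the raw own_gates."""
    return {
        "id": node_id,
        "gates": collect_effective_gates(graph, node_id),
        "own_gates": list(graph[node_id].get("gates", [])),
    }
```

With a root that carries one gate and a child that carries none, `inspect_node` reports the inherited gate under `gates` and an empty `own_gates`, which is the case the old output got wrong.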
The Node SDK has test/run.test.js but the Python SDK had no tests at all. Mirror the same cases (trace file emission, score aggregation, Gate.check score/passed semantics) plus a few Python-only additions (explicit finish score, finish idempotence, ValueError on Gate.check with neither score nor passed). Stdlib only, zero new deps. Run with python3 sdk/python/test/test_run.py.
Patch release for the evo get gates-inheritance fix (#2). SDK packages stay at 0.2.0 -- no SDK code changed.
Claude Code and Codex refetch the plugin (and its bundled CLI reference) based on the version in .claude-plugin/plugin.json and .codex-plugin/plugin.json. Bumping the CLI alone leaves installed hosts stuck on 0.2.0.
…uard

Two mechanisms, belt and suspenders:

1. scripts/check_versions.py asserts all four version strings (pyproject.toml, src/evo/__init__.py, both plugin.json manifests) agree. Runs in a new CI workflow on every PR and push.
2. plugins/evo/bin/evo-version-check compares the plugin manifest version against the installed CLI's `--version` output. Hosts refetch the plugin on version bumps but do not reinstall the globally-installed CLI, so drift is a real failure mode.

The discover skill now runs this check at step 0 instead of a bare `evo --version`.

Also bumps src/evo/__init__.py to 0.2.1 -- it was missed in the release commit, and was the fourth version string that the check now covers (and immediately caught).
The failure-path message hardcoded "Bump all three together" while the list already has four entries. Use len(versions) for both the success and failure prints. Caught by Devin review on #10.
Previously publish.yml only handled the two SDKs (evo-hq-agent, @evo-hq/evo-agent). The CLI was published manually for 0.2.0, leaving no tag trace and no guarantee that the PyPI artifact matches the committed source.

New job mirrors publish-python against plugins/evo/: runs check_versions.py first, builds with `python -m build`, asserts tag matches pyproject version, twine check, twine upload. Reuses PYPI_API_TOKEN.

Tag conventions:
- v* -> publishes both SDKs AND the CLI
- cli-v* -> publishes only the CLI
- py-v* -> unchanged (Python SDK only)
- node-v* -> unchanged (Node SDK only)

Build tested locally with `uv build` + `twine check` on plugins/evo: both sdist and wheel pass metadata validation.
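The tag conventions above amount to a small routing table. The sketch below models them in Python for illustration only; the actual routing lives in publish.yml, and the target labels (`cli`, `python-sdk`, `node-sdk`) and function names here are hypothetical.

```python
import re
from fnmatch import fnmatchcase
from typing import List, Set, Tuple

# Hypothetical labels for the three publish jobs.
TAG_RULES: List[Tuple[str, Set[str]]] = [
    ("cli-v*", {"cli"}),
    ("py-v*", {"python-sdk"}),
    ("node-v*", {"node-sdk"}),
    ("v*", {"cli", "python-sdk", "node-sdk"}),
]


def targets_for_tag(tag: str) -> Set[str]:
    """Return the set of packages a tag would publish, or an empty set."""
    for pattern, targets in TAG_RULES:
        if fnmatchcase(tag, pattern):
            return set(targets)
    return set()


def tag_matches_version(tag: str, version: str) -> bool:
    """Check that the version embedded in the tag equals the package version."""
    match = re.fullmatch(r"(?:cli-|py-|node-)?v(.+)", tag)
    return match is not None and match.group(1) == version
```

For example, `targets_for_tag("cli-v0.2.1")` yields only the CLI, while `tag_matches_version("v0.2.0", "0.2.1")` is false, which is the condition the publish job treats as an error before uploading.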
Convention: all three published packages (evo-hq-cli, evo-hq-agent, @evo-hq/evo-agent) share one version. Tag v<N> publishes all three; independent versions create foot-guns. SDK bumps without code changes are an acceptable cost for the simpler mental model.

Extends scripts/check_versions.py to cover three more sources:
- sdk/python/pyproject.toml
- sdk/python/src/evo_agent/__init__.py
- sdk/node/package.json

Seven version strings total, all asserted equal on every PR.
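A lockstep check of this kind can be sketched as follows. This is not the contents of scripts/check_versions.py; the extraction helpers, the regexes, and the message wording are assumptions. It does show the len(versions) idea, so the count in the message cannot drift from the list of sources.

```python
import json
import re
from typing import Dict, Tuple


def version_from_pyproject(text: str) -> str:
    """Naive extraction of a top-level `version = "..."` line (simplification)."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, re.MULTILINE)
    if match is None:
        raise ValueError("no version field found")
    return match.group(1)


def version_from_init(text: str) -> str:
    """Extract `__version__ = "..."` from a module's source text."""
    match = re.search(r'^__version__\s*=\s*["\']([^"\']+)["\']', text, re.MULTILINE)
    if match is None:
        raise ValueError("no __version__ found")
    return match.group(1)


def version_from_json(text: str) -> str:
    """Read the `version` key from package.json or a plugin.json manifest."""
    return json.loads(text)["version"]


def check(versions: Dict[str, str]) -> Tuple[int, str]:
    """Return (exit_code, message); the count comes from len(versions)."""
    distinct = set(versions.values())
    if len(distinct) == 1:
        return 0, f"OK: all {len(versions)} version strings agree on {distinct.pop()}"
    listing = "\n".join(f"  {source}: {value}" for source, value in versions.items())
    return 1, f"Version drift detected. Bump all {len(versions)} together:\n{listing}"
```

In a real script each helper would be fed the text of one of the seven files and the exit code passed to `sys.exit`, so CI fails on any mismatch.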
CLI_VERSION=$(
  printf '%s\n' "$CLI_OUTPUT" \
    | grep -oE 'evo-hq-cli[[:space:]]+[0-9]+\.[0-9]+\.[0-9]+' \
    | awk '{print $2}' \
    | head -1
)
🔴 set -euo pipefail causes premature exit in evo-version-check, bypassing diagnostic error handling
The CLI_VERSION (line 44-49) and PLUGIN_VERSION (line 28-32) command substitutions use grep in pipelines. When grep finds no match (exit 1), set -o pipefail propagates the non-zero exit code, and set -e causes the script to exit immediately — before reaching the [ -z "$CLI_VERSION" ] check (line 50-53) that would print a diagnostic and exit 2.
This is most impactful for CLI_VERSION: when the user has the wrong evo binary (e.g., the unrelated SLAM package from PyPI), evo --version outputs something like evo 1.x, the grep -oE 'evo-hq-cli...' finds no match, and the script silently exits with code 1 — no stderr message at all. The SKILL.md (plugins/evo/skills/discover/SKILL.md:32) interprets exit 1 as "version mismatch" and instructs the agent to show stderr verbatim (which is empty), producing a confusing user experience.
Verified with bash 5.x
bash -c 'set -euo pipefail; X=$(echo "hello" | grep -o "world" | head -1); echo reached' exits with code 1 without printing "reached". Adding || true at the end of the pipeline fixes it.
- CLI_VERSION=$(
-   printf '%s\n' "$CLI_OUTPUT" \
-     | grep -oE 'evo-hq-cli[[:space:]]+[0-9]+\.[0-9]+\.[0-9]+' \
-     | awk '{print $2}' \
-     | head -1
- )
+ CLI_VERSION=$(
+   printf '%s\n' "$CLI_OUTPUT" \
+     | grep -oE 'evo-hq-cli[[:space:]]+[0-9]+\.[0-9]+\.[0-9]+' \
+     | awk '{print $2}' \
+     | head -1 \
+     || true
+ )
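The control flow the script is meant to have (parse, then diagnose an empty result, then compare) can be sketched in Python, where a failed match returns `None` instead of aborting. This is an illustration rather than a port of evo-version-check: the function names and message texts are hypothetical, exit code 0 on a match is assumed, and only the regex and the exit codes 1 and 2 come from the discussion above.

```python
import re
from typing import Optional, Tuple

# Same pattern as the grep in the script, with a capture group for the version.
CLI_VERSION_RE = re.compile(r"evo-hq-cli\s+(\d+\.\d+\.\d+)")


def parse_cli_version(cli_output: str) -> Optional[str]:
    """Return the version from `--version` output, or None if it is not ours."""
    match = CLI_VERSION_RE.search(cli_output)
    return match.group(1) if match else None


def compare(plugin_version: str, cli_output: str) -> Tuple[int, str]:
    """Return (exit_code, stderr_message) for a plugin/CLI version comparison."""
    cli_version = parse_cli_version(cli_output)
    if cli_version is None:
        # Exit 2: a diagnostic is always produced, never a silent failure.
        return 2, f"could not parse an evo-hq-cli version from: {cli_output!r}"
    if cli_version != plugin_version:
        # Exit 1: version mismatch between plugin manifest and installed CLI.
        return 1, f"plugin is {plugin_version} but installed CLI is {cli_version}"
    return 0, ""
```

The point of the review comment is the middle branch: output from an unrelated `evo` binary should reach the exit-2 diagnostic, not end the script early with an empty stderr.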
Fixes #2, where evo get <id> returned the node's raw gates array from graph.json (own gates only), while evo run and evo gate list walk ancestry. Inspection output and actual enforcement disagreed: an agent could read evo get, see "gates": [], and conclude nothing was enforced while gates inherited from root were firing.

cmd_get now resolves gates via collect_gates_from_path, same as cmd_run. gates is the effective set; own_gates holds the raw per-node array.

Tests

tests/ previously contained only e2e.py, a hand-rolled script spawning subprocesses per test. Added:
- tests/unit/test_core.py -- unit tests for collect_gates_from_path (inheritance, dedup on name collision, sibling isolation).
- sdk/python/test/test_run.py -- Python SDK tests mirroring sdk/node/test/run.test.js. The Python SDK previously had no tests.

Lockstep versioning
All three published packages (evo-hq-cli, evo-hq-agent, @evo-hq/evo-agent) now share one version. Tag v<N> publishes all three; independent versions create foot-guns. SDK bumps without code changes are an acceptable cost for the simpler mental model.

Seven version strings asserted equal on every PR by scripts/check_versions.py:
- plugins/evo/pyproject.toml (evo-hq-cli)
- plugins/evo/src/evo/__init__.py (__version__, what evo --version prints)
- plugins/evo/.claude-plugin/plugin.json
- plugins/evo/.codex-plugin/plugin.json
- sdk/python/pyproject.toml (evo-hq-agent)
- sdk/python/src/evo_agent/__init__.py
- sdk/node/package.json (@evo-hq/evo-agent)

All bump to 0.2.1.
Runtime check for host/CLI drift

bin/evo-version-check compares installed CLI version against the plugin manifest. Discover skill's step 0 runs it instead of a bare evo --version. Catches hosts that have refetched a new plugin while the globally-installed CLI is old.

CI

New .github/workflows/ci.yml runs scripts/check_versions.py on every PR and push.

CLI publish automation
publish.yml previously only handled the two SDKs; evo-hq-cli was published manually for 0.2.0. New publish-cli job mirrors publish-python against plugins/evo/: runs check_versions.py, builds, asserts tag matches pyproject version, twine check, twine upload. Reuses PYPI_API_TOKEN.

Tag conventions:
- v* -> all three (both SDKs + CLI)
- cli-v* -> CLI only (new)
- py-v* / node-v* -> unchanged; emergency-republish paths under lockstep

Build tested locally with uv build + twine check on plugins/evo/ -- both sdist and wheel pass metadata validation.

Tested

- python3 tests/e2e.py -- 5 flows green. test_gate_flow now also asserts evo get output.
- python3 tests/unit/test_core.py -- 7/7.
- python3 sdk/python/test/test_run.py -- 6/6.
- scripts/check_versions.py -- all 7 sources match.
- bin/evo-version-check against a real mismatch -- exits 1 with the right uv tool install --force command.
- uv build && twine check dist/* on plugins/evo/ -- both artifacts pass.