fix(startup): repair juniper_plant_all.bash for current service contracts (Pass 1) #235
Merged
…acts

Pass 1 of the 2026-05-07 startup/shutdown scripts audit. Addresses two service-blocking failures and three degraded-config issues uncovered while validating util/juniper_plant_all.bash against the current state of each target repo.

BROKEN fixes:
- #1 juniper-cascor-worker now receives the required CASCOR_SERVER_URL, derived from JUNIPER_CASCOR_HOST/PORT (override-friendly). Worker exited immediately on launch because the script set zero env vars; the prior 2-second kill -0 check only emitted a WARNING and let plant_all report success.
- #2 juniper-canopy default conda env switched from JuniperCanopy to JuniperCanopy1, which carries the LIBTORCH-strip activate hook needed to prevent the rust_mudgeon LIBTORCH from preempting the env's torch. JUNIPER_CANOPY_CONDA still respects caller-provided overrides.

DEGRADED fixes:
- #3 juniper-canopy now receives canonical pydantic-prefixed env vars (JUNIPER_CANOPY_CASCOR_SERVICE_URL, JUNIPER_CANOPY_JUNIPER_DATA_URL) rather than the deprecated CASCOR_SERVICE_URL alias.
- #5 worker health is now probed via /v1/health/ready against the worker's HTTP health listener (default 127.0.0.1:8210), the same shape as the other three services. systemd code path also updated.
- #6 pre-flight block validates JuniperCascor conda env, the juniper-cascor-worker console-script binary, and the worker's health-listener port before any service is launched.

Adds tests/test_juniper_plant_all.py (20 tests, all passing) covering:
- script bash syntax (bash -n)
- canopy conda env default
- worker env-var wiring (CASCOR_SERVER_URL, health port, auth token)
- health URL composition for both nohup and systemd code paths
- canopy canonical env-var rename and legacy-alias removal
- pre-flight worker conda env / binary / port checks
- end-to-end smoke test: missing worker binary aborts pre-flight before any service is launched (synthetic JUNIPER_PROJECT_DIR / JUNIPER_CONDA_DIR fixture).
Audit document at notes/STARTUP_SHUTDOWN_SCRIPTS_AUDIT_2026-05-07.md captures all 12 findings, severity grading, source-of-truth references, and the Pass 2 roadmap (NIT-class items #4, #7-#11).

shellcheck and pre-commit clean. Full test suite (112 tests) passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
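For readers who want to picture fix #1, here is a minimal sketch of how a worker URL could be derived from JUNIPER_CASCOR_HOST and JUNIPER_CASCOR_PORT while still letting a caller override it. It is not the code from util/juniper_plant_all.bash. The helper name derive_cascor_server_url is hypothetical, and treating a pre-set CASCOR_SERVER_URL as the override is an assumption about what "override-friendly" means here. The URL shape ws://HOST:PORT/ws/v1/workers is the one quoted in the PR description.

```shell
#!/usr/bin/env bash
# Sketch only: illustrates the derivation, not the script's actual code.

# derive_cascor_server_url (hypothetical helper name).
# Assumption: a caller-provided CASCOR_SERVER_URL takes precedence.
# Otherwise the URL is built from JUNIPER_CASCOR_HOST and
# JUNIPER_CASCOR_PORT, both of which must be set.
derive_cascor_server_url() {
    if [ -n "${CASCOR_SERVER_URL:-}" ]; then
        printf '%s\n' "${CASCOR_SERVER_URL}"
        return 0
    fi
    if [ -z "${JUNIPER_CASCOR_HOST:-}" ] || [ -z "${JUNIPER_CASCOR_PORT:-}" ]; then
        echo "ERROR: JUNIPER_CASCOR_HOST and JUNIPER_CASCOR_PORT must be set" >&2
        return 1
    fi
    printf 'ws://%s:%s/ws/v1/workers\n' "${JUNIPER_CASCOR_HOST}" "${JUNIPER_CASCOR_PORT}"
}
```

Failing loudly when the host or port is missing mirrors the intent of the fix: a worker launched without a usable URL should stop the run instead of producing only a warning.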
pcalnon added a commit that referenced this pull request on May 7, 2026
fix(startup): Pass 2 nit-class refinements (stacked on PR #235)
Summary
Pass 1 of the 2026-05-07 startup/shutdown scripts audit. Fixes the BROKEN and DEGRADED items in util/juniper_plant_all.bash so that the host-level orchestration script matches the current state of the four target services (juniper-data, juniper-cascor, juniper-canopy, juniper-cascor-worker).

The full audit, including the Pass 2 (NIT) roadmap, is in notes/STARTUP_SHUTDOWN_SCRIPTS_AUDIT_2026-05-07.md.

What's fixed
🔴 BROKEN
- #1 juniper-cascor-worker now receives the required CASCOR_SERVER_URL. Worker exited immediately on launch (config validation error in juniper_cascor_worker/config.py:153–156); the prior 2-second kill -0 check only emitted a WARNING and let plant_all report success. Now derives ws://${JUNIPER_CASCOR_HOST}:${JUNIPER_CASCOR_PORT}/ws/v1/workers (override-friendly) and passes through optional CASCOR_AUTH_TOKEN.
- #2 juniper-canopy default conda env switched from JuniperCanopy to JuniperCanopy1. Only JuniperCanopy1 has 00_isolate_from_tch_rs.sh in its activate.d/, which prevents the rust_mudgeon LIBTORCH from preempting the env's torch and breaking ~770 canopy tests. JUNIPER_CANOPY_CONDA still respects caller overrides.

🟡 DEGRADED
- #3 juniper-canopy now receives canonical pydantic-prefixed env vars (JUNIPER_CANOPY_CASCOR_SERVICE_URL, JUNIPER_CANOPY_JUNIPER_DATA_URL) rather than the deprecated CASCOR_SERVICE_URL legacy alias.
- #5 worker health is now probed via /v1/health/ready against the worker's HTTP health listener (default 127.0.0.1:8210), the same shape as the other three services. Both nohup and systemd code paths updated.

Out of scope (Pass 2)
NIT items #4, #7–#11 (cascor-host export, data uvicorn host honoring, deferred uvicorn pre-flight, pid-file format hardening, chop worker grep tightening) are tracked in the audit document and will land in a follow-up PR. #12 (intentional duplicate echo placeholders) stays as-is per memory feedback_chop_all_echo_debug.

Files changed
- util/juniper_plant_all.bash: +59 / −28
- tests/test_juniper_plant_all.py: new file (20 tests)
- notes/STARTUP_SHUTDOWN_SCRIPTS_AUDIT_2026-05-07.md: new file

Test plan
- bash -n util/juniper_plant_all.bash passes
- shellcheck clean on both plant and chop scripts
- pre-commit run --files <changed> clean (black, isort, flake8, mypy, bandit, shellcheck)
- python3 -m unittest tests.test_juniper_plant_all: 20/20 tests pass
- python3 -m unittest discover tests: 112/112 tests pass
- bash scripts/test_resume_file_safety.bash passes
- util/juniper_plant_all.bash end-to-end on a host with all four envs present and confirming /v1/health returns 200 on all four ports (8100, 8201, 8050, 8210)

🤖 Generated with Claude Code
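The readiness probing behind fix #5 can be pictured with a small sketch: compose the /v1/health/ready URL from a host and port, then poll it until it answers or the attempts run out. This is an illustration only, not the script's implementation. The helper names health_url and wait_for_ready, the retry count, and the delay are all hypothetical, and curl is assumed to be installed on the host.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, retry counts, and timeouts are illustrative.

# health_url HOST PORT (hypothetical helper): compose the readiness URL.
health_url() {
    printf 'http://%s:%s/v1/health/ready\n' "$1" "$2"
}

# wait_for_ready URL [ATTEMPTS] [DELAY_SECONDS] (hypothetical helper).
# Returns 0 as soon as the URL answers with a success status, and
# returns 1 once every attempt has failed.
wait_for_ready() {
    local url="$1"
    local attempts="${2:-10}"
    local delay="${3:-1}"
    local i=1
    while [ "${i}" -le "${attempts}" ]; do
        # --fail makes curl exit non-zero on HTTP error statuses.
        if curl --silent --fail --output /dev/null --max-time 2 "${url}"; then
            return 0
        fi
        sleep "${delay}"
        i=$((i + 1))
    done
    return 1
}

# Example use, with the worker default quoted in this PR:
#   wait_for_ready "$(health_url 127.0.0.1 8210)" || exit 1
```

Unlike a kill -0 check, which only shows that a process exists, an HTTP readiness probe fails when the service starts and then exits or never finishes initializing.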