Add the slim staged workflow chain#8
Open
guangyucoder wants to merge 16 commits intomainfrom
Open
Conversation
#### Context The config layer had grown around NimbleOptions and a large getter surface. This branch moves parsing to an Ecto schema and simplifies callers to read typed nested settings directly. #### TL;DR *Replace the bespoke config access layer with an Ecto-backed schema and typed settings access.* #### Summary - Replace the old config extraction path with an Ecto embedded schema and defaults - Collapse `Config` down to `settings/0`, `settings!/0`, and a small runtime helper surface - Update callers and tests to read nested config structs directly instead of one-off getter wrappers - Keep the repo green under format, lint, coverage, and dialyzer by adding the missing schema specs and tests #### Alternatives - Keep NimbleOptions and shrink the wrapper layer incrementally, but that would preserve the custom parser shape - Keep the strict refactor without extra schema tests, but that left `make -C elixir all` red on coverage and dialyzer #### Test Plan - [x] `make -C elixir all` - [x] `cd elixir && mix test test/symphony_elixir/workspace_and_config_test.exs --warnings-as-errors`
#### Context The schema and tests now treat tracker state lists as YAML lists and only lowercase state-limit keys, but `SPEC.md` still documented older and stricter behavior. #### TL;DR *Align `SPEC.md` with the current workflow config and state normalization behavior.* #### Summary - document tracker `active_states` and `terminal_states` as YAML string lists - update the cheat sheet defaults to use list examples instead of CSV strings - remove trim-based state normalization language from the spec #### Alternatives - restore support for comma-separated tracker state strings in the schema - keep trim-based normalization in the spec and change the implementation #### Test Plan - [x] `make -C elixir all` - [x] Docs-only diff reviewed against current schema and tests
#### Context Symphony Elixir had a few correctness gaps around retry scheduling, refresh coalescing, workspace path handling, Codex turn policy forwarding, and running-issue refreshes. #### TL;DR *Stabilize Symphony Elixir orchestration, workspace safety, and Codex policy passthrough.* #### Summary - Ignore stale retry timers and coalesce superseded poll ticks in the orchestrator. - Canonicalize workspace paths and clarify last-known-good workflow reload behavior. - Pass explicit Codex turn sandbox policies through unchanged and add integration coverage. - Paginate Linear by-id issue-state refreshes so running issue reconciliation sees all IDs. #### Alternatives - Keep local validation and rewriting of explicit turn sandbox policies, but that drifted from Codex semantics and broke documented policy types. - Keep single-request by-id issue refreshes, but that truncates after 50 issues and can stop healthy workers during reconciliation. #### Test Plan - [x] `make -C elixir all` - [x] Targeted regression coverage for AppServer turn policy passthrough and Linear by-id pagination
#### Context Add a real end-to-end path for Symphony Elixir that exercises Linear and Codex together, and make it easy to invoke explicitly from one Make target. #### TL;DR *Add a real live Linear/Codex E2E test plus a `make e2e` entrypoint.* #### Summary - Add an opt-in live Elixir test that creates a real Linear project and issue. - Run a real Codex turn and verify both workspace output and Linear side effects. - Require Codex to post a Linear comment and close the issue before completion. - Add `make e2e` with clear failures for missing `LINEAR_API_KEY` or `codex`. - Document the new E2E entrypoint and behavior in the Elixir README. #### Alternatives - Keep using ad hoc `mix test` commands and environment toggles. - Keep shimming Linear or post-processing issue state locally. - Those keep the happy path less reproducible and do not prove Codex can mutate Linear itself during the turn. #### Test Plan - [x] `make -C elixir all` - [x] `env -u LINEAR_API_KEY make -C elixir e2e` - [x] `LINEAR_API_KEY=$(tr -d '\r\n' < ~/.linear_api_key) SYMPHONY_RUN_LIVE_E2E=1 mix test test/symphony_elixir/live_e2e_test.exs`
#### Context Symphony needs to run tickets on SSH workers, keep the live E2E reliable, and preserve correct remote path semantics. #### TL;DR *Add SSH workers, Docker-backed SSH live E2E coverage, per-host caps, and correct worker-side `~` resolution.* #### Summary - Add SSH worker execution, remote workspace handling, and SSH app-server launch support in Elixir. - Add live E2E coverage for both local workers and SSH workers, with Docker-backed SSH workers by default. - Add an optional shared `worker.max_concurrent_agents_per_host` cap and document the SSH extension. - Keep `workspace.root` raw so SSH workers resolve `~` on the worker, while local paths expand at local use sites. #### Alternatives - Keep expanding `workspace.root` in config and carry duplicate raw path state, but that complicates the config model. - Require only absolute remote workspace roots, but that makes real worker setup less ergonomic and less representative. #### Test Plan - [x] `make -C elixir all` - [x] `cd elixir && env -u SYMPHONY_LIVE_SSH_WORKER_HOSTS LINEAR_API_KEY="$(tr -d '\r\n' < ~/.linear_api_key)" SYMPHONY_RUN_LIVE_E2E=1 mix test test/symphony_elixir/live_e2e_test.exs:128`
#### Context The Codex app-server merges stderr into stdout (`stderr_to_stdout`). Non-JSON diagnostic lines written by Codex during healthy turns were incorrectly emitted as `:malformed` protocol events in the Symphony UI. <img width="2496" height="446" alt="Screenshot 2026-03-12 at 4 27 20 PM" src="https://github.com/user-attachments/assets/3d85fa5b-c0d9-4079-87c6-850c7ccf7ee2" /> #### TL;DR *Only emit `:malformed` events for JSON-like protocol frames (lines starting with `{`); silently log all other non-JSON stream output.* #### Summary - Gate `:malformed` event emission on a new `protocol_message_candidate?/1` check — only lines starting with `{` - Preserve full logging of all non-JSON stderr noise via `log_non_json_stream_line` - Fix existing "Capture stderr log" test to assert no `:malformed` event is emitted for stderr noise - Add regression test verifying truncated JSON-like frames still surface as `:malformed` #### Alternatives - Separate stderr from stdout with a dedicated pipe: more invasive change, unnecessary given the simple heuristic - Suppress all non-JSON output silently: would lose valuable diagnostic logging #### Test Plan - [x] `make -C elixir all` - [x] `mise exec -- mix format lib/symphony_elixir/codex/app_server.ex test/symphony_elixir/app_server_test.exs` - [x] Additional: `MIX_ENV=test mise exec -- mix run --no-start -e 'Application.put_env(:symphony_elixir, :workflow_file_path, System.fetch_env!("SYMPHONY_TEST_WORKFLOW")); Mix.Task.run("test", ["test/symphony_elixir/app_server_test.exs"])'` Co-authored-by: Codex <noreply@openai.com>
#### Context SSH failover currently happens inside `AgentRunner`, which can bypass per-host caps and rerun a ticket on a second host after the first host already started setup. #### TL;DR *Keep each worker run on one SSH host and let the orchestrator own retries.* #### Summary - Remove `AgentRunner`'s internal cross-host failover loop - Keep a worker lifetime pinned to one selected SSH host - Add a regression test proving startup failure on one host does not fall through to another #### Alternatives - Keep internal failover and classify retryable errors, but that duplicates orchestrator scheduling logic - Only patch per-host cap enforcement, but invisible cross-host reruns would still risk duplicate side effects #### Test Plan - [x] `make -C elixir all` - [x] `cd elixir && mise exec -- mix test test/symphony_elixir/core_test.exs:1167 test/symphony_elixir/core_test.exs:1237 test/symphony_elixir/core_test.exs:1337 --seed 0`
## Summary Pin floating external GitHub Actions workflow refs to immutable SHAs. ## Why See the rationale doc: https://docs.google.com/document/d/1qOURCNx2zszQ0uWx7Fj5ERu4jpiYjxLVWBWgKa2wTsA/edit?tab=t.0 ## Validation - `rg -n --pcre2 "uses:\s*(?!\./)(?!docker://)[^#\n]+@(?![0-9a-f]{40}(?:\s+#.*)?$)\S+" .github/workflows` - `git diff --check` - `git diff --stat -- .github/workflows`
Slim redesign to replace the current fork's ~2500 lines of machine-layer (DispatchResolver / Closeout / unit system) with ~240 lines of marker-based stage routing + 4 closeout gates + 3 WORKFLOW.md files. Upstream delta targeted at ~18 lines (PromptBuilder workflow_path opt).
When opts[:workflow_path] is provided, load that file explicitly via Workflow.load/1 instead of Workflow.current(), and raise on blank body instead of falling back to the default implement prompt. When absent, existing behavior (Workflow.current + default_prompt fallback) is preserved. Enables stage-specific WORKFLOW-*.md files in the upcoming StageOrchestrator. See docs/design/stage-workflow-chain.md §2.7. 4/4 tests pass.
WORKFLOW-review.md: second-opinion reviewer prompt. Reads diff, optional visual review on frontend paths, writes code-review marker in Linear workpad BEGIN/END section. Must not commit; pre-marker requires clean working tree + HEAD unchanged from stage start. WORKFLOW-docfix.md: docs sweeper prompt. Only *.md / docs/**. no-updates path writes marker immediately; updated path commits first (docs: prefix) then writes marker with post-commit HEAD. Pre-marker requires clean working tree. Both files contain the shared Marker Contract (STRICT) block per §3.1. See docs/design/stage-workflow-chain.md §3.2 and §3.3.
Parses symphony-marker fenced blocks inside the <!-- SYMPHONY-MARKERS-BEGIN --> / END region of a Linear workpad comment. Silently drops invalid blocks (bad YAML, missing required field, wrong value). Archived blocks (symphony-marker-archived) and blocks outside the region are ignored. Public API: - parse/2: returns valid markers in text order, filtered by issue_identifier - review_pending?/1: last (review-request|code-review) in current round is a review-request - latest_code_review/1: last code-review in current round - latest_review_sha/1: reviewed_sha of latest CLEAN code-review - docs_checked_matches_review?/1: last docs-checked in current round matches latest clean review sha Schema: common fields (kind, round_id, stage_round, reviewed_sha, issue_identifier) + per-kind (code-review verdict, docs-checked docfix_outcome). Optional code-review findings parsed best-effort — malformed findings do not drop the whole marker. See docs/design/stage-workflow-chain.md §2.5. 18/18 tests pass.
Summary: - add an optional after_implement hook and workspace helper for stage-aware post-implement checks - append review-request markers automatically after implement turns when the hook passes and the workspace is clean - align WORKFLOW.md and tests with machine-owned marker handoff, and fix prompt_builder formatting plus marker_parser specs Rationale: - move the repetitive implement-to-review handoff into the harness so the prompt can stay focused on engineering work - keep the change narrow by reusing the existing orchestrator exit path instead of introducing another execution pipeline - leave implement failures in the active loop with workpad narration rather than forcing Rework for transient validation failures Tests: - mix test (279 tests, 0 failures, 2 skipped) - make -C elixir all (currently fails on existing Credo debt on slim, including pre-existing StageCloseout/StageOrchestrator findings) Co-authored-by: Codex <codex@openai.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Context
The slim branch replaces heavier workflow machinery with a smaller staged chain built around implement/review/doc-fix prompts, bounded workpad markers, and orchestrator-driven routing.
TL;DR
Add the slim staged workflow chain and automate implement-to-review handoff.
Summary
Alternatives
WORKFLOW.md, but that leaves more repetitive behavior in the prompt and drifts from harness behavior.Test Plan
make -C elixir allcurrently fails on existing Credo debt onslim, including pre-existingStageCloseout/StageOrchestratorfindingsmix testmix test test/symphony_elixir/core_test.exs test/symphony_elixir/workspace_and_config_test.exs test/symphony_elixir/prompt_builder_test.exs