Summary
Post-merge follow-up from PR #194.
The resolver changes in evals/framework/grader.py fixed the immediate red CI and stale legacy-bridge issues, but two additional review concerns remain open and should be handled explicitly in a follow-up:
- Respect configured config.git.runtime.mode before cross-mode fallbacks.
  - Current resolver checks project-visible sessions before legacy fallback regardless of active mode.
  - In mixed-layout repos with stale .specwright-local state, grading could prefer the wrong runtime mode.
- Resolve the session for the current worktree instead of the first glob match.
  - When worktrees/*/session.json is globbed, the first attached workflow can belong to a different worktree/helper worktree.
  - Grader state/gate checks should prefer the session that matches the current checkout context (for example worktreePath / worktree identity).
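
A rough sketch of how both concerns could be addressed together, not the actual resolver in evals/framework/grader.py. The mode values ("project-visible", "legacy"), the function names, the single legacy session.json file, and the directory arguments are all assumptions made for illustration. Only the worktrees/*/session.json glob and the worktreePath field come from the review notes above.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical mode names; the real values of config.git.runtime.mode may differ.
PROJECT_VISIBLE = "project-visible"
LEGACY = "legacy"


def find_worktree_session(sessions_root: Path, current_worktree: Path) -> Optional[Path]:
    """Return the session.json whose worktreePath matches the current checkout."""
    target = current_worktree.resolve()
    # sorted() keeps the scan deterministic; glob order is not guaranteed.
    for candidate in sorted(sessions_root.glob("worktrees/*/session.json")):
        try:
            data = json.loads(candidate.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or corrupt session files
        if not isinstance(data, dict):
            continue
        recorded = data.get("worktreePath")
        # Match on worktree identity instead of taking the first glob hit.
        if recorded and Path(recorded).resolve() == target:
            return candidate
    return None


def resolve_session(
    mode: str,
    project_root: Path,
    legacy_root: Path,
    current_worktree: Path,
) -> Optional[Path]:
    """Try the configured runtime mode first; use the other mode only as a fallback."""
    project_hit = find_worktree_session(project_root, current_worktree)
    # Assumed legacy layout: a single session.json directly under legacy_root.
    legacy_file = legacy_root / "session.json"
    legacy_hit = legacy_file if legacy_file.is_file() else None

    if mode == PROJECT_VISIBLE:
        ordered = [project_hit, legacy_hit]
    else:
        ordered = [legacy_hit, project_hit]
    return next((hit for hit in ordered if hit is not None), None)
```

With this ordering, stale state belonging to the inactive mode is consulted only when the active mode has no session, and a helper worktree's session is never picked up just because it sorts first in the glob.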
Why this is deferred
These are correctness hardening items, but they were not needed to release the workflow-legibility/devex improvement. PR #194 was merged once CI was green and the primary operator-surface/runtime blockers were fixed.
Suggested scope
- tighten resolver order by active runtime mode
- filter project-visible sessions to the current worktree
- add regression coverage for mixed-mode stale state and multi-worktree helper session ordering
Source
evals/framework/grader.py