feat: rebuild orchestration with harness pattern, codex review gates, and provenance by zircote · Pull Request #2 · zircote/sigint

zircote · 2026-04-01T20:36:45Z

Summary

Extract research-orchestrator agent from monolithic start skill, enabling reuse across start, update, and augment
Add 4 blocking codex review gates (post-findings, post-merge, post-report, post-issues) with quarantine-on-failure
Enforce source provenance on every finding with full URL re-verification during codex review
Rewrite update from inline command to swarm-orchestrated skill with delta detection
Add schedule command for Desktop Scheduled Task management
Implement Anthropic long-running agent harness pattern: progress file, lineage tracking, session init protocol
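
As an illustration of the quarantine-on-failure semantics listed above, here is a minimal Python sketch. It is not the orchestrator's actual logic, which is specified in agents/research-orchestrator.md. The `run_gate` helper, the `GateResult` type, the `source_url` field, and the `quarantined_at` marker are hypothetical names chosen for the example, and the reviewer only checks that a URL is present, standing in for full URL re-verification.

```python
from dataclasses import dataclass, field
from typing import Callable

# Gate names taken from the PR summary.
GATES = ("post-findings", "post-merge", "post-report", "post-issues")


@dataclass
class GateResult:
    passed: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)


def run_gate(gate: str, items: list, review: Callable[[dict], bool]) -> GateResult:
    """Run a blocking review over items; failures are quarantined, not dropped."""
    if gate not in GATES:
        raise ValueError(f"unknown gate: {gate}")
    result = GateResult()
    for item in items:
        if review(item):
            result.passed.append(item)
        else:
            # Hypothetical marker recording which gate rejected the item.
            result.quarantined.append({**item, "quarantined_at": gate})
    return result


def has_source_url(finding: dict) -> bool:
    """Stand-in reviewer: only checks that an http(s) URL is recorded."""
    return finding.get("source_url", "").startswith(("http://", "https://"))


findings = [
    {"id": "f1", "source_url": "https://example.com/report"},
    {"id": "f2"},  # no provenance, so it should be quarantined
]
result = run_gate("post-findings", findings, has_source_url)
```

Only `result.passed` would flow to the next phase in this sketch; `result.quarantined` keeps rejected items available for later inspection.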

New files (4)

agents/research-orchestrator.md -- orchestrator owning phase management, codex gates, delta detection
skills/update/SKILL.md -- swarm-based update with delta detection
skills/schedule/SKILL.md -- Desktop Scheduled Task wrapper
commands/schedule.md -- thin command shell

Modified files (11)

skills/start/SKILL.md -- gutted to thin launcher (-540 lines)
agents/dimension-analyst.md -- self-reflection, WebSearch retry, provenance, Atlatl tools
agents/report-synthesizer.md -- post-report codex gate
agents/issue-architect.md -- post-issues codex gate
commands/update.md -- delegates to skill with swarm tools
commands/resume.md -- harness init protocol, Atlatl tools
skills/trend-modeling/SKILL.md -- findings_trend_modeling key fix
skills/trend-analysis/SKILL.md -- key separation clarification
docs/reference/agents.md -- orchestrator docs, tool fixes, spawned-by fixes
evals/agents/research-orchestrator/evals.json -- 8 dimensions
.claude-plugin/plugin.json -- version 0.5.0

Test plan

… and provenance

Extract research-orchestrator agent from start skill to enable reuse across start, update, and augment commands. Add four blocking codex review gates (post-findings, post-merge, post-report, post-issues) with quarantine-on-failure semantics. Enforce source provenance on every finding with full URL re-verification during review.

Key changes:

- New agents/research-orchestrator.md owns all phase management
- skills/start/SKILL.md gutted to thin launcher (-540 lines)
- Dimension-analyst gains self-reflection protocol, WebSearch retry, and inline provenance schema
- New skills/update/SKILL.md with swarm orchestration + delta detection
- New skills/schedule/SKILL.md for Desktop Scheduled Task management
- Lineage tracking in state.json for full research provenance chain
- research-progress.md rendered on phase transitions (harness pattern)
- /sigint:resume reads progress file first (harness init protocol)
- Blackboard dual-write (blackboard + file) as default, not fallback
- Fix trend-modeling blackboard key conflict (findings_trend_modeling)
- Plugin version bump to 0.5.0
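
The dual-write default mentioned in this commit can be pictured with a small Python sketch. This is an assumption-laden illustration: the blackboard is modeled as a plain dict, and the `dual_write` helper and the `<key>.json` file layout are hypothetical, not taken from the repository.

```python
import json
import tempfile
from pathlib import Path


def dual_write(blackboard: dict, out_dir: Path, key: str, payload: dict) -> Path:
    """Write the payload to the blackboard and to a file in the same step."""
    blackboard[key] = payload                # in-memory copy for other agents
    path = out_dir / f"{key}.json"           # hypothetical on-disk layout
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


blackboard = {}
with tempfile.TemporaryDirectory() as tmp:
    written = dual_write(
        blackboard, Path(tmp), "findings_trend_modeling", {"findings": []}
    )
    # The file copy can be read back if the blackboard is lost.
    restored = json.loads(written.read_text(encoding="utf-8"))
```

Writing both copies unconditionally, rather than falling back to the file only on failure, means a resumed session can rebuild state from disk without knowing whether the blackboard survived.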

Copilot

Pull request overview

This PR restructures sigint’s research workflow around a reusable research-orchestrator harness that owns phase management, codex review gates, provenance requirements, and continuity artifacts (progress + lineage), while turning existing commands/skills into thin launchers.

Changes:

Introduces agents/research-orchestrator.md implementing the harness pattern, provenance schema, delta detection, and multiple codex review gates.
Refactors /sigint:start and /sigint:update to delegate orchestration to the new agent; adds scheduling skill/command for recurring updates.
Clarifies blackboard key separation for trend analysis vs trend modeling; updates docs/evals/plugin version to 0.5.0.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| `agents/research-orchestrator.md` | New orchestrator agent defining phases, gates, provenance, delta detection, progress/lineage behaviors. |
| `skills/start/SKILL.md` | Simplified into a thin launcher that delegates to the orchestrator. |
| `skills/update/SKILL.md` | New update skill that loads prior state and delegates update-mode orchestration. |
| `skills/schedule/SKILL.md` | New scheduling skill for recurring update runs via Desktop Scheduled Tasks. |
| `commands/update.md` | Updated command to execute the update skill (pass-through arguments). |
| `commands/schedule.md` | New command shell delegating to the schedule skill. |
| `commands/resume.md` | Updated to prefer `research-progress.md` first (harness init protocol) and include lineage/quarantine context. |
| `agents/dimension-analyst.md` | Adds provenance requirements, retry protocol, self-reflection, and dual-write guidance. |
| `agents/report-synthesizer.md` | Adds a blocking post-report codex gate request flow. |
| `agents/issue-architect.md` | Adds a blocking post-issues codex gate request flow. |
| `skills/trend-analysis/SKILL.md` | Clarifies blackboard key usage. |
| `skills/trend-modeling/SKILL.md` | Fixes blackboard key to `findings_trend_modeling`. |
| `docs/reference/agents.md` | Documents orchestrator, tools, modes, and adds `trend_modeling` mapping. |
| `evals/agents/research-orchestrator/evals.json` | Updates eval to include `trend_modeling` as an eighth dimension. |
| `.claude-plugin/plugin.json` | Bumps plugin version to `0.5.0`. |

Address Codex adversarial review findings:

1. Update-mode merge now reconciles against prior state instead of blindly appending. UPDATED findings replace prior entries in-place, CONFIRMED findings get a last_confirmed timestamp, POTENTIALLY_REMOVED findings move to archived_findings[] instead of staying in the active array. Delta detection runs BEFORE merge to drive reconciliation.
2. Remove /sigint:schedule command and skill: no scheduler backend exists yet (Desktop Scheduled Task API is unspecified). Shipping a placeholder that silently pretends to succeed is worse than not shipping it. Will re-add when a concrete scheduler integration is available.
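
The reconciliation rules in item 1 can be sketched in Python as follows. This is a minimal illustration rather than the skill's actual procedure: the `reconcile` function, the `id`/`status`/`finding` delta shape, and the `findings` key are assumptions, and only `last_confirmed` and `archived_findings` come from the commit message. A status outside the three named ones raises an error instead of being appended.

```python
from datetime import datetime, timezone
from typing import Optional


def reconcile(state: dict, deltas: list, now: Optional[str] = None) -> dict:
    """Merge delta-detection results into prior state instead of appending."""
    now = now or datetime.now(timezone.utc).isoformat()
    # Dicts preserve insertion order, so replacing a key keeps its position.
    active = {f["id"]: dict(f) for f in state.get("findings", [])}
    archived = list(state.get("archived_findings", []))

    for delta in deltas:
        fid, status = delta["id"], delta["status"]
        if status == "UPDATED":
            active[fid] = dict(delta["finding"])      # replace in place
        elif status == "CONFIRMED":
            if fid in active:
                active[fid]["last_confirmed"] = now   # stamp, keep content
        elif status == "POTENTIALLY_REMOVED":
            if fid in active:
                archived.append(active.pop(fid))      # leave the active array
        else:
            raise ValueError(f"unhandled delta status: {status}")

    return {**state, "findings": list(active.values()), "archived_findings": archived}


prior = {
    "findings": [
        {"id": "a", "claim": "old"},
        {"id": "b", "claim": "stable"},
        {"id": "c", "claim": "gone"},
    ]
}
deltas = [
    {"id": "a", "status": "UPDATED", "finding": {"id": "a", "claim": "new"}},
    {"id": "b", "status": "CONFIRMED"},
    {"id": "c", "status": "POTENTIALLY_REMOVED"},
]
merged = reconcile(prior, deltas, now="2026-04-01T00:00:00+00:00")
```

Because the deltas are computed before the merge, the merge itself stays a pure function of prior state plus deltas, which makes it straightforward to test in isolation.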

…, JSON, progress file

- Fix --delta/--no-delta flag inconsistency: default enabled, --no-delta disables
- Add --topic flag to update for explicit session selection
- Make post-report and post-issues codex gates self-contained instead of blocking on team-lead SendMessage (eliminates deadlock risk)
- Fix single-quoted JSON response templates to valid double-quoted JSON
- Fix progress file: append status sections instead of regenerating (preserves audit trail from phase transition log entries)
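
The flag semantics described in this commit (delta detection on by default, `--no-delta` to disable, `--topic` to pick a session) map naturally onto a paired boolean option. The sketch below uses Python's `argparse` purely for illustration; the update skill parses its arguments from a Markdown skill definition, not from this code, and the `build_parser` name, the help text, and the `edge-ai` topic value are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="update")
    # BooleanOptionalAction (Python 3.9+) generates both --delta and --no-delta.
    parser.add_argument(
        "--delta",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="run delta detection against prior state (enabled by default)",
    )
    parser.add_argument(
        "--topic",
        default=None,
        help="explicitly select which research session to update",
    )
    return parser


args_default = build_parser().parse_args([])
args_off = build_parser().parse_args(["--no-delta", "--topic", "edge-ai"])
```

Deriving both spellings from a single option with one default avoids the kind of inconsistency the commit fixes, where the two flags could disagree about which behavior is the default.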

Ran autoresearch improvement loops on all skills modified in this PR. Results:

- start: 100% baseline (no changes needed)
- update: 36% → 65%: clearer argument parsing, structured error paths showing planned workflow, explicit reconciliation semantics
- trend-analysis: 94% → 100%: macro trend category labeling, emerging signal implication language, three-valued logic intro
- trend-modeling: 83% → 98%: mandatory correlation notation, extended notation definition table, large model (7+ var) guidance

Also adds evals for start (9 cases) and update (8 cases) skills that previously had none.

Copilot AI review requested due to automatic review settings April 1, 2026 20:36

Copilot started reviewing on behalf of zircote April 1, 2026 20:37

Copilot AI reviewed Apr 1, 2026

zircote added 3 commits April 1, 2026 16:54

zircote merged commit bfd8f0c into main Apr 2, 2026
1 check passed

zircote merged 4 commits into main from feat/orchestration-rebuild-v2

2 participants
