feat: rebuild orchestration with harness pattern, codex review gates, and provenance #2

Merged
zircote merged 4 commits into main from feat/orchestration-rebuild-v2 on Apr 2, 2026

@zircote commented Apr 1, 2026

Summary

  • Extract research-orchestrator agent from monolithic start skill, enabling reuse across start, update, and augment
  • Add 4 blocking codex review gates (post-findings, post-merge, post-report, post-issues) with quarantine-on-failure
  • Enforce source provenance on every finding with full URL re-verification during codex review
  • Rewrite update from inline command to swarm-orchestrated skill with delta detection
  • Add schedule command for Desktop Scheduled Task management
  • Implement Anthropic long-running agent harness pattern: progress file, lineage tracking, session init protocol

New files (4)

  • agents/research-orchestrator.md -- orchestrator owning phase management, codex gates, delta detection
  • skills/update/SKILL.md -- swarm-based update with delta detection
  • skills/schedule/SKILL.md -- Desktop Scheduled Task wrapper
  • commands/schedule.md -- thin command shell

Modified files (11)

  • skills/start/SKILL.md -- gutted to thin launcher (-540 lines)
  • agents/dimension-analyst.md -- self-reflection, WebSearch retry, provenance, Atlatl tools
  • agents/report-synthesizer.md -- post-report codex gate
  • agents/issue-architect.md -- post-issues codex gate
  • commands/update.md -- delegates to skill with swarm tools
  • commands/resume.md -- harness init protocol, Atlatl tools
  • skills/trend-modeling/SKILL.md -- findings_trend_modeling key fix
  • skills/trend-analysis/SKILL.md -- key separation clarification
  • docs/reference/agents.md -- orchestrator docs, tool fixes, spawned-by fixes
  • evals/agents/research-orchestrator/evals.json -- 8 dimensions
  • .claude-plugin/plugin.json -- version 0.5.0

Test plan

  • research-orchestrator spawns from start/update/augment
  • dimension-analyst self-reflection executes
  • provenance records in every finding
  • codex review gates fire at all 4 points
  • gate failures quarantine to quarantine.json
  • lineage array grows with each action
  • research-progress.md generated and readable by resume
  • delta detection classifies correctly
  • trend-modeling uses findings_trend_modeling key
  • blackboard dual-write produces files

… and provenance

Extract research-orchestrator agent from start skill to enable reuse
across start, update, and augment commands. Add four blocking codex
review gates (post-findings, post-merge, post-report, post-issues)
with quarantine-on-failure semantics. Enforce source provenance on
every finding with full URL re-verification during review.

Key changes:
- New agents/research-orchestrator.md owns all phase management
- skills/start/SKILL.md gutted to thin launcher (-540 lines)
- Dimension-analyst gains self-reflection protocol, WebSearch retry,
  and inline provenance schema
- New skills/update/SKILL.md with swarm orchestration + delta detection
- New skills/schedule/SKILL.md for Desktop Scheduled Task management
- Lineage tracking in state.json for full research provenance chain
- research-progress.md rendered on phase transitions (harness pattern)
- /sigint:resume reads progress file first (harness init protocol)
- Blackboard dual-write (blackboard + file) as default, not fallback
- Fix trend-modeling blackboard key conflict (findings_trend_modeling)
- Plugin version bump to 0.5.0
Copilot AI review requested due to automatic review settings April 1, 2026 20:36

Copilot AI left a comment

Pull request overview

This PR restructures sigint’s research workflow around a reusable research-orchestrator harness that owns phase management, codex review gates, provenance requirements, and continuity artifacts (progress + lineage), while turning existing commands/skills into thin launchers.

Changes:

  • Introduces agents/research-orchestrator.md implementing the harness pattern, provenance schema, delta detection, and multiple codex review gates.
  • Refactors /sigint:start and /sigint:update to delegate orchestration to the new agent; adds scheduling skill/command for recurring updates.
  • Clarifies blackboard key separation for trend analysis vs trend modeling; updates docs/evals/plugin version to 0.5.0.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| agents/research-orchestrator.md | New orchestrator agent defining phases, gates, provenance, delta detection, progress/lineage behaviors. |
| skills/start/SKILL.md | Simplified into a thin launcher that delegates to the orchestrator. |
| skills/update/SKILL.md | New update skill that loads prior state and delegates update-mode orchestration. |
| skills/schedule/SKILL.md | New scheduling skill for recurring update runs via Desktop Scheduled Tasks. |
| commands/update.md | Updated command to execute the update skill (pass-through arguments). |
| commands/schedule.md | New command shell delegating to the schedule skill. |
| commands/resume.md | Updated to prefer research-progress.md first (harness init protocol) and include lineage/quarantine context. |
| agents/dimension-analyst.md | Adds provenance requirements, retry protocol, self-reflection, and dual-write guidance. |
| agents/report-synthesizer.md | Adds a blocking post-report codex gate request flow. |
| agents/issue-architect.md | Adds a blocking post-issues codex gate request flow. |
| skills/trend-analysis/SKILL.md | Clarifies blackboard key usage. |
| skills/trend-modeling/SKILL.md | Fixes blackboard key to findings_trend_modeling. |
| docs/reference/agents.md | Documents orchestrator, tools, modes, and adds trend_modeling mapping. |
| evals/agents/research-orchestrator/evals.json | Updates eval to include trend_modeling as an eighth dimension. |
| .claude-plugin/plugin.json | Bumps plugin version to 0.5.0. |

Comment thread skills/update/SKILL.md Outdated
Comment thread skills/update/SKILL.md Outdated
Comment thread skills/schedule/SKILL.md Outdated
Comment thread agents/report-synthesizer.md Outdated
Comment thread agents/issue-architect.md Outdated
Comment thread agents/research-orchestrator.md Outdated
Comment thread agents/research-orchestrator.md
zircote added 3 commits April 1, 2026 16:54
Address Codex adversarial review findings:

1. Update-mode merge now reconciles against prior state instead of
   blindly appending. UPDATED findings replace prior entries in-place,
   CONFIRMED findings get a last_confirmed timestamp, POTENTIALLY_REMOVED
   findings move to archived_findings[] instead of staying in the active
   array. Delta detection runs BEFORE merge to drive reconciliation.

2. Remove /sigint:schedule command and skill — no scheduler backend
   exists yet (Desktop Scheduled Task API is unspecified). Shipping a
   placeholder that silently pretends to succeed is worse than not
   shipping it. Will re-add when a concrete scheduler integration is
   available.
…, JSON, progress file

- Fix --delta/--no-delta flag inconsistency: default enabled, --no-delta disables
- Add --topic flag to update for explicit session selection
- Make post-report and post-issues codex gates self-contained instead of
  blocking on team-lead SendMessage (eliminates deadlock risk)
- Fix single-quoted JSON response templates to valid double-quoted JSON
- Fix progress file: append status sections instead of regenerating
  (preserves audit trail from phase transition log entries)
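
Two of the fixes above, the `--delta`/`--no-delta` default and the append-only progress file, can be illustrated with a short sketch. The skill itself is defined in Markdown and parses its arguments in its own way, so the use of Python's `argparse` here, the function names `build_parser` and `append_status`, and the section layout written to the file are assumptions for illustration only.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Flag semantics only: delta detection is on unless --no-delta is given."""
    parser = argparse.ArgumentParser(prog="update")
    # BooleanOptionalAction (Python 3.9+) registers both --delta and --no-delta.
    parser.add_argument(
        "--delta",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="run delta detection against prior state (default: enabled)",
    )
    parser.add_argument(
        "--topic",
        default=None,
        help="explicitly select the research session to update",
    )
    return parser


def append_status(progress_path: str, phase: str, status: str) -> None:
    """Append a status section instead of regenerating the whole file.

    Opening in append mode leaves earlier phase-transition entries
    untouched, which is what preserves the audit trail.
    """
    with open(progress_path, "a", encoding="utf-8") as fh:
        fh.write(f"\n## {phase}\n\n{status}\n")
```

With this parser, `build_parser().parse_args(["--no-delta", "--topic", "example"])` yields `delta=False` and `topic="example"`, while an empty argument list leaves delta detection enabled.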
Ran autoresearch improvement loops on all skills modified in this PR.
Results:

- start: 100% baseline (no changes needed)
- update: 36% → 65% — clearer argument parsing, structured error
  paths showing planned workflow, explicit reconciliation semantics
- trend-analysis: 94% → 100% — macro trend category labeling,
  emerging signal implication language, three-valued logic intro
- trend-modeling: 83% → 98% — mandatory correlation notation,
  extended notation definition table, large model (7+ var) guidance

Also adds evals for start (9 cases) and update (8 cases) skills
that previously had none.
@zircote zircote merged commit bfd8f0c into main Apr 2, 2026
1 check passed