Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .claude-plugin/marketplace.json
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
{
"name": "agent-team",
"description": "Orchestrates parallel work via Agent Teams with automated coordination, workspace tracking, and hook enforcement",
"version": "2.3.0",
"version": "2.4.0",
"source": {
"source": "url",
"url": "https://github.com/ducdmdev/agent-team-plugin.git"
Expand Down
2 changes: 1 addition & 1 deletion .claude-plugin/plugin.json
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
{
"name": "agent-team",
"description": "Orchestrates parallel work via Agent Teams with automated coordination, workspace tracking, and hook enforcement",
"version": "2.3.0",
"version": "2.4.0",
"author": {
"name": "Duc Do"
}
Expand Down
15 changes: 15 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,21 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [2.4.0] - 2026-03-09

### Added
- **PROGRESS message type**: Optional granular progress reporting for long-running tasks
- **CHECKPOINT message type**: Intermediate results with downstream task notification
- **Confidence grades**: Optional `[X%]` annotation on reviewer/auditor findings
- **Priority marking**: Optional `priority={critical|high|normal|low}` on STARTING/HANDOFF
- **Checkpoint/Rollback pattern**: Save and resume long-running tasks at natural breakpoints
- **Deadline Escalation pattern**: Proactive time-based escalation for stalled tasks
- **Circular Dependency Detection**: DAG validation in Phase 2 to prevent deadlocks
- **Graceful Degradation pattern**: Controlled scope reduction under resource pressure
- **Warm vs Cold Handoff**: Context-level distinction for result handoffs
- **Anti-Pattern Catalog**: 8 documented coordination pitfalls with prevention/mitigation
- **Scaling Patterns documentation**: Read-only extension, phased execution, sub-agent specialization

## [2.3.0] - 2026-03-09

### Added
Expand Down
1 change: 1 addition & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@ docs/ Shared phases + reference docs consumed by skills at runt
| `CHANGELOG.md` | Version history | Add entry for each release |
| `README.md` | User-facing documentation | Keep in sync with feature changes |
| `tests/` | Hook and structure tests | `hooks/` for hook tests, `structure/` for plugin validation |
| `.agent-team/0309-protocol-research/` | Research findings | Reference only — do not modify. Contains 4 reports on protocol, patterns, resilience, and scaling |

## Conventions

Expand Down
7 changes: 7 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -173,6 +173,13 @@ HANDOFF #N: what I produced that another teammate needs
QUESTION: what I need to know
```

Optional extended messages for long-running tasks:

```
PROGRESS #N: milestone={desc}, percent={0-100}, eta={minutes}
CHECKPOINT #N: intermediate results, artifacts, ready_for=[task IDs]
```

## Hooks

Five hooks enforce team discipline automatically:
Expand Down
54 changes: 54 additions & 0 deletions docs/communication-protocol.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ Canonical definition of structured messages used by all teammates. The lead read
## Contents

- [Structured Messages](#structured-messages)
- [Extended Messages (Optional)](#extended-messages-optional)
- [Reviewer/Auditor Findings Format](#reviewerauditor-findings-format)
- [Tester Results Format](#tester-results-format)
- [Auditor Compliance Format](#auditor-compliance-format)
Expand All @@ -23,6 +24,53 @@ HANDOFF #N: {what I produced that another teammate needs, key details}
QUESTION: {what I need to know, what I already checked in workspace}
```

## Extended Messages (Optional)

These message types are optional enhancements. Teammates use them when the lead requests granular updates or when tasks are long-running.

### Progress Reporting

For long-running tasks (>5 minutes expected), teammates report intermediate progress:

```
PROGRESS #N: milestone={description}, percent={0-100}, eta={minutes or omitted}
```

Example:
```
PROGRESS #5: milestone="security scan phase 2 of 4", percent=50, eta=3
```

**Lead processing**: Log milestone in `tasks.md` Notes column. No workspace file update needed unless the milestone unblocks another task.

### Checkpoint (Partial Completion)

When a task produces intermediate artifacts that downstream tasks can consume early:

```
CHECKPOINT #N: {what was completed}, artifacts={file references}, ready_for=[task IDs]
```

Example:
```
CHECKPOINT #5: completed 50/100 tests, early findings: 3 failures in auth module, artifacts=.agent-team/{team}/test-results-partial.md, ready_for=[6]
```

**Lead processing**: If `ready_for` lists task IDs, message the dependent teammate with the checkpoint details. Log in `progress.md` Handoffs section.

### Priority Marking

Teammates can signal task urgency in STARTING and HANDOFF messages:

```
STARTING #N: priority={critical|high|normal|low}, {what I plan to do, which files I'll touch}
HANDOFF #N: priority={critical|high|normal|low}, {what I produced, key details}
```

Default is `normal` — omit the field for routine work. Use `critical` only when the task blocks multiple teammates or has a deadline.

**Lead processing**: Prioritize `critical` and `high` messages. For `critical` HANDOFF, forward immediately (don't batch).

## Reviewer/Auditor Findings Format

Use consistent severity labels with sequential numbering per severity within each task:
Expand All @@ -31,6 +79,12 @@ Use consistent severity labels with sequential numbering per severity within eac
- **M{n}** (medium — should fix): file:line, description
- **L{n}** (low — suggestion): file:line, description

**Optional confidence grade**: Append `[X%]` to any finding when confidence is meaningful:
- `H1[95%]: src/auth.py:15, SQL injection via unsanitized input, fix: use parameterized query`
- `M2[60%]: src/api.py:42, possible race condition under load`

Omit the grade when confidence is obviously high (most findings). Use it when a finding is uncertain or based on inference rather than direct evidence.

In COMPLETED messages, include total counts: "N issues: X high, Y medium, Z low"

## Tester Results Format
Expand Down
198 changes: 198 additions & 0 deletions docs/coordination-patterns.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,8 +22,13 @@ Patterns for the lead to handle common coordination scenarios during Phase 4.
- [Re-plan on Block](#re-plan-on-block) — revising the plan when a critical block invalidates it
- [Adversarial Review Rounds](#adversarial-review-rounds) — multi-round cross-review for critical changes
- [Quality Gate](#quality-gate) — final validation pass before synthesis
- [Checkpoint/Rollback](#checkpointrollback) — save and resume long-running tasks
- [Deadline Escalation](#deadline-escalation) — time-based proactive escalation
- [Circular Dependency Detection](#circular-dependency-detection) — prevent deadlocks in Phase 2
- [Graceful Degradation](#graceful-degradation) — scope reduction under resource pressure
- [Auto-Block on Repeated Failures](#auto-block-on-repeated-failures) — escalation after repeated failures
- [Direct Handoff](#direct-handoff) — authorized peer-to-peer messaging with audit trail
- [Anti-Pattern Catalog](#anti-pattern-catalog) — known coordination pitfalls to avoid

## Communication Protocol

Expand Down Expand Up @@ -201,6 +206,25 @@ When Teammate A produces output that Teammate B needs:

Do NOT have teammates message each other directly for handoffs unless they need a back-and-forth discussion. The lead summarizing and forwarding keeps coordination clean and maintains the workspace audit trail.

### Warm vs Cold Handoff

- **Warm handoff**: Lead forwards full context — what was done, why, key decisions, and specific next steps for the receiving teammate. Use when the handoff requires understanding of reasoning.
```
A finished task #3 (auth token refactor). Key changes:
- Moved token validation to src/auth/validate.ts
- New interface: TokenResult { valid: boolean, claims: Claims }
- Decision: used JWT over opaque tokens (see progress.md Decision Log)
You can now proceed with task #5 using the new TokenResult interface.
```

- **Cold handoff**: Lead forwards minimal context — just file paths and a pointer to workspace. Use when the receiving teammate only needs to know what files to read.
```
A finished task #3. Output files: src/auth/validate.ts, src/auth/types.ts.
Check workspace tasks.md for full details. Proceed with task #5.
```

**Default to warm handoffs** — the extra context costs little and prevents follow-up QUESTION messages. Use cold handoffs only when the downstream task is clearly independent (e.g., reviewer just needs to read files).

## Teammate Not Responding

If a teammate hasn't sent an update after an extended period:
Expand Down Expand Up @@ -382,6 +406,156 @@ Completion criteria: Build exits 0 with no errors.

Assign to the nearest available teammate (reviewer or tester preferred, implementer if no others are available).

## Checkpoint/Rollback

Save consistent state at natural breakpoints during long-running tasks. Enables recovery from mid-task failures without losing completed work.

### When to Use

- Tasks expected to take >10 minutes
- Multi-step migrations, large refactors, or batch operations
- Any task where partial failure is possible and rework is expensive

### Protocol

1. **Lead instructs** in spawn prompt: "For long tasks, send CHECKPOINT messages at natural breakpoints (after each module, after each migration step, etc.)"
2. **Teammate sends** CHECKPOINT at each breakpoint:
```
CHECKPOINT #N: {what was completed}, artifacts={file references}, ready_for=[task IDs]
```
3. **Lead logs** checkpoint in `progress.md` Decision Log: "Checkpoint: task #N at [milestone]"
4. **On failure**: Lead messages teammate with last checkpoint context:
```
Resume from checkpoint. Last known state:
- Completed: {checkpoint description}
- Artifacts: {file references}
- Remaining: {what's left to do}
```
5. **If teammate is unrecoverable**: spawn replacement with checkpoint context in prompt

### Workspace Integration

- Checkpoints are logged in `progress.md` Decision Log (not a separate file)
- Checkpoint artifacts live in the workspace directory: `.agent-team/{team}/checkpoint-{task-id}.md`
- On task completion, checkpoint artifacts can be cleaned up or kept for audit

### Key Rule

Checkpoints are lightweight — a one-line CHECKPOINT message, not a full state dump. The workspace files (`tasks.md`, `issues.md`) already track team-level state. Checkpoints track task-level progress within a single teammate's scope.

## Deadline Escalation

Proactive time-based escalation to prevent tasks from exceeding the user's time budget.

### When to Use

- User has an implicit or explicit time constraint
- A task has been in_progress for an extended period with no PROGRESS or COMPLETED message
- The team session is approaching context limits

### Protocol

1. **Lead tracks** estimated task duration in `progress.md`:
```
**Session started**: {timestamp}
```
2. **Lead proactively checks** tasks that have been in_progress without updates:
```
Status check on task #N — it's been [duration] since your last update.
What's your progress? Use PROGRESS or COMPLETED format.
If blocked, use BLOCKED so I can log and route it.
```
3. **Escalation ladder**:
- **Nudge** (first check): request status update
- **Warn** (second check, ~5 min later): "Task #N is at risk. Need status or BLOCKED report."
- **Escalate** (third check): mark task as at-risk in `tasks.md`, consider reassignment or scope reduction
4. **Scope reduction option**: if task is too large, lead proposes splitting:
```
Task #N is taking longer than expected. Options:
a) Continue (estimated X more minutes)
b) Split: complete [partial scope], defer [remaining scope] as follow-up
c) Reassign to [other teammate]
```

### Key Rule

Deadline escalation is proactive, not punitive. The goal is visibility — silent tasks are the biggest risk to team throughput. Combine with the PROGRESS message type for teammates to self-report before escalation triggers.

## Circular Dependency Detection

Validate task dependency graphs before execution to prevent silent deadlocks.

### When to Use

- Phase 2 plan has 4+ tasks with `blocked by` relationships
- Any time tasks form chains longer than 2 levels deep

### Protocol

1. **During Phase 2**: Before presenting the plan, trace all dependency chains:
- For each task with `blocked by`, follow the chain: A blocks B blocks C...
- If any chain leads back to a task already visited, there's a cycle
2. **On cycle detected**: Do NOT present the plan. Instead, restructure:
- Option A: Merge the cyclic tasks into one (assign to same teammate)
- Option B: Remove the weakest dependency (the one where the blocker could be worked around)
- Option C: Split one task to break the cycle (the blocking portion runs first)
3. **Log**: Record the detected cycle and resolution in `progress.md` Decision Log

### Example

```
Task #1: Set up database schema
Task #2: Write API endpoints (blocked by #1)
Task #3: Write migrations (blocked by #2)
Task #1 update: schema depends on migration format (blocked by #3) ← CYCLE

Resolution: Merge #1 and #3 into single task "Database schema + migrations"
```

### Prevention

The best prevention is Phase 1 decomposition by independent modules, not by sequential steps. If streams need constant handoffs, merge them.

## Graceful Degradation

Reduce scope rather than stopping when the team hits resource limits or unrecoverable blockers.

### When to Use

- Context window is running low (frequent compaction)
- Multiple teammates are blocked and remediation isn't viable
- User's time budget is exceeded but partial delivery has value

### Protocol

1. **Detect degradation trigger**:
- 2+ context compactions in short succession
- 3+ teammates blocked simultaneously
- Lead judges that full scope cannot be completed
2. **Assess salvageable work**: read `tasks.md` — which tasks are COMPLETED? What partial value exists?
3. **Present scope reduction to user**:
```
Scope reduction needed: [trigger reason]

Completed work (will be preserved):
- [task IDs and summaries]

Work to defer (will be logged as follow-up):
- [task IDs and summaries]

Approve reduced scope?
```
4. **If approved**:
- Mark deferred tasks as `deferred` in `tasks.md`
- Shut down teammates working on deferred tasks
- Continue to Phase 5 with completed work only
- Include deferred items in report's Follow-up section
5. **Log**: Record scope reduction decision in `progress.md` Decision Log

### Key Rule

Graceful degradation is a controlled retreat, not a failure. The user gets partial value immediately and a clear list of what remains. This is always better than a team that burns context trying to finish everything and produces nothing.

## Auto-Block on Repeated Failures

Prevents teammates from spinning on the same error. Escalates automatically after repeated failures.
Expand Down Expand Up @@ -433,3 +607,27 @@ For pre-approved information transfers between specific teammates, bypassing the
### Key Rule

The audit trail MUST be maintained. Direct handoffs save time but must still be logged via the lead's workspace updates.

## Anti-Pattern Catalog

Known coordination anti-patterns to avoid. These emerge from research into multi-agent systems (CrewAI, AutoGen, LangGraph, MetaGPT) and distributed systems theory.

### Critical (Prevent by Design)

**Circular Wait Deadlock**: Tasks A→B→C→A where each blocks the next. Prevention: validate dependency DAG in Phase 2 (see [Circular Dependency Detection](#circular-dependency-detection)).

**Race Condition on Shared State**: Two teammates simultaneously edit the same file; last write wins. Prevention: 1:1 file ownership mapping in Phase 2 + PreToolUse hook enforcement.

**Context Overflow Cascade**: Workspace grows unbounded; teammates can't read full context; compaction fires repeatedly. Prevention: batch workspace updates, keep workspace files concise, use [Graceful Degradation](#graceful-degradation) when compaction frequency increases.

**Infinite Re-Debate Loop**: Two teammates keep revisiting a completed decision. Prevention: once a task is COMPLETED, no further work on it unless explicitly reassigned by the lead. Log decisions in `progress.md` Decision Log as the authoritative record.

### Warning (Monitor and Mitigate)

**Silent Failure**: Teammate completes but sends no message — task appears blocked but is actually done. Mitigation: First Contact Verification + proactive check-ins. If idle 2+ cycles without any message, investigate.

**Scope Explosion**: Team grows beyond lead's effective span of control (>6 agents). Mitigation: enforce team size limits in Phase 3; for >6, use hierarchical sub-leads or phased execution.

**Single Point of Failure**: All work depends on one teammate; if they fail, the whole team stalls. Mitigation: avoid assigning >50% of tasks to any single teammate. For critical paths, ensure another teammate can take over.

**Byzantine Output**: Teammate reports task complete but output is incorrect or hallucinated. Mitigation: Adversarial Review Rounds for critical tasks; verify file changes actually exist before marking tasks complete (TaskCompleted hook already does this for implementers).
Loading