
feat(workflow): add DAG orchestration engine for multi-agent workflows #73

Open
Ernest-Wu wants to merge 7 commits into HKUDS:main from Ernest-Wu:feature/workflow-dag-engine

Conversation

@Ernest-Wu

Problem

OpenHarness currently supports only single-agent loops with simple subagent spawning. There is no structured way to define complex multi-step workflows with dependencies, automatic retries, failure propagation, or execution observability.

Solution

Add a complete Workflow DAG Engine that enables multi-agent orchestration with:

  • Dependency-aware execution: Nodes run only when all dependencies complete
  • Parallel execution: Independent nodes run concurrently via topological layering
  • Automatic retry: Exponential backoff with jitter (configurable per node)
  • Failure propagation: Downstream nodes auto-skipped when upstream fails
  • Observability: JSON, Graphviz DOT, and HTML report exports
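The dependency-aware, layered scheduling described above can be sketched as follows. This is a minimal illustration of the topological-layering idea, not the engine's actual API; the function name `layer_nodes` and the dict-of-dependencies input shape are hypothetical:

```python
from collections import defaultdict, deque

def layer_nodes(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group DAG nodes into layers (Kahn-style). Nodes in the same
    layer have no dependency path between them and can run concurrently."""
    indegree = {node: len(parents) for node, parents in deps.items()}
    children = defaultdict(list)
    for node, parents in deps.items():
        for parent in parents:
            children[parent].append(node)
    ready = deque(node for node, d in indegree.items() if d == 0)
    layers: list[list[str]] = []
    while ready:
        layer = sorted(ready)  # everything currently unblocked runs together
        ready.clear()
        layers.append(layer)
        for node in layer:
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    if sum(map(len, layers)) != len(deps):
        raise ValueError("cycle detected in workflow DAG")
    return layers

# analyze and lint are independent; refactor waits on analyze; review waits on both paths
deps = {"analyze": set(), "lint": set(), "refactor": {"analyze"}, "review": {"refactor", "lint"}}
print(layer_nodes(deps))  # [['analyze', 'lint'], ['refactor'], ['review']]
```

Failure propagation falls out of the same structure: when a node in one layer fails, its transitive children (reachable via `children`) can be marked skipped instead of scheduled.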

Built-in Templates

  1. refactor — Code analysis → Refactor → Review
  2. feature-dev — Planning → Implementation → Tests → Docs
  3. test-and-docs — Run tests → Fix failures → Verify → Update docs
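A template of the kind listed above might look roughly like this. This is a hypothetical sketch of the YAML schema; field names such as `depends_on` and `max_retries`, and the `{{ var }}` interpolation style, are illustrative rather than the engine's confirmed format:

```yaml
# refactor.yaml — illustrative only; the shipped template schema may differ
name: refactor
variables:
  target_path: src/main.py
nodes:
  - id: analyze
    prompt: "Analyze {{ target_path }} and list refactoring opportunities."
  - id: refactor
    depends_on: [analyze]
    max_retries: 2
    prompt: "Apply the refactorings identified upstream to {{ target_path }}."
  - id: review
    depends_on: [refactor]
    prompt: "Review the refactored code for regressions."
```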

Changes

New Files (17)

  • src/openharness/workflow/ — Core engine (types, scheduler, executor, parser, trace, recovery)
  • src/openharness/commands/workflow.py — CLI commands
  • src/openharness/workflow/templates/*.yaml — 3 built-in templates
  • tests/test_workflow/ — 40 unit tests + 4 E2E tests (real MiniMax-M2.7 API)
  • docs/workflow-engine.md — Complete usage documentation

Modified Files (3)

  • src/openharness/cli.py — Added workflow subcommand registration
  • src/openharness/workflow/executor.py — ErrorEvent handling, restricted tool context
  • CHANGELOG.md — Added entry under Unreleased

Verification

  • uv run ruff check src tests scripts — All checks passed
  • uv run pytest -q — 546 passed, 10 skipped, 1 xfailed
  • E2E tests with real MiniMax-M2.7 API on AutoAgent workspace:
    • basic_execution: 3414 chars output, glob/grep/bash tools called
    • parallel_execution: 2 nodes run concurrently in 8.6s
    • dependency_chain: upstream results flow to downstream nodes
    • failure_propagation: error handling works correctly

CLI Usage

# List available templates
oh workflow list

# Show template details
oh workflow show refactor

# Run a workflow (dry run)
oh workflow run refactor -v target_path=src/main.py --dry-run

# Export workflow structure
oh workflow export refactor -f json
oh workflow export refactor -f html -o report.html
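The per-node retry behavior called out in the Solution section (exponential backoff with jitter) could look roughly like this. A minimal sketch only; `RetryPolicy`, `backoff_delay`, and `run_with_retry` are hypothetical names, not the engine's API:

```python
import random
import time
from dataclasses import dataclass

@dataclass
class RetryPolicy:
    max_retries: int = 3     # retry attempts after the initial failure
    base_delay: float = 1.0  # seconds before the first retry
    max_delay: float = 30.0  # cap on any single backoff interval

def backoff_delay(policy: RetryPolicy, attempt: int) -> float:
    """Exponential backoff (base * 2^attempt), capped, with full jitter."""
    capped = min(policy.max_delay, policy.base_delay * (2 ** attempt))
    return random.uniform(0.0, capped)

def run_with_retry(fn, policy: RetryPolicy):
    """Call fn, sleeping a jittered backoff between failures; re-raise on exhaustion."""
    for attempt in range(policy.max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == policy.max_retries:
                raise
            time.sleep(backoff_delay(policy, attempt))
```

Full jitter (a uniform draw over the whole capped interval) spreads concurrent retries apart, which matters when several parallel nodes hit the same overloaded upstream API, such as the 529 responses mentioned in the commit history.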

Ernest-Wu and others added 7 commits April 8, 2026 11:09
Add a complete workflow DAG engine that enables multi-agent orchestration
with dependency-aware, parallel-capable node execution.

Problem:
- OpenHarness only supports single agent loops with simple subagent spawning
- No structured way to define complex multi-step workflows with dependencies
- Missing automatic retry, failure propagation, and execution observability

Changes:
- Workflow DAG scheduler with topological sort and layered parallel execution
- YAML workflow definition parser with variable interpolation
- Node execution engine integrating with the existing Agent Loop
- Automatic retry with exponential backoff and jitter
- Failure propagation: downstream nodes skipped when upstream fails
- Execution tracing with JSON, Graphviz DOT, and HTML report exports
- 3 built-in templates: refactor, feature-dev, test-and-docs
- CLI commands: oh workflow list|show|run|export
- 40 unit/integration tests covering types, scheduler, parser, and tracing

Verification:
- uv run ruff check src tests scripts (all passed)
- uv run pytest -q (546 passed, 6 skipped, 1 xfailed)
- All new code follows existing project conventions and typing style

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Add end-to-end test suite following harness-eval skill patterns:
- Uses real MiniMax M2.7 API calls (no mocks)
- Tests on unfamiliar AutoAgent workspace
- 4 test scenarios: basic execution, parallel nodes, dependency chains, failure propagation
- Validates actual tool execution and result collection

Also fixes:
- parser.py: correctly distinguish YAML content from file paths
- executor.py: properly collect AssistantTextDelta events for output
- executor.py: avoid deepcopy of QueryContext (unpickleable HTTP clients)

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Remove unused imports (ToolRegistry, tempfile)
- Fix f-string warnings without placeholders
- Mark E2E tests with @pytest.mark.skip (require real API key)
- E2E tests should be run directly with python, not via pytest

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Fix base_url construction for OpenAI-compatible client (ensure /v1 suffix)
- Use MiniMax-M2.5 model (M2.7 returns 529 overloaded errors)
- Add ErrorEvent handling in node executor
- Clean up debug logging and temporary test files
- All 4 E2E scenarios now pass with real API calls:
  - basic_execution: 3414 chars output, glob/grep/bash tools called
  - parallel_execution: 2 nodes run concurrently
  - dependency_chain: upstream results flow to downstream nodes
  - failure_propagation: error handling works correctly

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- MiniMax-M2.7 API is working (earlier 529 was temporary overload)
- Rewrite E2E test file to clean structure
- All 4 E2E scenarios verified with M2.7:
  - basic_execution: 3414 chars output
  - parallel_execution: 8.6s concurrent
  - dependency_chain: 118s with context passing
  - failure_propagation: proper error handling

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
