feat(workflow): add DAG orchestration engine for multi-agent workflows#73
Open
Ernest-Wu wants to merge 7 commits into HKUDS:main from
Conversation
Add a complete workflow DAG engine that enables multi-agent orchestration with dependency-aware, parallel-capable node execution.

Problem:
- OpenHarness only supports single agent loops with simple subagent spawning
- No structured way to define complex multi-step workflows with dependencies
- Missing automatic retry, failure propagation, and execution observability

Changes:
- Workflow DAG scheduler with topological sort and layered parallel execution
- YAML workflow definition parser with variable interpolation
- Node execution engine integrating with the existing Agent Loop
- Automatic retry with exponential backoff and jitter
- Failure propagation: downstream nodes skipped when upstream fails
- Execution tracing with JSON, Graphviz DOT, and HTML report exports
- 3 built-in templates: refactor, feature-dev, test-and-docs
- CLI commands: oh workflow list|show|run|export
- 40 unit/integration tests covering types, scheduler, parser, and tracing

Verification:
- uv run ruff check src tests scripts (all passed)
- uv run pytest -q (546 passed, 6 skipped, 1 xfailed)
- All new code follows existing project conventions and typing style

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Add end-to-end test suite following harness-eval skill patterns:
- Uses real MiniMax M2.7 API calls (no mocks)
- Tests on unfamiliar AutoAgent workspace
- 4 test scenarios: basic execution, parallel nodes, dependency chains, failure propagation
- Validates actual tool execution and result collection

Also fixes:
- parser.py: correctly distinguish YAML content from file paths
- executor.py: properly collect AssistantTextDelta events for output
- executor.py: avoid deepcopy of QueryContext (unpickleable HTTP clients)

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Remove unused imports (ToolRegistry, tempfile)
- Fix f-string warnings without placeholders
- Mark E2E tests with @pytest.mark.skip (require real API key)
- E2E tests should be run directly with python, not via pytest

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Fix base_url construction for OpenAI-compatible client (ensure /v1 suffix)
- Use MiniMax-M2.5 model (M2.7 returns 529 overloaded errors)
- Add ErrorEvent handling in node executor
- Clean up debug logging and temporary test files
- All 4 E2E scenarios now pass with real API calls:
  - basic_execution: 3414 chars output, glob/grep/bash tools called
  - parallel_execution: 2 nodes run concurrently
  - dependency_chain: upstream results flow to downstream nodes
  - failure_propagation: error handling works correctly

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- MiniMax-M2.7 API is working (earlier 529 was temporary overload)
- Rewrite E2E test file to clean structure
- All 4 E2E scenarios verified with M2.7:
  - basic_execution: 3414 chars output
  - parallel_execution: 8.6s concurrent
  - dependency_chain: 118s with context passing
  - failure_propagation: proper error handling

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Problem
OpenHarness currently only supports single agent loops with simple subagent spawning. There is no structured way to define complex multi-step workflows with dependencies, automatic retry, failure propagation, or execution observability.
Solution
Add a complete Workflow DAG Engine that enables multi-agent orchestration with:

- Dependency-aware scheduling: topological sort with layered parallel execution
- YAML workflow definitions with variable interpolation
- Node execution integrated with the existing Agent Loop
- Automatic retry with exponential backoff and jitter
- Failure propagation: downstream nodes are skipped when an upstream node fails
- Execution tracing with JSON, Graphviz DOT, and HTML report exports
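The scheduling approach (topological sort with layered parallel execution) can be sketched roughly as follows. This is an illustrative minimal version, not the engine's actual code (which lives in `src/openharness/workflow/`); names like `schedule_layers` are assumptions:

```python
# Sketch: Kahn's algorithm batched into layers. Nodes within a layer
# have no dependencies on one another and can run concurrently.
from collections import defaultdict


def schedule_layers(edges: dict[str, list[str]]) -> list[list[str]]:
    """Group DAG nodes into layers of parallel-executable nodes.

    `edges` maps each node to the list of upstream nodes it depends on.
    """
    indegree = {node: len(deps) for node, deps in edges.items()}
    dependents: dict[str, list[str]] = defaultdict(list)
    for node, deps in edges.items():
        for dep in deps:
            dependents[dep].append(node)

    layers: list[list[str]] = []
    ready = sorted(n for n, d in indegree.items() if d == 0)
    while ready:
        layers.append(ready)
        nxt = []
        for node in ready:
            for child in dependents[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    nxt.append(child)
        ready = sorted(nxt)

    # If any node never reached indegree 0, the graph has a cycle.
    if sum(len(layer) for layer in layers) != len(edges):
        raise ValueError("cycle detected in workflow DAG")
    return layers
```

The executor can then run each layer's nodes concurrently and wait for the layer to finish before starting the next one.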
Built-in Templates
- refactor — Code analysis → Refactor → Review
- feature-dev — Planning → Implementation → Tests → Docs
- test-and-docs — Run tests → Fix failures → Verify → Update docs

Changes
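The automatic retry policy described in this PR (exponential backoff with jitter) can be sketched as below. Function and parameter names are illustrative assumptions, not the engine's actual API:

```python
# Sketch: retry a node's work with exponential backoff and full jitter.
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(fn: Callable[[], T], *, max_attempts: int = 3,
                   base_delay: float = 1.0, max_delay: float = 30.0) -> T:
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: propagate the failure
            # Exponential backoff: base * 2^(attempt-1), capped at
            # max_delay, with full jitter to avoid retry stampedes
            # when several nodes fail at once.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            time.sleep(random.uniform(0, delay))
    raise AssertionError("unreachable")
```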
New Files (17)
- src/openharness/workflow/ — Core engine (types, scheduler, executor, parser, trace, recovery)
- src/openharness/commands/workflow.py — CLI commands
- src/openharness/workflow/templates/*.yaml — 3 built-in templates
- tests/test_workflow/ — 40 unit tests + 4 E2E tests (real MiniMax-M2.7 API)
- docs/workflow-engine.md — Complete usage documentation

Modified Files (3)
- src/openharness/cli.py — Added workflow subcommand registration
- src/openharness/workflow/executor.py — ErrorEvent handling, restricted tool context
- CHANGELOG.md — Added entry under Unreleased

Verification
- uv run ruff check src tests scripts — All checks passed
- uv run pytest -q — 546 passed, 10 skipped, 1 xfailed
- basic_execution: 3414 chars output, glob/grep/bash tools called
- parallel_execution: 2 nodes run concurrently in 8.6s
- dependency_chain: upstream results flow to downstream nodes
- failure_propagation: error handling works correctly

CLI Usage
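The subcommands below are the ones named in this PR (`oh workflow list|show|run|export`); the arguments shown are assumptions for illustration, not the definitive interface — see docs/workflow-engine.md for the actual usage:

```shell
oh workflow list                # list available workflow templates
oh workflow show refactor       # inspect a template's DAG
oh workflow run refactor        # execute a workflow
oh workflow export refactor     # export the execution trace (JSON / DOT / HTML)
```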