Open
25 commits
9ec7ee6
feat: cross-engine team orchestration with DAG execution and role system
realDuang Apr 16, 2026
aa37b21
chore: bump @github/copilot-sdk to 0.2.2 and default theme to dark
realDuang Apr 17, 2026
0a3885c
refactor(orchestration): simplify plan card and fix subtask editor co…
realDuang Apr 17, 2026
11df2db
feat(orchestration): backend persistence, team worktree, and auto-det…
realDuang Apr 17, 2026
a14ff5c
feat(ui): team orchestration settings, sidebar, and dashboard view
realDuang Apr 17, 2026
fd00719
feat: add Agent Team orchestration (Light Brain & Heavy Brain)
FridayLiu Apr 17, 2026
db8b931
Merge remote-tracking branch 'origin/main' into codemux/agentteam
FridayLiu Apr 17, 2026
d2461ff
feat: add agent team orchestration and relay
FridayLiu Apr 18, 2026
d9dc8fc
Harden agent-team orchestration
FridayLiu Apr 19, 2026
49441b2
Add headless server restart command
FridayLiu Apr 19, 2026
ad6016b
Fix team run UI follow-ups
FridayLiu Apr 19, 2026
f284648
Preserve team run worktree context
FridayLiu Apr 19, 2026
b9dc0cd
Merge branch 'fridayliu/feat/agentteam' into feat/agent-team-merged
FridayLiu Apr 20, 2026
f43aafb
feat(agent-team): Phase 1 — unified types + plan-confirm / role-mappi…
FridayLiu Apr 20, 2026
9e1c163
feat(agent-team): Phase 2 — role resolution + plan confirmation wiring
FridayLiu Apr 20, 2026
7604bc2
feat(agent-team): Phase 3a — extract TeamRunCard + wire plan-confirm UI
FridayLiu Apr 20, 2026
98033a5
test(agent-team): cover Phase 2 role resolution + plan-confirm paths
FridayLiu Apr 20, 2026
f1d6975
docs(agent-team): describe merged Light/Heavy + plan-confirm + role d…
FridayLiu Apr 20, 2026
1ad949d
feat(agent-team): relay aggregated result to parent session on comple…
FridayLiu Apr 20, 2026
dd465a1
feat(agent-team): route write tasks through teamWorktreeDir in DAGExe…
FridayLiu Apr 20, 2026
00496cc
feat(agent-team): sync Settings role mappings to AgentTeamService
FridayLiu Apr 20, 2026
0c0e853
feat(agent-team): group AgentTeamService runs in sidebar via teamId b…
FridayLiu Apr 20, 2026
637e1a3
docs(agent-team): note worktree routing, result aggregation, sidebar …
FridayLiu Apr 20, 2026
e8c6add
refactor(orchestration): rename Team*/AgentTeam* to Orchestration*
xinyul Apr 21, 2026
a36023f
fix(orchestration): wire Heavy Brain mode, fix project grouping and r…
FridayLiu Apr 27, 2026
2 changes: 1 addition & 1 deletion README.ja.md
@@ -100,7 +100,7 @@ CodeMux はチャットにとどまりません — 開発ワークフローを

- **LAN**: IPアドレスの自動検出 + QRコードで、数秒で準備完了
- **パブリックインターネット**: ワンクリックで [Cloudflare Tunnel](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/) — ポート転送、VPN、ファイアウォール変更は一切不要。**クイックトンネル**(ランダムな一時URL、設定不要)と**ネームドトンネル**(`~/.cloudflared/` 認証情報による永続カスタムドメイン)の両方をサポート
- **セキュリティ内蔵**: デバイス認証、JWT トークン、Cloudflare 経由のHTTPS; クイックトンネルURLは再起動ごとにローテーション、ネームドトンネルはカスタムホスト名を維持
- **セキュリティ内蔵**: デバイス認証、JWT トークン、Cloudflare 経由のHTTPS; クイックトンネルURLはトンネル自体を作り直したときにローテーションし、ネームドトンネルはカスタムホスト名を維持

#### IM ボットチャネル

2 changes: 1 addition & 1 deletion README.ko.md
@@ -100,7 +100,7 @@ CodeMux는 채팅을 넘어 — 개발 워크플로를 인터페이스에서 직

- **LAN**: 자동 감지된 IP + QR 코드, 수초 내 준비 완료
- **공용 인터넷**: 원클릭 [Cloudflare Tunnel](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/) — 포트 포워딩, VPN, 방화벽 변경 불필요. **퀵 터널**(랜덤 임시 URL, 제로 설정)과 **네임드 터널**(`~/.cloudflared/` 인증 정보를 통한 영구 커스텀 도메인) 모두 지원
- **내장 보안**: 기기 인증, JWT 토큰, Cloudflare를 통한 HTTPS; 퀵 터널 URL은 재시작마다 변경, 네임드 터널은 커스텀 호스트명 유지
- **내장 보안**: 기기 인증, JWT 토큰, Cloudflare를 통한 HTTPS; 퀵 터널 URL은 터널 자체를 다시 만들 때 변경되며, 네임드 터널은 커스텀 호스트명을 유지합니다

#### IM 봇 채널

6 changes: 5 additions & 1 deletion README.md
@@ -100,7 +100,7 @@ Access your coding agents from any device — phone, tablet, or another machine

- **LAN**: Auto-detected IP + QR code, ready in seconds
- **Public Internet**: One-click [Cloudflare Tunnel](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/) — no port forwarding, no VPN, no firewall changes. Supports both **quick tunnels** (random ephemeral URL, zero config) and **named tunnels** (persistent custom domain via `~/.cloudflared/` credentials)
- **Security built-in**: Device authorization, JWT tokens, HTTPS via Cloudflare; quick tunnel URLs rotate on every restart, named tunnels preserve your custom hostname
- **Security built-in**: Device authorization, JWT tokens, HTTPS via Cloudflare; quick tunnel URLs rotate whenever the tunnel itself is recreated, while named tunnels preserve your custom hostname

#### IM Bot Channels

@@ -201,6 +201,7 @@ bun run server:dev
# Run the same headless dev stack in the background
bun run server:up
bun run server:status
bun run server:restart
bun run server:down

# Run the headless dev stack and start a Cloudflare quick tunnel
@@ -215,6 +216,8 @@ bun run server:access-requests

`bun run start` is still the lightest option for a web-only standalone server. The desktop app's "Public Access" toggle manages Cloudflare inside the packaged app; on a headless dev server, `bun run server:tunnel` provides the equivalent quick-tunnel workflow from the shell.

If you want to restart CodeMux itself without rotating the current quick-tunnel URL, use `bun run server:restart`. It restarts the managed app process and keeps the existing `cloudflared` process alive whenever possible, so remote browsers can usually stay on the same public origin.


`bun run server:tunnel` now prints the access code after startup. When a remote browser submits that code, you can stay entirely in SSH and run `bun run server:access-requests` to review and interactively approve or deny pending requests. If you started CodeMux with `bun run server:dev`, open a second SSH session and run `bun run server:access-code` / `bun run server:access-requests`.

@@ -301,6 +304,7 @@ bun run dev # Electron + Vite HMR
bun run server:dev # Foreground headless Electron dev
bun run server:up # Background headless Electron dev
bun run server:tunnel # Background headless Electron dev + quick tunnel
bun run server:restart # Restart app only; preserve managed quick tunnel when possible
bun run server:access-code # Print the current 6-digit access code
bun run server:access-requests # Interactively review pending remote access requests
bun run server:down # Stop background headless Electron dev
2 changes: 1 addition & 1 deletion README.ru.md
@@ -100,7 +100,7 @@ CodeMux выходит за рамки чата — предоставляет

- **LAN**: Автоматически определённый IP + QR-код, готово за секунды
- **Публичный интернет**: Одним кликом [Cloudflare Tunnel](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/) — без проброса портов, VPN и изменений файрвола. Поддерживаются как **быстрые туннели** (случайный временный URL, без настройки), так и **именованные туннели** (постоянный пользовательский домен через учётные данные `~/.cloudflared/`)
- **Встроенная безопасность**: Авторизация устройств, JWT-токены, HTTPS через Cloudflare; URL быстрых туннелей меняются при каждом перезапуске, именованные туннели сохраняют ваш домен
- **Встроенная безопасность**: Авторизация устройств, JWT-токены, HTTPS через Cloudflare; URL быстрых туннелей меняются при пересоздании самого туннеля, а именованные туннели сохраняют ваш домен

#### Каналы IM-ботов

2 changes: 1 addition & 1 deletion README.zh-CN.md
@@ -100,7 +100,7 @@ CodeMux 不只是聊天 —— 它提供集成工具,让你直接在界面中

- **局域网**:自动检测 IP + 二维码,几秒内即可就绪
- **公网**:一键 [Cloudflare Tunnel](https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/) —— 无需端口转发、无需 VPN、无需防火墙更改。支持**快速隧道**(随机临时 URL,零配置)和**命名隧道**(通过 `~/.cloudflared/` 凭证持久化自定义域名)
- **内置安全机制**:设备授权、JWT 令牌、通过 Cloudflare 的 HTTPS;快速隧道 URL 每次重启时轮换,命名隧道保留你的自定义主机名
- **内置安全机制**:设备授权、JWT 令牌、通过 Cloudflare 的 HTTPS;快速隧道 URL 会在隧道本身被重建时轮换,命名隧道保留你的自定义主机名

#### IM 机器人渠道

18 changes: 9 additions & 9 deletions bun.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

160 changes: 160 additions & 0 deletions docs/orchestration.md
@@ -0,0 +1,160 @@
# Orchestration

The Orchestration system is a multi-agent layer that lets a single user prompt fan
out into a DAG of subtasks executed by one or more engine sessions. It fuses two
designs:

1. **fridayliu/feat/agentteam** — Light/Heavy brain split with a DAG executor,
guardrails, and a user-channel relay for human-in-the-loop Heavy Brain runs.
2. **PR #117 (realDuang/feat/agent-team)** — role-based orchestrator with
plan-confirmation UI, role→engine mapping settings, and team worktrees.

Both histories are preserved via a merge commit; the merged system keeps
fridayliu's Light/Heavy brain core and absorbs PR #117's plan-confirm, role, and
team-worktree concepts.

## Two Brains

| Brain | Entry file | Behavior |
| ------ | -------------------------------------------- | ------------------------------------------------------------------------ |
| Light | `electron/main/services/orchestration/light-brain.ts` | One-shot planner: asks an engine to produce a DAG, then executes it. |
| Heavy | `electron/main/services/orchestration/heavy-brain.ts` | Persistent orchestrator: drives dispatch iteratively; supports UserChannel relays for clarification from the human. |

Both brains share:

- **DAG executor** (`dag-executor.ts`) — honors `dependsOn`, runs tasks in
parallel up to a concurrency cap.
- **TaskExecutor** (`task-executor.ts`) — runs a single task inside a dedicated
engine session, retries once on transient failure, and now calls an optional
`RoleResolver` to map `task.role` → `{engineType, modelId}` before dispatch.
- **Guardrails** (`guardrails.ts`) — rate limits, loop detection, max turns.
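
The scheduling rule that both brains rely on can be sketched as a pure selection function. This is a minimal illustration, not the code in `dag-executor.ts`: the `SketchTask` shape, the `nextRunnable` name, and the `cap` parameter are assumptions, and only the `dependsOn` field comes from the description above.

```typescript
// Hypothetical task shape; only `dependsOn` is named in this document.
interface SketchTask {
  id: string;
  dependsOn: string[];
}

// Pick the tasks that may start now: not finished, not already running,
// every dependency finished, and no more than the free concurrency slots.
function nextRunnable(
  tasks: SketchTask[],
  done: ReadonlySet<string>,
  running: ReadonlySet<string>,
  cap: number,
): SketchTask[] {
  const freeSlots = Math.max(0, cap - running.size);
  return tasks
    .filter((t) => !done.has(t.id) && !running.has(t.id))
    .filter((t) => t.dependsOn.every((dep) => done.has(dep)))
    .slice(0, freeSlots);
}

const sketchTasks: SketchTask[] = [
  { id: "explore", dependsOn: [] },
  { id: "research", dependsOn: [] },
  { id: "design", dependsOn: ["explore", "research"] },
  { id: "code", dependsOn: ["design"] },
];

// With nothing finished and a cap of 2, only the two root tasks are eligible.
const firstWave = nextRunnable(sketchTasks, new Set<string>(), new Set<string>(), 2)
  .map((t) => t.id);

// Once both roots are done, "design" unlocks; "code" still waits on it.
const secondWave = nextRunnable(
  sketchTasks,
  new Set<string>(["explore", "research"]),
  new Set<string>(),
  2,
).map((t) => t.id);
```

An executor built on this would call the selector each time a task finishes, which keeps dependency checks and the concurrency cap in one place.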

## Plan Confirmation

The Light Brain pauses between planning and execution when
`orchestrationRun.requirePlanConfirmation` is true (on by default for Light, off for Heavy).

Flow:

1. Light Brain generates a DAG and sets `orchestrationRun.status = "awaiting-confirmation"`.
2. Gateway emits `orchestration.updated`; UI shows the plan in `OrchestrationCards` with a
"Confirm & execute" button.
3. User inspects/edits tasks in `OrchestrationCards`, then fires
`gateway.confirmTeamPlan(runId, tasks)`.
4. The service's `confirmPlan(runId, tasks)` resolves the pending gate;
Light Brain resumes with the user-edited DAG.

Cancelling the run rejects the pending gate and marks the run as failed.

Gateway key: `TEAM_CONFIRM_PLAN`. Relevant types live in `src/types/unified.ts`.
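
The pause/resume behavior in the flow above can be modeled as a pending promise per run. The sketch below is an assumption about the shape of such a gate, not the service's actual internals: `PlanGate`, `wait`, `confirm`, and `cancel` are hypothetical names chosen for illustration.

```typescript
// Hypothetical gate; the real service's internals are not shown in this document.
type PendingGate<T> = {
  resolve: (value: T) => void;
  reject: (reason: Error) => void;
};

class PlanGate<T> {
  private readonly pending = new Map<string, PendingGate<T>>();

  // The planner awaits this after marking the run "awaiting-confirmation".
  wait(runId: string): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.pending.set(runId, { resolve, reject });
    });
  }

  // User confirmed: hand the (possibly edited) plan back to the planner.
  // Returns false when no gate is waiting for this run.
  confirm(runId: string, editedPlan: T): boolean {
    const entry = this.pending.get(runId);
    if (!entry) return false;
    this.pending.delete(runId);
    entry.resolve(editedPlan);
    return true;
  }

  // Run cancelled: the awaiting planner observes a rejection.
  cancel(runId: string, reason: string): boolean {
    const entry = this.pending.get(runId);
    if (!entry) return false;
    this.pending.delete(runId);
    entry.reject(new Error(reason));
    return true;
  }
}

const gate = new PlanGate<string[]>();
const resumedPlan: Promise<string[]> = gate.wait("run-1");
const confirmed = gate.confirm("run-1", ["explore", "code"]);
// A second confirmation finds no pending gate and is ignored.
const confirmedTwice = gate.confirm("run-1", ["explore", "code"]);
```

Deleting the entry before settling the promise means a duplicate confirm or a late cancel is a no-op rather than an error.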

## Role Resolution

Tasks can declare a semantic role instead of a concrete engine:

```ts
{ role: "explorer", ... } // read-only investigation, prefer fast engine
{ role: "coder", ... } // read/write, prefer capable engine
```

Built-in roles (see `DEFAULT_ROLE_MAPPINGS` in
`electron/main/services/orchestration/index.ts`):

| role | read-only | intended use |
| ---------- | --------- | ------------------------------------ |
| explorer | yes | codebase reconnaissance |
| researcher | yes | external docs / web research |
| reviewer | yes | code review pass |
| designer | no | design/architecture drafts |
| coder | no | implementation |

Mappings are persisted in `settings.json` under `team.roleMappings`. They can be
overridden at runtime via `OrchestrationService.updateRoleMappings()` / gateway
`TEAM_UPDATE_ROLE_MAPPINGS`.

Resolution order in `TaskExecutor`:

1. If `task.engineType` is explicitly set → use it verbatim.
2. Else if `task.role` is set and `resolveRole` returns a mapping → use it.
3. Else fall back to the run's `defaultEngineType`.
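
The three-step order above can be written as a small function. This is a sketch under stated assumptions rather than the code in `task-executor.ts`: `resolveEngine`, the `RoleTask` shape, and the engine and model names (`engine-fast`, `model-small`, and so on) are placeholders invented for the example.

```typescript
interface EngineChoice {
  engineType: string;
  modelId: string | undefined;
}

// Hypothetical subset of a task; field names follow the prose above.
interface RoleTask {
  engineType?: string;
  modelId?: string;
  role?: string;
}

type RoleResolver = (role: string) => EngineChoice | undefined;

function resolveEngine(
  task: RoleTask,
  resolveRole: RoleResolver | undefined,
  defaultEngineType: string,
): EngineChoice {
  // 1. An explicit engine on the task always wins.
  if (task.engineType) {
    return { engineType: task.engineType, modelId: task.modelId };
  }
  // 2. Otherwise try the role mapping, if a resolver was supplied.
  if (task.role && resolveRole) {
    const mapping = resolveRole(task.role);
    if (mapping) return mapping;
  }
  // 3. Fall back to the run-level default.
  return { engineType: defaultEngineType, modelId: undefined };
}

// Placeholder mappings; real values would come from settings.
const sampleMappings: Record<string, EngineChoice> = {
  explorer: { engineType: "engine-fast", modelId: "model-small" },
  coder: { engineType: "engine-capable", modelId: "model-large" },
};
const sampleResolver: RoleResolver = (role) => sampleMappings[role];

const explicitChoice = resolveEngine({ engineType: "engine-x", role: "coder" }, sampleResolver, "engine-default");
const roleChoice = resolveEngine({ role: "explorer" }, sampleResolver, "engine-default");
const fallbackChoice = resolveEngine({ role: "unmapped" }, sampleResolver, "engine-default");
```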

## Team Worktree

Read/write tasks share a single git worktree so successive tasks see each
other's edits. `OrchestrationRun.teamWorktreeName` / `teamWorktreeDir` carry the
shared worktree through the run, and `DAGExecutor.runSingleTask`
routes each task based on its read/write intent:

- **Write-capable task** (`needsWorktree !== false`, or role mapping
has `readOnly: false`) → runs with `directory = teamWorktreeDir`
and `defaultWorktreeId = teamWorktreeName`, so all writers share a
single worktree.
- **Read-only task** (`needsWorktree === false`, or role mapping has
`readOnly: true`) → runs in the run's primary directory, avoiding
contention with writers.

When `teamWorktreeDir` is unset, all tasks use the run's directory
and existing `run.worktreeId`.
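
The routing rules can be sketched as follows. The text lists two signals (`needsWorktree` and the role mapping's `readOnly`) without stating which wins when they disagree, so this example assumes an explicit `needsWorktree` takes precedence and that a task with neither signal is treated as a writer. The `routeTask` helper, the `roleReadOnly` field, and the example paths are hypothetical.

```typescript
interface RunContext {
  directory: string;
  worktreeId?: string;
  teamWorktreeDir?: string;
  teamWorktreeName?: string;
}

interface RoutableTask {
  needsWorktree?: boolean;
  roleReadOnly?: boolean; // hypothetical: `readOnly` from the resolved role mapping
}

interface Route {
  directory: string;
  defaultWorktreeId: string | undefined;
}

// Assumption: an explicit needsWorktree overrides the role's readOnly flag.
function isReadOnlyTask(task: RoutableTask): boolean {
  if (task.needsWorktree !== undefined) return !task.needsWorktree;
  return task.roleReadOnly === true;
}

function routeTask(run: RunContext, task: RoutableTask): Route {
  const primary: Route = { directory: run.directory, defaultWorktreeId: run.worktreeId };
  // No team worktree provisioned: every task uses the run's own directory.
  if (!run.teamWorktreeDir) return primary;
  // Readers stay in the primary directory to avoid contention with writers.
  if (isReadOnlyTask(task)) return primary;
  // Writers share the single team worktree.
  return { directory: run.teamWorktreeDir, defaultWorktreeId: run.teamWorktreeName };
}

const teamRun: RunContext = {
  directory: "/repo",
  teamWorktreeDir: "/repo/.worktrees/team-1",
  teamWorktreeName: "team-1",
};

const writerRoute = routeTask(teamRun, { roleReadOnly: false });
const readerRoute = routeTask(teamRun, { needsWorktree: false });
const plainRoute = routeTask({ directory: "/repo", worktreeId: "wt-1" }, {});
```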

The gateway `WORKTREE_CREATE` handler whitelists `team-*` names so the
orchestrator can provision worktrees even when the global worktree
feature flag is off.
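
A check of this kind could look like the sketch below. It assumes the whitelist is a simple name-prefix test; the function name and the boolean flag parameter are hypothetical, and the real handler may apply further validation.

```typescript
const TEAM_WORKTREE_PREFIX = "team-";

// Allow creation when the feature is enabled, or when the name is one the
// orchestrator provisions for itself (assumed here to be a prefix match).
function canCreateWorktree(name: string, worktreeFeatureEnabled: boolean): boolean {
  return worktreeFeatureEnabled || name.startsWith(TEAM_WORKTREE_PREFIX);
}

const teamAllowed = canCreateWorktree("team-42", false);
const userBlocked = canCreateWorktree("my-feature", false);
```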

## Result Aggregation to Parent

When a run reaches a terminal state (`completed` / `failed`) and has a
`parentSessionId`, `OrchestrationService.relayResultsToParentSession()`
sends the aggregated `finalResult` + failed-task list as a user message
to the parent session. The parent engine then summarizes for the user,
keeping everything in one conversation.

Gated by `OrchestrationRun.aggregateToParent` (defaults to `true`; set `false`
to disable). Failures are swallowed with a warn log so a broken parent
session cannot corrupt run state.
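
The gating and failure handling described above can be sketched like this. It is an illustration under assumptions, not the body of `relayResultsToParentSession()`: the `RelayRun` shape, the message wording, the `"running"` status, and the injected `send` / `warn` callbacks are all hypothetical.

```typescript
type RunStatus = "running" | "awaiting-confirmation" | "completed" | "failed";

interface RelayRun {
  status: RunStatus;
  parentSessionId?: string;
  aggregateToParent?: boolean; // treated as true when omitted
  finalResult?: string;
  failedTasks: string[];
}

function shouldRelay(run: RelayRun): boolean {
  const terminal = run.status === "completed" || run.status === "failed";
  return terminal && run.parentSessionId !== undefined && run.aggregateToParent !== false;
}

function buildRelayMessage(run: RelayRun): string {
  const lines = [`Orchestration run ${run.status}.`, run.finalResult ?? "(no result)"];
  if (run.failedTasks.length > 0) {
    lines.push(`Failed tasks: ${run.failedTasks.join(", ")}`);
  }
  return lines.join("\n");
}

// Send errors are logged and swallowed so they never propagate into run state.
async function relayToParent(
  run: RelayRun,
  send: (sessionId: string, text: string) => Promise<void>,
  warn: (message: string) => void,
): Promise<boolean> {
  if (!shouldRelay(run) || run.parentSessionId === undefined) return false;
  try {
    await send(run.parentSessionId, buildRelayMessage(run));
    return true;
  } catch (err) {
    warn(`relay to parent session failed: ${String(err)}`);
    return false;
  }
}

const finishedRun: RelayRun = {
  status: "failed",
  parentSessionId: "session-1",
  finalResult: "2 of 3 tasks finished",
  failedTasks: ["code"],
};
const relayMessage = buildRelayMessage(finishedRun);
```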

## Sidebar Grouping

Chat wraps `connectOrchestrationHandlers()` to mirror every TeamRun update into
PR #117's orchestration sidebar registry (`registerTeam` +
`associateRunWithTeam` + `associateChildSession`), so Light/Heavy brain
child sessions collapse under their parent session in
`SessionSidebar` with the same UX as PR #117's orchestrator-service
flow.

## Service Coexistence

During the merge we kept PR #117's `orchestrator-service.ts` (role-based
orchestrator) alongside fridayliu's `orchestration/index.ts` (Light/Heavy brain).
Both are wired into `ws-server.ts` and both have a UI surface in `Chat.tsx`.
This lets the two flows ship side by side; a future cleanup pass can decide
whether to delete the PR #117 service once the Light/Heavy flow covers all of
its use cases.

## Request Types

| Gateway key | Purpose |
| ------------------------------ | ------------------------------------------ |
| `TEAM_RUN_CREATE` | Create a new team run |
| `TEAM_RUN_LIST` | List all runs |
| `TEAM_RUN_CANCEL` | Cancel an active run |
| `TEAM_RUN_DELETE` | Delete a completed run |
| `TEAM_CONFIRM_PLAN` | Resolve the plan-confirmation gate |
| `TEAM_GET_ROLE_MAPPINGS` | Read current role→engine mappings |
| `TEAM_UPDATE_ROLE_MAPPINGS` | Persist role→engine mapping edits |

See `src/types/unified.ts` for the payload schemas.

## Tests

Primary specs:

- `tests/unit/electron/services/orchestration/index.test.ts` — lifecycle,
persistence, role mappings, plan-confirm gate.
- `tests/unit/electron/services/orchestration/light-brain.test.ts` — planning,
awaiting-confirmation pause/resume, rejection.
- `tests/unit/electron/services/orchestration/heavy-brain.test.ts` — dispatch,
UserChannel relays.
- `tests/unit/electron/services/orchestration/task-executor.test.ts` — role
resolution, retry, error surfacing.
- `tests/unit/electron/services/orchestration/dag-executor.test.ts` — DAG order,
concurrency cap.