feat(step5): DeleteBlock + DeleteTab sagas#633
Conversation
Step 5 (PR 1 of 2) of the architecture-completeness plan. Converts the SQLite-first DeleteBlock + DeleteTab paths to reducer-driven sagas via the persist subscriber. Closes (partial) reducer-architecture-gaps §4. DeleteWorkspace ships in PR 2 of this step (cascade-heavy). - New: agentmux-srv/src/sagas/delete_block.rs (~140 LOC + tests) - New: agentmux-srv/src/sagas/delete_tab.rs (~150 LOC + tests) - Modified: service.rs - DeleteBlock/CloseTab RPCs route through sagas - Modified: sagas/mod.rs - register new modules
|
@codex review |
1 similar comment
|
@codex review |
There was a problem hiding this comment.
ReAgent Diagnostics
| Field | Value |
|---|---|
| ReAgent Version | 5.12.11 |
| Trigger | PR opened |
| Project Context | CLAUDE.md loaded |
| Model | claude-opus-4-6 |
| Effort | high |
| Ref Repos | Disabled |
| Merge Analysis | 4 regression(s) detected |
| Review Time | 154.5s |
| Timestamp | 2026-05-01T08:08:22Z |
| Repository | agentmuxai/agentmux |
| PR | #633 |
Issues:
- [P1] agentmux-srv/src/sagas/delete_tab.rs:75 - TOCTOU race on last-tab guard: pre-check acquires lock, verifies
tab_ids.len() > 1, then drops the lock beforeSagaCtx::dispatchre-acquires it. The reducer'shandle_delete_tab(reducer.rs:411) does not enforce a last-tab guard, so two concurrent CloseTab RPCs for different tabs in a 2-tab workspace can both pass the pre-check and both succeed, leaving zero tabs — the exact state the guard exists to prevent. - [P2] agentmux-srv/src/sagas/delete_block.rs:89 -
delete_controller(&block_id)runs beforealloc_saga_id/emit_saga_started. Ifemit_saga_startedfails (saga_id collision), the PTY/controller is already killed but the block still exists in reducer state and SQLite — an unrecoverable side-effect leak. Move the controller kill to afteremit_saga_startedsucceeds (or intorun_inner).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7063a14dd2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if workspace.tab_ids.len() <= 1 { | ||
| return Err(format!( | ||
| "DeleteTab: cannot delete last tab in workspace {}", | ||
| workspace_id |
There was a problem hiding this comment.
Preserve last-tab CloseTab semantics for keyboard flows
This new guard makes CloseTab fail whenever a workspace has a single tab, but the frontend still calls WorkspaceService.CloseTab unconditionally from keyboard handlers (Cmd+W/Cmd+Shift+W via simpleCloseStaticTab in frontend/app/store/keymodel.ts). Before this commit those calls succeeded; now they return an error, and the caller still runs deleteLayoutModelForTab(tabId), which can leave client state diverged from backend state. In practice, users on a one-tab workspace can no longer close that tab via keyboard and may end up with broken local layout state.
Useful? React with 👍 / 👎.
| crate::backend::blockcontroller::delete_controller(&block_id); | ||
|
|
||
| let saga_id = alloc_saga_id(state); | ||
| if let Err(e) = emit_saga_started( | ||
| state, |
There was a problem hiding this comment.
Start saga before tearing down the block controller
The controller is deleted before emit_saga_started, but emit_saga_started can fail-fast (e.g., durable saga log insert failure) and return early. In that path the RPC returns an error without dispatching DeleteBlock, leaving the block/tab records intact while the PTY/controller has already been torn down. This introduces a new partial-failure mode where delete appears to fail yet still kills the running block process.
Useful? React with 👍 / 👎.
1. (reagent P1) Last-tab guard moved from saga pre-check (TOCTOU race)
into reducer::handle_delete_tab. Two concurrent CloseTab RPCs on
different tabs in a 2-tab workspace could both pass the saga's
pre-check and both succeed under the old code. Now the check
happens inside the reducer's state lock, atomic with the delete.
- Saga's pre-check stripped (just checks tab existence).
- Reducer rejects with Event::Error(InvalidCommand) if last tab.
- DeleteWorkspace cascade unaffected (removes tabs inline).
- PromoteBlockToTab compensation unaffected (workspace always has
≥2 tabs when compensation fires).
- Existing test `delete_last_tab_clears_active_to_none` updated to
assert new rejection behavior.
- Existing test `delete_tab_cascades_blocks` updated to create a
second tab so the test tab isn't the only one.
2. (reagent P2) DeleteBlock controller-kill moved from before
emit_saga_started to AFTER reducer + SQLite writes succeed. The
earlier ordering left an unrecoverable side-effect leak if
start_saga rejected a saga_id collision: PTY dead, block still in
reducer + SQLite. Now: success → both controller and block gone;
any failure path → controller stays alive, block stays. Idempotent.
68 reducer + 28 saga tests pass.
|
@codex review |
There was a problem hiding this comment.
ReAgent Diagnostics
| Field | Value |
|---|---|
| ReAgent Version | 5.12.11 |
| Trigger | Re-review (new commits) |
| Project Context | CLAUDE.md loaded |
| Model | claude-opus-4-6 |
| Effort | high |
| Ref Repos | Disabled |
| Merge Analysis | 5 regression(s) detected |
| Review Time | 144.2s |
| Timestamp | 2026-05-01T08:23:24Z |
| Repository | agentmuxai/agentmux |
| PR | #633 |
Issues:
- [P1] frontend/app/store/keymodel.ts:161 -
simpleCloseStaticTabcallsdeleteLayoutModelForTab(tabId)unconditionally without awaiting CloseTab. With the new last-tab guard, Cmd+Shift+W on a single-tab workspace will fire-and-forget CloseTab (which now fails), then immediately destroy the local layout model — leaving the tab visible in the tab bar but with no layout, corrupting client state. The tabbar close button already guards this (tabbar.tsx:63if (allTabs.length <= 1) return), but the keyboard path does not.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a66268343d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if workspace.tab_ids.len() <= 1 { | ||
| let v = state.bump_version(); | ||
| return vec![Event::Error { |
There was a problem hiding this comment.
Allow rollback to delete a newly created sole tab
This last-tab guard also blocks the CreateTab error-compensation path, which rolls back failed SQLite writes by dispatching Command::DeleteTab from dispatch_service (CreateTab branch via compensate_via_reducer). When creating the first tab in a workspace, a persistence failure now returns an error but cannot remove that just-created tab from reducer state because DeleteTab is rejected as "last tab", leaving state/DB divergence in precisely the failure mode this rollback was added to handle.
Useful? React with 👍 / 👎.
| if result.is_ok() { | ||
| crate::backend::blockcontroller::delete_controller(&block_id); | ||
| } |
There was a problem hiding this comment.
Tear down block controller on failed post-dispatch writes
Conditioning delete_controller on result.is_ok() leaks PTY/controllers when reducer dispatch succeeds but SagaCtx::dispatch fails during apply_event_to_wstore (e.g., SQLite write error): the block is already removed from reducer state, the RPC returns an error, and retries then fail pre-check (block not found), so the controller is never cleaned up. This regression only appears on persistence failures but leaves orphan processes/resources that previously were always torn down.
Useful? React with 👍 / 👎.
Round 2 introduced a strict reducer-level last-tab guard (atomic but TOCTOU-safe). Round 3 walks it back because both bots converged on showing the guard regresses legitimate flows: 1. (codex P1 round 1) Cmd+Shift+W on 1-tab workspace silently fails; `simpleCloseStaticTab` still calls `deleteLayoutModelForTab` → client/server diverge. 2. (codex P1 round 2) `CreateTab` rollback compensation can't remove a just-created sole tab when persistence fails. 3. (reagent P1 round 2) Frontend keyboard handler doesn't gate at the call site like the close button does. Resolution: - **Reducer accepts last-tab deletes** (revert round 2's guard). - **Saga keeps best-effort pre-check** (UX guard; TOCTOU is rare and user-driven concurrent CloseTabs can't reach it because frontend call sites gate). - **`simpleCloseStaticTab` gates** with `if (allTabs.length <= 1) return`, mirroring `tabbar.tsx:63`. Plus (codex P2 round 2): controller-kill in `delete_block` was conditioned on saga `Ok` — leaked PTY when reducer succeeded but `apply_event_to_wstore` failed (block already removed from reducer state, RPC returned error, retries pre-check rejected as "not found"). Round 3: kill controller whenever the block is gone from reducer state (covers both success and reducer-succeeded-wstore-failed cases). Idempotent on missing controller. Tests: 68 reducer + 28 saga tests pass. Restored the original `delete_last_tab_clears_active_to_none` and `delete_tab_cascades_blocks` test bodies (round 2 had rewritten them for the strict guard).
|
@codex review |
There was a problem hiding this comment.
ReAgent Diagnostics
| Field | Value |
|---|---|
| ReAgent Version | 5.12.11 |
| Trigger | Re-review (new commits) |
| Project Context | CLAUDE.md loaded |
| Model | claude-opus-4-6 |
| Effort | high |
| Ref Repos | Disabled |
| Merge Analysis | 4 regression(s) detected |
| Review Time | 157.8s |
| Timestamp | 2026-05-01T08:39:30Z |
| Repository | agentmuxai/agentmux |
| PR | #633 |
LGTM
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 668edac15c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if let Err(reason) = ctx | ||
| .dispatch(Command::DeleteTab { | ||
| workspace_id: workspace_id.clone(), | ||
| tab_id: tab_id.clone(), | ||
| }) | ||
| .await | ||
| { | ||
| // No automatic compensation. See module doc-comment for | ||
| // rationale. | ||
| tracing::warn!( | ||
| workspace_id = %workspace_id, | ||
| tab_id = %tab_id, | ||
| "[saga] DeleteTab dispatch failed (no automatic compensation): {}", | ||
| reason | ||
| ); | ||
| return Err(format!("DeleteTab: {}", reason)); |
There was a problem hiding this comment.
Clean up tab block controllers on dispatch write failure
If ctx.dispatch(Command::DeleteTab) mutates the reducer but then fails during the SQLite apply path, this function returns an error without tearing down any block controllers for the deleted tab. Unlike delete_block::run, there is no post-dispatch cleanup path here, so a transient persistence failure can leave PTY/controller processes running while the tab is already gone from reducer state, and retries will be rejected by prechecks because the tab no longer exists.
Useful? React with 👍 / 👎.
…e failure Mirror of the delete_block fix from round 3. When DeleteTab dispatch mutates the reducer but apply_event_to_wstore fails, the tab is gone from reducer state but block controllers for blocks in that tab stay running. Retries reject as "tab not found" so controllers leak. Fix: capture block_ids before dispatch; after run_saga returns, check if the tab is gone from reducer state. If so, kill controllers for every block that was in it. Idempotent on missing. 28/28 saga tests pass.
|
@codex review |
There was a problem hiding this comment.
ReAgent Diagnostics
| Field | Value |
|---|---|
| ReAgent Version | 5.12.11 |
| Trigger | Re-review (new commits) |
| Project Context | CLAUDE.md loaded |
| Model | claude-opus-4-6 |
| Effort | high |
| Ref Repos | Disabled |
| Merge Analysis | 4 regression(s) detected |
| Review Time | 126.6s |
| Timestamp | 2026-05-01T08:48:07Z |
| Repository | agentmuxai/agentmux |
| PR | #633 |
LGTM
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 54f07a1c7e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if workspace.tab_ids.len() <= 1 { | ||
| return Err(format!( | ||
| "DeleteTab: cannot delete last tab in workspace {}", | ||
| workspace_id | ||
| )); |
There was a problem hiding this comment.
Enforce last-tab guard atomically with deletion
DeleteTab checks workspace.tab_ids.len() <= 1 before dispatch, but the guard and ctx.dispatch(Command::DeleteTab { ... }) happen in separate critical sections. Two concurrent CloseTab requests on a 2-tab workspace can both pass this pre-check and then both delete, leaving the workspace with zero tabs (the state this change is trying to prevent). This is reachable from UI code paths that issue async closes without request serialization, so the guard is not reliable unless it is enforced atomically at reducer/command level (or with equivalent serialization).
Useful? React with 👍 / 👎.
… compensation Round 4 of bot oscillation between codex (TOCTOU vs keyboard semantics vs compensation) settled. Final answer: atomic reducer guard with a force: bool bypass for compensation paths. - agentmux-common: Command::DeleteTab gains `force: bool` (serde default = false for backward compat). - reducer::handle_delete_tab: rejects last-tab delete unless force=true. - service.rs CreateTab compensate path passes force=true. - promote_block_to_tab.rs ctx.compensate passes force=true. - delete_tab saga + auto-close-empty-source paths pass force=false (defense-in-depth backstop; their callers already gate). Tests updated: tests that intentionally delete sole tab pass force=true; others pass force=false. proptest passes force=true to exercise cascade. 68 reducer + 28 saga tests pass.
|
@codex review |
There was a problem hiding this comment.
ReAgent Diagnostics
| Field | Value |
|---|---|
| ReAgent Version | 5.12.11 |
| Trigger | Re-review (new commits) |
| Project Context | CLAUDE.md loaded |
| Model | claude-opus-4-6 |
| Effort | high |
| Ref Repos | Disabled |
| Merge Analysis | 4 regression(s) detected |
| Review Time | 49.9s |
| Timestamp | 2026-05-01T11:37:33Z |
| Repository | agentmuxai/agentmux |
| PR | #633 |
LGTM
|
Codex Review: Didn't find any major issues. Breezy! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Summary
Step 5 (PR 1 of 2) of the architecture-completeness plan. Converts the
SQLite-first DeleteBlock + DeleteTab paths into proper sagas through
the reducer + persist subscriber. DeleteWorkspace ships in PR 2 of
this step (cascade-heavy, deserves its own review focus).
Closes (partial): reducer-architecture-gaps §4 SQLite-first deletes.
Plan: `docs/retro/next-steps-architecture-completeness-2026-05-01.md` step 5.
What changed
Saga shape
Both sagas use the existing `SagaCtx::dispatch` (which now writes to the durable saga log under the hood, transparent — landed via PR #631). Steps:
Last-tab guard (DeleteTab)
The legacy `("workspace", "CloseTab")` handler did not enforce a last-tab guard; the saga adds one, mirroring `TearOffTab`. Rationale: leaving a workspace with zero tabs is awkward UI state, and callers who want full teardown should issue `DeleteWorkspace` (Step 5 PR 2). Auto-close paths inside MoveBlockToTab / PromoteBlockToTab already gate on `total_tabs > 1` and stay on direct `Command::DeleteTab` dispatch (out of scope per brief), so no internal regression.
Compensation
Delete sagas are awkward to compensate (un-delete is rarely clean). Approach:
Test plan
🤖 Generated with Claude Code