fix(bounty): multi-store atomicity, saga settlement, reconciler framework #254
Merged
windoliver merged 14 commits into main on Apr 15, 2026
…work (#240)

Addresses the three codex-flagged correctness bugs in bounty operations and adds infrastructure to prevent recurrence:

- Add pending_settlement saga pivot state so settleBountyOperation never enters a terminal state before capture() confirms
- Add pre-flight status checks in claim/settle to prevent wasted side effects
- Add input validation (amount > 0, non-empty title) at operation boundary
- Add LRU doc cache + ETag-forwarding in NexusBountyStore to reduce VFS round-trips in multi-transition flows
- Add SweepReconciler framework with pluggable strategies for periodic consistency repair
- Add BountyIndexSweep (dual-write index repair), SettlementSweep (resume stalled pending_settlement), HandoffSweep (detect orphans)
- Add FailingBountyStore test wrapper for partial-failure injection
- Add lazy eviction of expired reservations in InMemoryCreditsService

115 tests pass across 4 test files including all 3 acceptance criteria from Issue #240.
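The pivot-state idea from the first bullet can be sketched roughly as below. This is an in-memory illustration under stated assumptions, not the code from this PR: the Bounty shape, the Map-backed store, the synchronous capture callback, the five-status union, and the name settleBountySketch are all hypothetical. The CID check mirrors the freeze rule described in the next commit on this branch.

```typescript
// Hypothetical, simplified types; the real definitions live in bounty.ts.
type BountyStatus = "open" | "claimed" | "pending_settlement" | "completed" | "settled";

interface Bounty {
  id: string;
  status: BountyStatus;
  fulfillmentCid?: string;
}

// Stand-in for the credits capture call; it may throw.
type Capture = (bountyId: string) => void;

function settleBountySketch(
  store: Map<string, Bounty>,
  id: string,
  cid: string,
  capture: Capture,
): Bounty {
  const found = store.get(id);
  if (!found) throw new Error(`unknown bounty ${id}`);
  let current: Bounty = found;

  if (current.status === "claimed") {
    // Step 1: persist the pivot state before any external side effect.
    current = { ...current, status: "pending_settlement", fulfillmentCid: cid };
    store.set(id, current);
  } else if (current.status === "pending_settlement" || current.status === "completed") {
    // Resuming a stalled settlement: the fulfillment CID is frozen.
    if (current.fulfillmentCid !== cid) {
      throw new Error("fulfillment CID is frozen after the saga pivot");
    }
  } else {
    throw new Error(`cannot settle a bounty in status ${current.status}`);
  }

  if (current.status === "pending_settlement") {
    // Step 2: capture. If this throws, the persisted status stays
    // pending_settlement, so the operation can be retried later.
    capture(id);
    current = { ...current, status: "completed" };
    store.set(id, current);
  }

  // Step 3: only now enter the terminal state.
  current = { ...current, status: "settled" };
  store.set(id, current);
  return current;
}
```

A failed capture leaves the stored bounty in pending_settlement; calling the function again with the same CID resumes from there, while a different CID is rejected.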
1. Freeze fulfillment CID after saga pivot: when resuming a pending_settlement bounty, reject attempts to change the contribution CID. Prevents non-deterministic settlements.
2. Remove stale cache from transitionBounty: mutations always read fresh from VFS to get a valid ETag. Cache is still used for read-only getBounty() pre-flight checks.
3. Wire SweepReconciler into server startup: BountyIndexSweep and SettlementSweep now run on a 60s timer with graceful shutdown. Closes the "recovery not wired" gap.
1. Remove process-local bounty cache entirely: mutable objects must not be cached without cross-process invalidation. getBounty() now always reads fresh from VFS. Add validateBountyTransition() call in transitionBounty() to reject stale state before CAS write.
2. Extend settlement recovery to handle "completed" status: if capture succeeded and completeBounty committed but settleBounty failed, the operation and SettlementSweep can now resume from "completed" state. Prevents stranded post-capture bounties.
… 1 MEDIUM
1. [critical] SettlementSweep hard-fails when bounty has reservationId but no creditsService; this prevents settling escrowed bounties without actually capturing funds.
2. [high] Remove claimed→completed from state machine: force all settlement through pending_settlement pivot. Update conformance tests and bounty-logic tests to use beginSettlement first.
3. [medium] BountyIndexSweep now calls repairIndex unconditionally for every bounty, cleaning both missing current-status entries AND stale old-status markers.
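The effect of dropping the claimed→completed edge can be illustrated with a small transition table. This is a sketch under assumptions: the status list and the edges shown are a reduced, hypothetical subset, and canTransition/assertTransition are illustrative names, not the project's validateBountyTransition.

```typescript
type BountyStatus = "open" | "claimed" | "pending_settlement" | "completed" | "settled";

// Assumed, reduced transition table; the real state machine may have
// more states and edges (a later commit adds a claimed -> claimed edge).
const ALLOWED_TRANSITIONS: Record<BountyStatus, readonly BountyStatus[]> = {
  open: ["claimed"],
  claimed: ["pending_settlement"], // claimed -> completed is intentionally absent
  pending_settlement: ["completed"],
  completed: ["settled"],
  settled: [],
};

function canTransition(from: BountyStatus, to: BountyStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Throws instead of returning false, so callers cannot ignore a bad edge.
function assertTransition(from: BountyStatus, to: BountyStatus): void {
  if (!canTransition(from, to)) {
    throw new Error(`invalid bounty transition: ${from} -> ${to}`);
  }
}
```

With this table, the only path from claimed to settled passes through pending_settlement, which is the property the finding asks for.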
1. Remove SettlementSweep from server startup: local runtime has no CreditsService, so the sweep would hard-fail on escrowed bounties. Only BountyIndexSweep is registered. Settlement sweep will be enabled when a production CreditsService is wired in.
2. Release orphaned claims on bounty transition failure: if claimBounty() fails, re-read the bounty and release the claim only if the bounty is still open (confirming the transition didn't commit). Post-commit failures keep the claim for consistency.
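The compensation rule in item 2 might look roughly like the following. The two store interfaces, their method names other than claimBounty, and the synchronous signatures are assumptions made to keep the sketch short; the real stores are presumably async.

```typescript
type BountyStatus = "open" | "claimed";

// Hypothetical minimal store interfaces, synchronous for brevity.
interface ClaimStoreSketch {
  createClaim(bountyId: string, agentId: string): string; // returns a claim ID
  releaseClaim(claimId: string): void;
}

interface BountyStoreSketch {
  getStatus(bountyId: string): BountyStatus;
  // May throw either before or after the transition has committed.
  claimBounty(bountyId: string, claimId: string): void;
}

function claimWithCompensation(
  claims: ClaimStoreSketch,
  bounties: BountyStoreSketch,
  bountyId: string,
  agentId: string,
): string {
  const claimId = claims.createClaim(bountyId, agentId);
  try {
    bounties.claimBounty(bountyId, claimId);
  } catch (err) {
    // Re-read the bounty: release the claim only if the transition
    // did not commit (the bounty is still open).
    if (bounties.getStatus(bountyId) === "open") {
      claims.releaseClaim(claimId);
    }
    // After a post-commit failure the claim is kept so that the bounty
    // and the claim stay consistent with each other.
    throw err;
  }
  return claimId;
}
```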
1. Add pending_settlement to Zod schemas in core/schemas.ts and mcp/tools/bounties.ts, which prevents parsers from rejecting bounties in the new pivot state.
2. Re-enable SettlementSweep in server: it safely recovers non-escrowed bounties (no reservationId). Escrowed bounties log an error and wait for CreditsService. Update doc comment in bounty.ts lifecycle.
SettlementSweep: completed bounties have already captured; skip the creditsService requirement and just advance to settled. Only pending_settlement bounties need the capture step.
Remaining findings (out of scope for #240):
- Claim renewal/heartbeat path: pre-existing design gap, not introduced by this branch. Tracked separately.
- Nexus MCP sweep wiring: requires architectural changes to MCP server startup. Tracked as follow-up integration work.
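Taken together, the SettlementSweep rules from this commit and the two before it amount to a small decision table. The sketch below encodes it as a pure function; the action names, the reduced bounty shape, and planSettlementSweep itself are hypothetical and only summarize the behaviour described in the commit messages.

```typescript
type BountyStatus = "open" | "claimed" | "pending_settlement" | "completed" | "settled";

interface SweepCandidate {
  status: BountyStatus;
  reservationId?: string; // present only for escrowed bounties
}

// Illustrative action names, not identifiers from the PR.
type SweepAction =
  | "advance_to_settled"      // completed: capture already happened
  | "settle_without_capture"  // pending_settlement, not escrowed
  | "capture_then_settle"     // pending_settlement, escrowed, credits available
  | "log_error_and_wait"      // pending_settlement, escrowed, no credits service
  | "ignore";                 // nothing to recover

function planSettlementSweep(bounty: SweepCandidate, hasCreditsService: boolean): SweepAction {
  if (bounty.status === "completed") return "advance_to_settled";
  if (bounty.status !== "pending_settlement") return "ignore";
  if (bounty.reservationId === undefined) return "settle_without_capture";
  return hasCreditsService ? "capture_then_settle" : "log_error_and_wait";
}
```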
1. Same-agent claim renewal: claimBountyOperation now allows the
current claim holder to extend their lease without reopening the
bounty. Different agents are still rejected. Prevents long-running
bounties from getting stranded when the claim lease expires.
2. Wire SweepReconciler into both MCP entry points:
- serve.ts (stdio): starts BountyIndexSweep + SettlementSweep
after store setup, stops on shutdown
- serve-http.ts (HTTP): starts at process level using zone-scoped
Nexus bounty store (not session-scoped), stops on shutdown
The reconciler now runs in all three runtimes that can create
bounties: HTTP server, stdio MCP, and HTTP MCP.
1. [high] Claim renewal with expired lease: detect if existing claim is expired and create a fresh claim ID instead of reusing the stale one. Rebinds the bounty to the new claim atomically.
2. [high] Remove SettlementSweep from MCP runtimes: no CreditsService available, escrowed bounties would fail every cycle. Only BountyIndexSweep registered. Settlement recovery deferred to #253.
3. [medium] BountyIndexSweep now detection-based: queries status-filtered lists to find actual drift, only calls repairIndex when missing. No more unconditional rewrite of every healthy bounty each cycle.
1. [high] Claim rebind after lease expiry: allow claimed→claimed self-transition so expired claim IDs get rotated to fresh ones. The bounty record is atomically rebound to the new claim.
2. [high] Re-enable SettlementSweep in MCP runtimes: completed bounties (already captured) can settle without CreditsService. Only pending_settlement+reservationId cases log errors.
3. [medium] repairIndex version-aware: re-reads with ETag before deleting stale markers. Skips cleanup if a concurrent transition changed the bounty between read and delete.
1. [high] Claim renewal checks lease validity (not just status): only reuse claimId if both status=active AND leaseExpiresAt > now.
2. [high] Compensation on rotated-claim rebind failure: release the orphaned new claim if bountyStore.claimBounty throws.
3. [medium] Remove sweep reconciler from stdio MCP: per-agent processes must not run zone-wide sweeps (N×load, CAS conflicts). Sweeps run only in HTTP server + HTTP MCP (singleton processes).
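The renewal rule from item 1, combined with the earlier same-agent and rotation rules, can be sketched as a pure decision function. Assumptions: the claim shape, the agentId field, the non-active status values, and leaseExpiresAt as epoch milliseconds are illustrative; only claimId, status=active, and leaseExpiresAt come from the commit text.

```typescript
interface ClaimSketch {
  claimId: string;
  agentId: string;
  status: "active" | "released" | "expired"; // non-active values are assumed
  leaseExpiresAt: number; // assumed to be epoch milliseconds
}

type RenewalDecision =
  | { kind: "reject"; reason: string }
  | { kind: "extend"; claimId: string } // reuse the existing claim ID
  | { kind: "rotate" };                 // create a fresh claim ID and rebind

function decideRenewal(existing: ClaimSketch, agentId: string, now: number): RenewalDecision {
  // A different agent may never take over a claimed bounty this way.
  if (existing.agentId !== agentId) {
    return { kind: "reject", reason: "bounty is claimed by a different agent" };
  }
  // Reuse the claim ID only if the claim is active AND the lease is unexpired.
  const leaseValid = existing.status === "active" && existing.leaseExpiresAt > now;
  return leaseValid ? { kind: "extend", claimId: existing.claimId } : { kind: "rotate" };
}
```

Checking the lease timestamp as well as the status matters because a claim can still be marked active after its lease has lapsed.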
1. [high] Rebind compensation re-reads bounty: only releases the new claim if the bounty didn't commit the rebind (post-commit safety).
2. [medium] Serialized sweep cycles: in-flight guard prevents overlapping async cycles from contending with each other.
3. [medium] BountyIndexSweep stale-marker detection is a known limitation: listBounties(status) filters stale entries before the sweep sees them. Full fix requires a raw index listing API (store-layer change). repairIndex handles cleanup when triggered by other paths.
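The in-flight guard from item 2 is a common pattern and could look like the sketch below. The class name, the counters, and the busy getter are hypothetical and exist only to make the behaviour observable; the actual SweepReconciler may be structured differently.

```typescript
// Minimal sketch of a serialized sweep runner.
class SerializedSweeper {
  private inFlight = false;
  private readonly sweep: () => Promise<void>;
  skipped = 0;   // cycles dropped because one was already running
  completed = 0; // cycles that ran to completion

  constructor(sweep: () => Promise<void>) {
    this.sweep = sweep;
  }

  get busy(): boolean {
    return this.inFlight;
  }

  // Intended to be called from a timer tick. If the previous cycle has
  // not finished yet, this tick is skipped instead of overlapping it.
  async runCycle(): Promise<void> {
    if (this.inFlight) {
      this.skipped++;
      return;
    }
    this.inFlight = true;
    try {
      await this.sweep();
      this.completed++;
    } finally {
      // Always clear the flag, even if the sweep throws.
      this.inFlight = false;
    }
  }
}
```

Skipping a tick rather than queueing it keeps a slow sweep from building up a backlog; the next timer tick simply starts a fresh cycle.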
All three "persistent" findings from the review loop are now fixed using existing Nexus VFS operations; no Nexus changes needed.
1. repairIndex race: check exists() before each stale marker delete, re-read the authoritative document right before deleting to confirm the bounty hasn't transitioned TO that status concurrently.
2. BountyIndexSweep stale-marker detection: new listIndexStatuses() method on NexusBountyStore checks which status index entries actually exist using client.exists(). Sweep now detects both missing current entries AND stale old-status entries in a single pass.
3. Added listIndexStatuses to BountyStore interface (optional) and FailingBountyStore wrapper.
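The guarded delete from item 1 might be sketched as follows. Everything concrete here is an assumption: the index path layout, the synchronous client interface (the real client.exists() is presumably async), the status list, and the function names.

```typescript
type BountyStatus = "open" | "claimed" | "pending_settlement" | "completed" | "settled";

const ALL_STATUSES: readonly BountyStatus[] = [
  "open", "claimed", "pending_settlement", "completed", "settled",
];

// Hypothetical synchronous stand-in for the VFS client.
interface IndexClientSketch {
  exists(path: string): boolean;
  delete(path: string): void;
}

// Assumed index layout, for illustration only.
function indexPath(status: BountyStatus, bountyId: string): string {
  return `index/${status}/${bountyId}`;
}

// Removes status markers that no longer match the authoritative document.
// readCurrentStatus re-reads the bounty document each time it is called.
function removeStaleMarkers(
  client: IndexClientSketch,
  readCurrentStatus: () => BountyStatus,
  bountyId: string,
): BountyStatus[] {
  const removed: BountyStatus[] = [];
  for (const status of ALL_STATUSES) {
    const path = indexPath(status, bountyId);
    if (!client.exists(path)) continue; // nothing to clean for this status
    // Re-read right before deleting: if the bounty has transitioned TO
    // this status in the meantime, the marker is current, not stale.
    if (readCurrentStatus() === status) continue;
    client.delete(path);
    removed.push(status);
  }
  return removed;
}
```

The sketch narrows the race window but does not close it; a transition can still land between the re-read and the delete.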
Summary
Closes #240. Addresses the three codex-flagged correctness bugs in bounty operations and adds infrastructure to prevent recurrence.
- pending_settlement pivot state ensures settleBountyOperation never enters a terminal state before capture() confirms. Resumable from any intermediate state (pending_settlement, completed) by both the operation and the background reconciler.
- BountyIndexSweep (dual-write index repair), SettlementSweep (resume stalled settlements), and HandoffSweep (detect orphaned contributions).
- claimed → completed bypass removed: all settlement must go through pending_settlement. validateBountyTransition enforced in NexusBountyStore before every CAS write.

What changed
- bounty.ts, bounty-logic.ts, bounty-store.ts, both store impls: pending_settlement state, beginSettlement(), 3-step settle flow
- sweep-reconciler.ts, bounty-index-sweep.ts, settlement-sweep.ts, handoff-sweep.ts
- server/serve.ts, mcp/serve.ts, mcp/serve-http.ts
- operations/bounty.ts: claimBountyOperation
- schemas.ts, mcp/tools/bounties.ts: pending_settlement added to all Zod enums
- bounty.test.ts, sweep-reconciler.test.ts, failing-bounty-store.ts

Adversarial review
6 rounds of Codex adversarial review. 12 findings fixed (1 critical, 8 high, 1 medium). Key fixes:
- completed bounties recoverable (capture already happened, just advance to settled)
- CreditsService
- pending_settlement as mandatory pivot

Test plan
- … bun test)
- … bun run check, 2 pre-existing warnings only)
- capture() throws after state transitions → bounty retryable from pending_settlement
- createBounty() post-commit throw → reservation NOT voided
- claimBounty() post-commit throw → claim NOT released (or released only on pre-commit)
- … pending_settlement bounties to settled
- … completed bounties (post-capture, just needs settleBounty)

Follow-up
- CreditsService (NexusPay): Production CreditsService (NexusPay integration) #253