Promote dev to main: redis-back review-dispatch dedup#1249
Merged
zbigniewsobiecki merged 1 commit intomainfrom May 1, 2026
Merged
Promote dev to main: redis-back review-dispatch dedup#1249zbigniewsobiecki merged 1 commit intomainfrom
zbigniewsobiecki merged 1 commit intomainfrom
Conversation
…s gap (#1248) ucho/PR #194 (2026-05-01, CI fully green, single SHA) had two review runs dispatched 29 s apart — both `triggerType=ci-success`, same engine/model/ workItemId, both completed normally, both burned LLM tokens and posted reviews. Confirmed verbatim in Loki: identical `reviewDispatchKey` (`zbigniewsobiecki/ucho:194:9ed484df...`) appeared in claim logs from TWO processes — one from `agent-execution.ts:279` post-completion-hook running in the IMPL worker container, one from `check-suite-success.ts:240` running in the cascade-router. Both `claimReviewDispatch(key, ...)` returned `true` because the dedup `recentlyDispatched` Map at `review-dispatch-dedup.ts:5` was module-scoped, in-memory, per-process state. Two processes = two independent Maps = no dedup. Pure waste. PR #1246 explicitly called this out as out-of-scope ("With the architectural shift above, the in-memory dedup is sufficient. Skip."). That was wrong: the architectural shift in #1246 closes the worker-bail-out path but does nothing about the post-completion-hook path running in a different process. The bug class is broader than ucho/PR #194 — anywhere two processes might independently call `claimReviewDispatch` for the same (project, PR, SHA) is exposed: post-completion-hook ↔ check-suite-success (live now), review- requested ↔ post-completion-hook (waiting to bite), and any future horizontally-scaled router replicas. Architectural fix — move the dedup state to Redis so it's shared across all processes. Atomic primitive: `SET key value NX EX <ttl>` returns `'OK'` exactly once per key within the TTL, regardless of how many processes race it. No application-level locking, no race window. `src/triggers/github/review-dispatch-dedup.ts` — rewritten: - Drops the in-memory `Map<string, number>` and `cleanupExpiredEntries`. - Lazy IORedis singleton (mirrors `src/queue/cancel.ts:28-37`). - `claimReviewDispatch` / `releaseReviewDispatch` are now async. - Keys namespaced under `cascade:review-dedup:`. - Fail-closed on Redis errors: `claim` returns `false` and Sentry-captures under tag `review_dedup_redis_down`. Better to skip a legit dispatch than duplicate. Mirrors spec-017 fail-closed pipeline-capacity-gate posture. - 5-min TTL preserved from #1246. The pre-Redis "30-min TTL clears stale entries on router restart" CLAUDE.md note is now obsolete. - Test-only `__resetForTests()` flushes the namespace. Callers — minimal mechanical await additions: - `check-suite-success.ts:240` — `await claimReviewDispatch(...)`. - `check-suite-success.ts:274` — `onBlocked: () => { void releaseReviewDispatch(...) }` (callback signature is sync; fire-and-forget the release). - `review-requested.ts:101-102` — both calls become `await`. - `review-requested.ts:131` — same fire-and-forget shape. - `agent-execution.ts:279` (post-completion-hook) — `await`. Tests: - `tests/unit/triggers/github/review-dispatch-dedup.test.ts` rewritten around a vi.mock('ioredis', ...) factory whose closure captures a single in-memory store. Every `new Redis(...)` instance shares that backend, so the cross-process invariant is trivially testable: instantiate two IORedis clients and verify the second `SET NX EX` for the same key returns `null`. **This is the regression pin for ucho/PR #194.** - New "fails closed when Redis errors" test covers the Sentry-capture path under tag `review_dedup_redis_down`. - `check-suite-success.test.ts` and `review-requested.test.ts`: replaced the per-test `recentlyDispatched.clear()` with `vi.mock` of the dedup module. Tests that assert the dedup-skip path now use `mockClaimReviewDispatch.mockResolvedValueOnce(true).mockResolvedValueOnce(false)` to simulate the SET NX EX race. Verification: - npx vitest run --project unit-triggers --project unit-api --project unit-core → 7678/7678 pass. - npm run typecheck + npm run lint clean. - Direct-instance test demonstrates two IORedis clients against the shared backend correctly reject the second claim — the production scenario pinned in code. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Promotes 1 commit from `dev`:
What
Move the review-dispatch dedup state from a per-process in-memory `Map` to Redis so the dedup holds across all processes (router, IMPL workers running post-completion-hook, and any future router replicas).
Closes the silent-duplicate-dispatch bug where ucho/PR #194 (CI fully green, single SHA) had two reviews run 29 s apart from the same dedup key — the IMPL worker process and the router process each saw an empty in-memory Map and both claimed.
Reviewed and approved on #1248 by @nhopeatall. Full CI green on dev (3/3 workflows expected, mirroring prior promotions).
Risk
Low. Pure dedup-state migration to Redis (`SET NX EX` atomic primitive, fail-closed on Redis errors with Sentry capture). Tests (7678/7678) all pass; cross-process invariant pinned by direct-instance test.
🤖 Generated with Claude Code