fix(cache): Eliminate deadlock in heatmap endpoint - nested lock acquisition #111
💡 Codex Review
Here are some automated review suggestions for this pull request.
…sed variables and update eslint config
…prove global definitions
✅ Fixed: cache-operation/cache-filtered Circular Locking Issue
The P1 deadlock issue identified by @chatgpt-codex-connector has been resolved! 🎉
Problem Summary
The previous fix using single
Solution
Implemented consistent alphabetical lock ordering across all cache methods using
Lock Order: All methods now acquire locks in the same order, preventing circular dependencies:
Key Benefits
Verification Results
Build & Tests: All passed! ✅
pnpm lint # SUCCESS (0 errors)
pnpm build # SUCCESS
pnpm test:backend # SUCCESS (828/828 tests passed)
Test Improvements: Fixed 1 additional flaky test in
The deadlock is now completely resolved with proper lock ordering! 🚀
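As a rough illustration of the alphabetical lock ordering this comment describes, the sketch below deduplicates and sorts lock keys before acquisition. The helper name `orderLockKeys` and the sample key values are assumptions made for illustration; the project's actual ordering code is not shown on this page.

```typescript
// Hypothetical helper (not the project's implementation): remove duplicate
// keys and sort them so every caller acquires locks in the same order.
function orderLockKeys(keys: readonly string[]): string[] {
  return Array.from(new Set(keys)).sort();
}

// Two call sites that request the same locks in different orders.
// The key names mirror the ones mentioned in this PR; the values are samples.
const filteredPathKeys = orderLockKeys(["cache-operation", "cache-filtered:repo"]);
const operationPathKeys = orderLockKeys(["cache-filtered:repo", "cache-operation"]);

// Both paths now agree on the acquisition order, so neither can hold the
// second lock while waiting for the first.
console.log(filteredPathKeys, operationPathKeys);
```

Because both callers end up with the same sequence, a circular wait between `cache-operation` and `cache-filtered` cannot form as long as every acquisition goes through the same ordering step.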
…ent to prevent deadlocks
The merge-base changed after approval.
Pull Request Overview
This PR fixes a critical deadlock in the repository cache system by eliminating nested lock acquisition patterns that caused 120-second timeouts in the /api/commits/heatmap endpoint. The fix introduces helper methods to ensure consistent lock ordering and replaces nested lock calls with direct calls to unlocked internal methods.
Key Changes
- Deadlock prevention: Added getCommitLocks(), getContributorLocks(), and getAggregatedLocks() helper methods to ensure consistent lock ordering across all cache operations
- Lock acquisition refactoring: Changed three methods (getOrParseCommits, getOrParseFilteredCommits, getOrGenerateAggregatedData) to use withOrderedLocks with all required locks upfront, then call unlocked internal methods to prevent nested lock re-acquisition
- Test coverage: Added deadlock prevention tests and helper method tests to verify correct lock array generation
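The pattern summarized in the list above (take every required lock upfront in a fixed order, then call an unlocked internal method) might look roughly like the following sketch. Only the names getCommitLocks, getAggregatedLocks, withOrderedLocks, getOrParseCommits, and getOrGenerateAggregatedData come from the PR; the `KeyedLocks` class, `parseCommitsUnlocked`, the key lists each helper returns, and the return types are all assumptions made for illustration.

```typescript
// Minimal in-memory keyed mutex (assumed; the project's lock manager is not shown).
class KeyedLocks {
  private tails = new Map<string, Promise<void>>();

  // Wait for earlier holders of `key`, then return a function that releases it.
  private async acquire(key: string): Promise<() => void> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const current = new Promise<void>((resolve) => { release = resolve; });
    this.tails.set(key, previous.then(() => current));
    await previous;
    return release;
  }

  // Acquire all keys in sorted order, run the task, then release in reverse.
  async withOrderedLocks<T>(keys: readonly string[], task: () => Promise<T>): Promise<T> {
    const ordered = Array.from(new Set(keys)).sort();
    const releases: Array<() => void> = [];
    try {
      for (const key of ordered) releases.push(await this.acquire(key));
      return await task();
    } finally {
      for (const release of releases.reverse()) release();
    }
  }
}

const cacheLocks = new KeyedLocks();

// Assumed key lists; the real helpers may return different keys.
const getCommitLocks = (_repoUrl: string): string[] => ["cache-operation", "repo-access"];
const getAggregatedLocks = (repoUrl: string): string[] =>
  [`cache-aggregated:${repoUrl}`, ...getCommitLocks(repoUrl)];

// Unlocked internal method: assumes the caller already holds the commit locks.
async function parseCommitsUnlocked(repoUrl: string): Promise<string[]> {
  return [`commit-from:${repoUrl}`];
}

async function getOrParseCommits(repoUrl: string): Promise<string[]> {
  return cacheLocks.withOrderedLocks(getCommitLocks(repoUrl), () => parseCommitsUnlocked(repoUrl));
}

async function getOrGenerateAggregatedData(repoUrl: string): Promise<{ total: number }> {
  return cacheLocks.withOrderedLocks(getAggregatedLocks(repoUrl), async () => {
    // Calls the unlocked variant directly, so no lock is requested twice.
    const commits = await parseCommitsUnlocked(repoUrl);
    return { total: commits.length };
  });
}
```

Calling `getOrParseCommits` from inside the aggregated task would request `cache-operation` again while it is already held, which is the nested re-acquisition the PR removes.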
Reviewed Changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| apps/backend/src/services/repositoryCache.ts | Core deadlock fix: added lock helper methods and refactored lock acquisition to prevent nested locks |
| apps/backend/__tests__/unit/services/repositoryCache.unit.test.ts | Added deadlock prevention tests and lock helper validation tests |
| apps/backend/src/middlewares/strictContentType.ts | New CSRF-like protection middleware (unrelated to deadlock fix) |
| apps/backend/__tests__/unit/middlewares/strictContentType.unit.test.ts | Tests for new strictContentType middleware |
| apps/backend/src/index.ts | Applied strictContentType middleware to API routes |
| eslint.config.mjs | Complete ESLint configuration restructure (unrelated to deadlock fix) |
| packages/shared-types/src/index.ts | Added readonly modifiers to error class properties |
| apps/frontend/src/services/api.ts | Added X-Requested-With header to API client |
| Multiple test and utility files | Parameter naming changes (underscore prefix for unused params) |
It wasn't even an Agents.md. Updated it to actually help with Codex etc.
Copilot Review Comments Addressed
I've addressed the Copilot review comments with the following changes:
1. ✅ Fixed
…ssary async import
The merge-base changed after approval.
🐛 Fix: Critical Deadlock in Repository Cache System
Fixes #110
Problem
The /api/commits/heatmap endpoint was experiencing a 120-second lock timeout causing complete endpoint failure. The root cause was nested acquisition of the same locks in three separate locations within repositoryCache.ts, leading to deadlocks.
Root Cause Analysis
Three deadlock scenarios were identified:
1. getOrGenerateAggregatedData (line 1591)
   - withKeyLock held: cache-aggregated:${repoUrl}
   - withOrderedLocks tried to re-acquire: cache-aggregated:${repoUrl} ❌
2. getOrParseCommits (line 975)
   - withOrderedLocks held: cache-operation, repo-access
   - withOrderedLocks tried to re-acquire both ❌
3. getOrParseFilteredCommits (line 1238)
   - withKeyLock held: cache-filtered:${repoUrl}
   - withOrderedLocks tried to re-acquire: cache-filtered:${repoUrl} ❌
Solution
Changed nested lock acquisition to only acquire locks that aren't already held by the outer context:
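A minimal sketch of that idea, assuming lock keys are plain strings: the inner call filters out any key the outer context already holds before requesting locks. The function `locksStillNeeded` and the sample URL are hypothetical and do not appear in the PR.

```typescript
// Hypothetical helper: return only the keys the outer context does not hold yet.
function locksStillNeeded(required: readonly string[], alreadyHeld: readonly string[]): string[] {
  const heldSet = new Set(alreadyHeld);
  return required.filter((key) => !heldSet.has(key));
}

// Sample values; the key names follow the ones listed in the root cause analysis.
const sampleRepoUrl = "https://example.com/repo.git";
const heldByOuter = [`cache-aggregated:${sampleRepoUrl}`];
const requiredByInner = [`cache-aggregated:${sampleRepoUrl}`, "cache-operation", "repo-access"];

// The inner call requests only the two locks that are not already held.
const keysToAcquire = locksStillNeeded(requiredByInner, heldByOuter);
console.log(keysToAcquire);
```

With a non-reentrant lock, requesting `cache-aggregated:${repoUrl}` a second time would wait on a lock the same call chain owns, which matches the first scenario in the root cause analysis.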
Changes Made
Files Modified
- apps/backend/src/services/repositoryCache.ts - Fixed 3 deadlock instances with clear comments
- apps/backend/__tests__/unit/services/repositoryCache.unit.test.ts - Added 3 deadlock prevention tests
Test Coverage
Added comprehensive test suite to prevent regression:
- getOrGenerateAggregatedData should not deadlock with nested locks
- getOrParseCommits should not deadlock with filtered options
- getOrParseFilteredCommits should not deadlock when acquiring cache-operation lock
All tests verify operations complete in < 5 seconds instead of timing out after 120 seconds.
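One framework-agnostic way to express a "completes in under 5 seconds" check like the one these tests describe is to race the operation against a timer. The `withTimeout` helper below is a hypothetical sketch, not code from the PR's test suite, which is not shown on this page.

```typescript
// Hypothetical test helper: reject if `operation` does not settle within `ms`.
async function withTimeout<T>(operation: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([operation, timeout]);
  } finally {
    // Clear the timer so a fast operation does not keep the process alive.
    clearTimeout(timer);
  }
}

// Usage sketch: a stand-in for a cache call that should finish well within 5 seconds.
const quickOperation = new Promise<string>((resolve) => setTimeout(() => resolve("done"), 10));
withTimeout(quickOperation, 5000).then((result) => console.log(result));
```

A deadlocked call would never settle, so the helper's rejection surfaces the regression after the chosen limit instead of after the 120-second lock timeout.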
Verification Results
✅ Build & Tests
✅ Manual Endpoint Testing
No lock timeout errors in server logs ✅
Impact
Testing Checklist
Breaking Changes
None - this is a pure bug fix with no API changes.
Additional Notes
This fix eliminates all nested lock acquisition deadlocks in the repository cache system. The pattern of using "unlocked" internal methods is maintained, and the fix adds clear documentation to prevent future regressions.