-
Notifications
You must be signed in to change notification settings - Fork 32
fix: Prevent deadlock when releasing a realtime channel #693
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The RealtimeChannels mutex is acquired to retrieve the channel relating to an inbound message, so holding that mutex whilst waiting for Detach to complete causes a deadlock because the inbound DETACHED message the Detach call is waiting for also tries to acquire the mutex. This commit fixes the deadlock by releasing the mutex before calling Detach, and then acquiring it again once it completes in order to release the channel. Signed-off-by: Lewis Marshall <lewis.marshall@ably.com>
WalkthroughAdjusts locking sequence in RealtimeChannels.Release to unlock before Detach and re-lock for map mutation. Adds an integration test to verify channel release triggers proper detach, state transitions, and channel removal. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant App as Client Code
participant RC as RealtimeChannels
participant M as Mutex
participant C as Channel
participant Map as ChannelMap
App->>RC: Release(name)
RC->>M: Lock
RC->>Map: Get(name)
RC->>M: Unlock
alt Channel exists
RC->>C: Detach()
Note right of C: Detach proceeds without RC lock
RC->>M: Lock
RC->>Map: Delete(name)
RC->>M: Unlock
RC-->>App: Return
else No channel
RC-->>App: Return (no-op)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
sacOO7
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
ably/realtime_channel_integration_test.go (1)
345-388: Consider adding tests for the deadlock scenario and edge cases.The test effectively verifies the happy path: channel attachment, release, state transitions, and cleanup. However, it doesn't test the specific deadlock scenario that the PR fixes or important edge cases.
Consider adding tests for:
- Deadlock scenario: A test that verifies
Releasedoesn't block when messages arrive during detach (the original bug)- Concurrent Release: Multiple goroutines calling
Releaseon the same channel simultaneously- Release on detached channel: Calling
Releaseon a channel that's alreadyDETACHEDorINITIALIZED- Release + Get race: Concurrent
GetandReleaseon the same channel nameWould you like me to generate test cases for these scenarios?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
ably/realtime_channel.go(1 hunks)ably/realtime_channel_integration_test.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
ably/realtime_channel_integration_test.go (6)
ablytest/sandbox.go (1)
NewRealtime(108-115)ablytest/recorders.go (1)
FullRealtimeCloser(202-204)ablytest/resultgroup.go (2)
Wait(17-31)ConnWaiter(116-163)ably/state.go (6)
ChannelStateChange(601-621)ChannelStateChange(623-623)ChannelStateAttaching(524-524)ChannelStateAttached(528-528)ChannelStateDetaching(532-532)ChannelStateDetached(535-535)ably/export_test.go (1)
ChannelStateChanges(203-203)ablytest/timeout.go (1)
Soon(12-12)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: integration-test (1.24, json)
- GitHub Check: integration-test (1.23, json)
- GitHub Check: integration-test (1.25, json)
- GitHub Check: integration-test (1.19, msgpack)
- GitHub Check: integration-test (1.19, json)
- GitHub Check: integration-test (1.20, json)
🔇 Additional comments (1)
ably/realtime_channel.go (1)
208-223: Approve deadlock fix as-is. The revised lock/unlock aroundDetachcorrectly prevents the previous deadlock, and our search found no concurrentGet/Releaseusage that would trigger the narrow race window.
The RealtimeChannels mutex is acquired to retrieve the channel relating to an inbound message, so holding that mutex whilst waiting for Detach to complete causes a deadlock because the inbound DETACHED message the Detach call is waiting for also tries to acquire the mutex.
This commit fixes the deadlock by releasing the mutex before calling Detach, and then acquiring it again once it completes in order to release the channel.
Summary by CodeRabbit
Bug Fixes
Tests