Skip to content

fix: handle deadlock conditions#145

Merged
worg merged 2 commits intogo-stomp:masterfrom
mfmarche:handle_disconnect_deadlock
Nov 10, 2025
Merged

fix: handle deadlock conditions#145
worg merged 2 commits intogo-stomp:masterfrom
mfmarche:handle_disconnect_deadlock

Conversation

@mfmarche
Copy link

@mfmarche mfmarche commented Nov 5, 2025

Found a deadlock scenario that I hit.

  1. A user calls Disconnect(), which locks c.closeMutex.
  2. At the same time, the processLoop goroutine encounters a network error and calls MustDisconnect().
  3. MustDisconnect() blocks, waiting to acquire c.closeMutex, which is held by Disconnect().
  4. Since processLoop is now blocked, it can no longer read from c.writeCh.
  5. Disconnect() blocks forever, waiting to send the DISCONNECT frame to c.writeCh (if the channel is full) or waiting for a RECEIPT that will never arrive because processLoop is stuck.

Resolved by adding a sendFrame with a timeout, and no longer holding the closeMutex.

Also found other situations where Ack/Nack would cause deadlock conditions. Those were also resolved.

@worg
Copy link
Collaborator

worg commented Nov 5, 2025

LGTM there are some examples failing in latest go version, i'll fix those

@mfmarche
Copy link
Author

mfmarche commented Nov 5, 2025

LGTM there are some examples failing in latest go version, i'll fix those

thanks, I did notice those, I wasn't exactly sure how you wanted them fixed, which is why I held off.

@worg
Copy link
Collaborator

worg commented Nov 8, 2025

fixed the pipeline, please update your branch

Found a deadlock scenario that I hit.

1. A user calls Disconnect(), which locks c.closeMutex.
2. At the same time, the processLoop goroutine encounters a network error and calls MustDisconnect().
3. MustDisconnect() blocks, waiting to acquire c.closeMutex, which is held by Disconnect().
4. Since processLoop is now blocked, it can no longer read from c.writeCh.
5. Disconnect() blocks forever, waiting to send the DISCONNECT frame to c.writeCh (if the channel is full) or waiting for a RECEIPT that will never arrive because processLoop is stuck.

Resolved by adding a sendFrame with a timeout, and no longer holding the
closeMutex.

Also found other situations where Ack/Nack would cause deadlock
conditions. Those were also resolved.
@mfmarche mfmarche force-pushed the handle_disconnect_deadlock branch from 9c99c3a to 5bbf998 Compare November 10, 2025 18:40
@worg worg merged commit e1987b3 into go-stomp:master Nov 10, 2025
2 checks passed
@mfmarche
Copy link
Author

thanks @worg!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants