Skip to content

Conversation

@dirk-zimoch
Copy link
Contributor

The scope of opslock was too small in processRequests(). It would need to overlap with the scope of clientlock but that could lead to deadlock. Solution: use only clientlock with large enough scope and drop opslock.

The scope of opslock was too small in processRequests().
It would need to overlap with the scope of clientlock but
that could lead to deadlock. Solution: use only clientlock
with large enough scope and drop opslock.
@dirk-zimoch dirk-zimoch added the crash Something isn't working label Jan 28, 2025
@ralphlange
Copy link
Member

Are you sure you can drop the opslock?

The 'outstanding operations' map is being used by multiple independent threads: EPICS threads that can add operations and either low-level driver threads (incoming data) or timer threads (timeouts) that remove them.
If that is the case, the map needs to be protected.

The deadlock argument is only true if both sides take both locks in different order. For that, the code under the opslock would have to take the clientlock.

@dirk-zimoch
Copy link
Contributor Author

Always when opslock would have been taken, clientlock is taken as well. In particular, the session workerThread loop keeps clientlock except while it is waiting for events. Thus, all callbacks (readComplete, writeComplete) are run with clientlock hold. processRequests takes clientlock before checking isConnected, so that connection status cannot change (and most important client cannot become invalid) before sending the request.

@ralphlange
Copy link
Member

That's the architectural difference between the two low-level drivers that I was unsure about.
Fine, then!

@ralphlange ralphlange merged commit b351023 into epics-modules:master Feb 11, 2025
5 of 21 checks passed
@dirk-zimoch
Copy link
Contributor Author

See also the comments on clientlock in the workerThread's run() function.

@dirk-zimoch dirk-zimoch deleted the bugfix_race_condition branch March 13, 2025 10:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

crash Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants