-
Notifications
You must be signed in to change notification settings - Fork 97
Multi-Threading is Broken - Needs Simplification #634
Description
What's wrong
-
The main bug: The worker thread uses
condition_variable::wait()without a predicate. This means threads can wake up spuriously, see an empty queue, and bail out - silently dropping tasks. Entities randomly don't get stepped. -
No RAII locks: There are 36 manual
lock()/unlock()pairs. If anything throws an exception, the mutex stays locked forever → deadlock. -
Too many mutexes: 10 separate mutexes protecting simple boolean flags and related timing data. This is overkill and creates complex ordering rules.
The Fix
Convert 5 mutexes → atomics (for simple bool flags)
| Delete This Mutex | Replace With |
|---|---|
finished_mutex_ |
std::atomic<bool> finished_ |
exit_mutex_ |
std::atomic<bool> exit_ |
paused_mutex_ |
std::atomic<bool> paused_ |
single_step_mutex_ |
std::atomic<bool> single_step_ |
take_step_mutex_ |
std::atomic<bool> take_step_ |
Consolidate 3 timing mutexes → 1
These all protect related timing state and are often needed together (computing actual time warp requires both elapsed wall time and sim time atomically). Also, time_warp_mutex_ is declared but never used.
| Delete These | Replace With |
|---|---|
timer_mutex_ |
timing_mutex_ (single mutex) |
time_mutex_ |
|
time_warp_mutex_ |
Keep 2 mutexes (with improvements)
| Mutex | Change |
|---|---|
contacts_mutex_ |
Convert to std::shared_mutex (concurrent reads) |
entity_pool_mutex_ |
Keep as-is, fix the condition variable predicate |
Other changes
- Replace all 36 manual
lock()/unlock()calls with RAII (lock_guard/unique_lock) - Add predicate lambda to
condition_variable::wait()to fix spurious wakeups
Total: 10 mutexes → 3 mutexes + 5 atomics