Skip to content

Multi-Threading is Broken - Needs Simplification #634

@EthanMBoos

Description

@EthanMBoos

What's wrong

  1. The main bug: The worker thread uses condition_variable::wait() without a predicate. This means threads can wake up spuriously, see an empty queue, and bail out - silently dropping tasks. Entities randomly don't get stepped.

  2. No RAII locks: There are 36 manual lock()/unlock() pairs. If anything throws an exception, the mutex stays locked forever → deadlock.

  3. Too many mutexes: 10 separate mutexes protecting simple boolean flags and related timing data. This is overkill and creates complex ordering rules.

The Fix

Convert 5 mutexes → atomics (for simple bool flags)

Delete This Mutex Replace With
finished_mutex_ std::atomic<bool> finished_
exit_mutex_ std::atomic<bool> exit_
paused_mutex_ std::atomic<bool> paused_
single_step_mutex_ std::atomic<bool> single_step_
take_step_mutex_ std::atomic<bool> take_step_

Consolidate 3 timing mutexes → 1

These all protect related timing state and are often needed together (computing actual time warp requires both elapsed wall time and sim time atomically). Also, time_warp_mutex_ is declared but never used.

Delete These Replace With
timer_mutex_ timing_mutex_ (single mutex)
time_mutex_
time_warp_mutex_

Keep 2 mutexes (with improvements)

Mutex Change
contacts_mutex_ Convert to std::shared_mutex (concurrent reads)
entity_pool_mutex_ Keep as-is, fix the condition variable predicate

Other changes

  • Replace all 36 manual lock()/unlock() calls with RAII (lock_guard/unique_lock)
  • Add predicate lambda to condition_variable::wait() to fix spurious wakeups

Total: 10 mutexes → 3 mutexes + 5 atomics

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions