sim/mujoco: add concurrency stress test for _require_no_running_policy mutation guard #119

@cagataycali

Description

Problem

The mutation guard (_require_no_running_policy) is the load-bearing safety
mechanism that stops the LLM from scheduling a scene mutation while a policy
worker is mid-step. If it has a race, we get SIGSEGV on stale model/data
pointers.

We currently trust that:

  1. The write self._policy_threads[robot_name] = future and the read
    any(not f.done() for f in ...) are atomic enough in practice
  2. The executor's submit() of the policy worker happens-before the next
    mutation's check (i.e. the dict assignment is visible before
    _require_no_running_policy runs)
  3. A policy worker finishing and setting its future to done() doesn't race
    with a mutation that checks "is anything running" mid-transition

No test proves any of this.
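
For reference, here is a minimal sketch of how the three assumptions above could be made explicit instead of trusted. It is not the current implementation: the class name LockedGuard, the methods start_policy and mutate, the lock, and the error dict shape are all hypothetical. Only _policy_threads and _require_no_running_policy are names taken from this issue.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class LockedGuard:
    """Hypothetical sketch of a lock-protected mutation guard.

    Only _policy_threads and _require_no_running_policy are names from the
    issue; everything else here is an illustrative assumption.
    """

    def __init__(self) -> None:
        self._executor = ThreadPoolExecutor(max_workers=4)
        self._policy_threads: dict[str, Future] = {}
        self._lock = threading.Lock()

    def start_policy(self, robot_name, worker):
        # Submit and register under one lock, so a mutation can never
        # observe a submitted worker that is not yet in the dict.
        with self._lock:
            future = self._executor.submit(worker)
            self._policy_threads[robot_name] = future
        return future

    def _require_no_running_policy(self):
        # Caller must hold self._lock. Returns an error dict or None.
        running = [name for name, f in self._policy_threads.items() if not f.done()]
        if running:
            return {"status": "error", "message": "policy running for: " + ", ".join(running)}
        return None

    def mutate(self, apply_change):
        # Hold the lock across check AND mutation, so a policy cannot be
        # started between the check and the scene change.
        with self._lock:
            error = self._require_no_running_policy()
            if error is not None:
                return error
            apply_change()
            return {"status": "success"}
```

In this sketch a worker that finishes while the guard is checking can only cause a spurious rejection, never a mutation under a live worker, because a future reports done() only after the worker function has returned.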

Proposal

Add tests/simulation/mujoco/test_concurrency.py:

  • Test 1: mutation-during-policy rejected. Start a policy that runs for ~5
    seconds. From the main thread, rapidly call set_gravity() 1000x. Assert:
    every single call returns a friendly error, none succeed, no segfault.
  • Test 2: rapid start→stop→start→stop. Stress the future lifecycle. Assert
    _policy_threads state stays consistent.
  • Test 3: mutation accepted immediately after policy completes. Assert
    that once the future is done, the very next mutation succeeds — no
    lingering state.

Use a cheap mock policy (lambda obs: {"qpos": [0]*n}) so the stress loop
doesn't take minutes.
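
A rough sketch of what the three tests might look like. FakeSim is a hypothetical stand-in so the example runs without MuJoCo; its start_policy and stop_policy methods, the stop event, and the result dict shape are assumptions, not the real API. The real tests would drive the actual simulation class, and the policy here runs until stopped rather than for ~5 seconds so the outcome does not depend on timing.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor

N_JOINTS = 7  # assumed joint count for the mock policy


class FakeSim:
    """Hypothetical stand-in for the simulation wrapper (not the real class)."""

    def __init__(self) -> None:
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._policy_threads: dict[str, Future] = {}
        self._stop = threading.Event()
        self.gravity = (0.0, 0.0, -9.81)

    def start_policy(self, robot_name, policy):
        self._stop.clear()

        def worker():
            steps = 0
            # Step the policy until asked to stop; wait() returns True once set.
            while not self._stop.wait(0.001):
                policy({"qpos": [0.0] * N_JOINTS})
                steps += 1
            return steps

        self._policy_threads[robot_name] = self._executor.submit(worker)

    def stop_policy(self, robot_name):
        self._stop.set()
        self._policy_threads[robot_name].result(timeout=5)

    def _require_no_running_policy(self):
        if any(not f.done() for f in self._policy_threads.values()):
            return {"status": "error", "message": "a policy is running; stop it first"}
        return None

    def set_gravity(self, gravity):
        error = self._require_no_running_policy()
        if error is not None:
            return error
        self.gravity = tuple(gravity)
        return {"status": "success"}


mock_policy = lambda obs: {"qpos": [0] * N_JOINTS}  # cheap mock, as proposed


def test_mutation_during_policy_rejected():
    sim = FakeSim()
    sim.start_policy("arm", mock_policy)
    results = [sim.set_gravity((0, 0, 0)) for _ in range(1000)]
    sim.stop_policy("arm")
    assert all(r["status"] == "error" for r in results)
    assert sim.gravity == (0.0, 0.0, -9.81)  # no call slipped through


def test_rapid_start_stop():
    sim = FakeSim()
    for _ in range(50):
        sim.start_policy("arm", mock_policy)
        sim.stop_policy("arm")
        assert sim._policy_threads["arm"].done()


def test_mutation_accepted_after_policy_completes():
    sim = FakeSim()
    sim.start_policy("arm", mock_policy)
    sim.stop_policy("arm")
    assert sim.set_gravity((0, 0, -1.62))["status"] == "success"
```

A segfault would kill the test process outright, so "no segfault" is asserted implicitly by the run completing.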

Acceptance

  • Tests are marked pytest.mark.slow if needed (so they run in
    hatch run test-integ), or kept fast enough for hatch run test if the
    mock is trivial
  • Any discovered races get fixed + regression-tested before closing this issue

Surfaced by a second-opinion review of PR #85.

Status: Done
