-
Notifications
You must be signed in to change notification settings - Fork 14
feat: MuJoCo simulation backend — AgentTool with 35 actions #85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
cagataycali
wants to merge
83
commits into
strands-labs:main
Choose a base branch
from
cagataycali:feat/mujoco-backend
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
83 commits
Select commit
Hold shift + click to select a range
00bad69
feat: MuJoCo simulation backend — AgentTool with 35 actions
cagataycali 609ccc6
fix: address all review comments — ABC, thread-safety, injection, cle…
cagataycali 6cc4239
fix: resolve lint errors — import ordering, format, strict param
cagataycali b3a04d2
fix: acquire _lock around MuJoCo data mutations + sanitize all XML names
cagataycali a6f12bc
ci: add MuJoCo system deps (libosmesa6-dev + MUJOCO_GL=osmesa)
cagataycali 166bea8
feat: add [sim] extra with mujoco dependency
cagataycali 4825e44
fix: rename [sim] extra to [sim-mujoco] per review
08eed8c
fix: rebase on simulation-foundation — SimulationBackend→SimEngine, u…
cagataycali 6b8f6c2
fix: resolve all mypy errors — mixin overrides, Optional types, impor…
cagataycali ea4b0e0
fix: properly fix mypy errors instead of blanket suppression
cagataycali 2b8c46b
fix: zero mypy suppressions — proper type declarations instead of dis…
cagataycali b28b410
feat(sim): use require_optional for imageio in policy_runner
cagataycali 2c7a5c7
fix: address 8 review threads — deps, exports, init, headless, coupli…
cagataycali ce63012
fix: replace Protocol annotation with direct TYPE_CHECKING stubs in P…
strands-agent 0131c88
refactor(mujoco): migrate SimWorld private fields to _backend_state
a1fc8f9
fix(mujoco): resolve 4 bugs — add_robot world model, eval_policy doub…
cagataycali 5a3686c
fix: sync sim_time/step_count in replay_episode and eval_policy, add …
cagataycali f909dba
fix(mujoco): prevent C-level abort on headless without EGL/OSMesa
cagataycali c534e0a
fix(mujoco): resolve mesh path mismatch during robot injection
cagataycali 08c4f80
fix(mujoco): forward observation_mapping/action_mapping through tool_…
cagataycali 99e61c8
refactor(sim): extract backend-agnostic PolicyRunner
cagataycali f7c5f7f
fix(mujoco): make renderer cache thread-local to prevent CGL segfault
cagataycali 77c8719
test(mujoco): regression for renderer thread-safety (CGL segfault)
cagataycali 815d09e
fix(mujoco): reset mj_saveLastXML global state for all inject/eject p…
cagataycali 8ab990c
feat(mujoco): support multiple same-config robots via XML namespacing
cagataycali 9d90116
fix(mujoco): physics + recording callsites for namespaced entities
cagataycali 2f5297b
refactor(sim): tighten get_observation schema, drop camera_name param
cagataycali 8b17112
refactor(sim): consolidate run_policy video params + XML injection fu…
cagataycali 8d03ffa
fix: address 10 blocking review issues from @yinsong1986
cagataycali 57f98ac
fix: extend thread-safety locks, correct apply_force docstring, add t…
cagataycali 452dbda
test: add camera roundtrip + multi-robot asset dir regression tests
cagataycali c1f39c7
fix: address review feedback from @yinsong1986 (2026-05-01)
cagataycali e757cc9
style: drop date reference in reset comment (non-blocking nit)
strands-agent b5fb2f5
fix: block scene mutations while policy is running (concurrency guard)
strands-agent 1557ce1
fix: add concurrency guards to move_object and remove_robot
cagataycali b89d83c
fix: narrow lock gap in set_body/geom_properties + add @requires_gl t…
strands-agent 6781ba6
fix: mock importlib in factory test so it passes when mujoco IS insta…
strands-agent c724b34
feat(sim): multi-camera capture, public DX polish, MP4 recording fix
cagataycali b162281
fix: update _stop_policy → stop_policy in regression tests
strands-agent 0b95948
test: mirror tests/ layout to strands_robots/ source tree
cagataycali c3c7ff9
fix(tests): add CI skipif guards to recording tests in test_rendering.py
cagataycali f83d39b
fix: resolve lint errors from test restructuring
cagataycali b2498ed
chore: strip emojis/dividers from logs+strings, fix leading-space art…
cagataycali 4904164
feat(sim): prefix joint names per-robot in multi-robot recordings
cagataycali 30e35c0
test: add coverage for error paths, module API, multi-robot recordings
cagataycali 1cf4465
chore: apply ruff format/lint fixes
cagataycali e10aeb6
fix(tests): use inline XML fixtures, avoid network-dependent so101 model
cagataycali 4454717
fix(tests): use inline XML fixtures in test_recording_paths to avoid …
cagataycali 0bc42c2
fix(ci): add features=["all"] to hatch env, add importorskip guards
cagataycali 13d125f
fix(ci): install ffmpeg for torchcodec, add remaining importorskip gu…
cagataycali 294d280
fix(ci): install ffmpeg for torchcodec, handle video decode RuntimeError
cagataycali 64ac60b
fix(tests): reduce policy duration and increase result timeout for CI
cagataycali 4ab6454
fix(sim/mujoco): T7/T9/T10 input validation
cagataycali 39578ef
fix(sim/mujoco): T5/T8/T11/T38 concurrency guards + input validation
cagataycali cdb31af
fix(sim/mujoco): T6 add_robot leaves clean zero state
cagataycali 0c11652
fix(sim/mujoco): T3 render/render_depth reject unknown camera_name
cagataycali ec011c4
fix(sim/mujoco): T2 add_camera target actually orients the view
cagataycali d666f59
fix(sim/mujoco): T1/T13 router validation + tool_spec parity
cagataycali 48c0fc0
fix(sim/mujoco): T4 renderer TLS cache cleanup + reuse
cagataycali 8344eee
fix(sim/mujoco): T12 split video-recording story clearly
cagataycali 719fe2a
fix(sim/mujoco): T14/T15/T45 unified error messages
cagataycali cc45b7e
fix(sim/mujoco): T16/T24 idempotent stop-family + friendly stop_policy
cagataycali 38c8ee5
fix(sim/mujoco): T18/T19 mj_forward before reading mass matrix + cont…
cagataycali cf64997
fix(sim/mujoco): T20/T21 friendly render dim + depth warning
cagataycali c3447f6
feat(sim/mujoco): T32/T33/T35/T42 per-entity filters + URDF path vali…
cagataycali ac06c99
feat(sim/mujoco): T27/T28/T29/T30/T34/T41 API-ergonomics batch
cagataycali e58a9e4
feat(sim/mujoco): T31/T17/T37 recording-status lifecycle + observability
cagataycali 10f31ad
feat(sim/mujoco): T22/T23/T25 parameter unification
cagataycali f5c8518
chore(sim/mujoco): strip T# tracking comments, inline world-check, cl…
cagataycali 5f3a2f7
docs(sim/mujoco): T40/T47/T48/T49/T50/T51 document namespacing + impr…
cagataycali 0c8621d
docs: D1 CHANGELOG.md + D4 README simulation section for PR #85
cagataycali 9ab61f5
fix(tests): resolve tool_spec.json via module path, not hardcoded abs…
cagataycali 4ddf236
style: fix ruff format in test_agenttool_contract.py
3ef5afb
fix(sim/mujoco): mixed data_config robots coexist in one scene
cagataycali 9cdf12f
fix(sim/mujoco): remove_robot actually removes the robot from the com…
cagataycali 54b2e55
test(sim): lift factory.py 26→97% and policy_runner 52→91% coverage
cagataycali c73465f
test(sim): lift backend.py 58→93% + add MJCF builder unit tests
cagataycali 7d9a0d9
fix(registry): list_urdfs column widths for narrow terminals (#113)
cagataycali 53fcf47
perf(sim/mujoco): skip camera render in get_observation for non-VLA p…
cagataycali e26275b
perf(sim/mujoco): cache tool_spec JSON at module load
cagataycali d275277
test(hygiene): guard against committed host-specific absolute paths
cagataycali 3a6ad50
fix(sim/mujoco): load_scene + add_* interaction (#115)
cagataycali 306220e
feat(sim/mujoco): support concurrent per-robot policies (#114)
cagataycali File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -10,3 +10,7 @@ dist | |
| .strands_robots | ||
| .coverage | ||
| .ideation/ | ||
| MUJOCO_LOG.TXT | ||
| TASKS.md | ||
| TASKS_TO_FIX_85.md | ||
| .coverage.* | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,187 @@ | ||
| # CHANGELOG | ||
|
|
||
| All notable behavioural changes to `strands-robots` are logged here. Follows | ||
| [Keep a Changelog](https://keepachangelog.com/) conventions. | ||
|
|
||
| ## Unreleased — PR #85 (MuJoCo backend remediation) | ||
|
|
||
| ### Breaking | ||
|
|
||
| These changes tighten the MuJoCo AgentTool contract. Legacy callers that | ||
| silently worked by accident will now receive a clear error instead: | ||
|
|
||
| - **Router input validation**: The ``_dispatch_action`` router rejects any | ||
| top-level parameter that isn't declared on the target method. Passing | ||
| ``step(num_steps=5)`` (wrong name) or ``set_gravity(device="mps")`` | ||
| (stray kwarg) now errors with *"Unknown parameter X for action Y. | ||
| Valid: [...]"* instead of silently dropping the value. Methods whose | ||
| Python signature includes ``**kwargs`` (e.g. ``add_object``) keep their | ||
| pass-through semantics. | ||
| - **Missing required args**: produce *"Action X requires parameter Y."* | ||
| instead of a raw Python ``TypeError``. | ||
| - **Vector dimension validation**: ``position``, ``target``, ``origin``, | ||
| ``force``, ``torque``, ``gravity``, ``direction``, ``point``, ``orientation`` | ||
| (quaternion), and ``color`` (rgba) all validated for length + numeric | ||
| dtype before reaching numpy/MuJoCo. | ||
| - **Camera orientation**: ``add_camera(target=[x,y,z])`` is now honoured | ||
| by baking ``xyaxes`` into the MJCF ``<camera>``. Previously the target | ||
| was silently dropped and every custom camera rendered a default view. | ||
| Degenerate case (``target == position``) errors. | ||
| - **Render camera validation**: ``render(camera_name="missing")`` errors | ||
| with *"Camera 'missing' not found."* instead of silently falling back | ||
| to the free camera while claiming to render from the named one. | ||
| - **Raycast zero-direction guard**: ``raycast(direction=[0,0,0])`` now | ||
| errors with *"direction vector is zero-length"*. Previously MuJoCo's | ||
| C-level ``mj_ray`` would abort the Python process. | ||
| - **apply_force requires a non-zero vector**: passing neither ``force`` | ||
| nor ``torque`` (or both zero) errors. Previously the call silently | ||
| succeeded with no effect. | ||
| - **step(n_steps<0)** rejected (previously it corrupted ``step_count``). | ||
| - **Negative mass / timestep / size** rejected per shape; previously | ||
| ``set_body_properties(mass=-1)`` and ``set_timestep(-0.01)`` silently | ||
| succeeded. | ||
| - **Plane objects auto-static**: ``add_object(shape="plane")`` now forces | ||
| ``is_static=True`` (planes are infinite in MuJoCo). Explicit | ||
| ``is_static=False`` on a plane is a hard error. | ||
| - **Duplicate camera name** rejected. Previously a second ``add_camera`` | ||
| with an existing name silently overwrote the registry entry while | ||
| leaving the old camera in the XML — ghost behaviour. Use | ||
| ``remove_camera`` + ``add_camera`` to replace. | ||
| - **stop_policy(robot_name='')** errors with *"stop_policy requires | ||
| 'robot_name'."* instead of silently matching the first robot. | ||
| - **eval_policy** requires an explicit ``robot_name``. Default | ||
| ``n_episodes`` lowered from 10 to 1. | ||
| - **register_urdf** validates the path: file must exist, be a file, and | ||
| be readable. Previously bad paths were cached and blew up later. | ||
|
|
||
| ### Recording backend split | ||
|
|
||
| - ``start_recording`` (LeRobotDataset: parquet + per-camera MP4) still | ||
| requires the ``[lerobot]`` extra. Its error message when lerobot is | ||
| missing now points callers at ``start_cameras_recording`` for plain | ||
| MP4 (which runs under ``[sim-mujoco]`` alone via imageio-ffmpeg). | ||
| - No API change — the fix is informational. | ||
|
|
||
| ### Resource hygiene | ||
|
|
||
| - ``destroy()`` and ``cleanup()`` now close renderers on the main thread | ||
| and empty the TLS cache. Previously each ``create_world/destroy`` | ||
| cycle leaked one ``mujoco.Renderer`` + its GL context (~33 MB per | ||
| cycle measured). Worker-thread renderers still release themselves on | ||
| thread teardown (we avoid cross-thread ``close()`` to prevent | ||
| ``cgl.free()`` SIGSEGVs on macOS). | ||
| - ``get_mass_matrix`` and ``get_contacts`` run ``mj_forward`` first so | ||
| values are valid immediately after a ``reset`` or ``add_robot`` | ||
| (previously returned stale / uninitialised memory). | ||
|
|
||
| ### Concurrency guards | ||
|
|
||
| Write-mutations are now refused while a policy is running on any robot | ||
| in the world. Previously these could race the policy worker thread and | ||
| produce undefined behaviour or SIGSEGV: | ||
|
|
||
| reset, set_gravity, set_timestep, set_joint_positions, | ||
| set_joint_velocities, apply_force, set_body_properties, | ||
| set_geom_properties, load_state, randomize, move_object | ||
|
|
||
| The error now lists *which* robot(s) are active so the LLM can | ||
| ``stop_policy`` on each without guessing: *"Cannot 'X' while a policy | ||
| is running on 'armA', 'armB'. Stop it first: action='stop_policy'."* | ||
|
|
||
| ### Concurrent per-robot policies (GH #114) | ||
|
|
||
| Multiple ``start_policy`` calls on *different* robots now run | ||
| concurrently. MuJoCo physics is still serialized via ``self._lock`` | ||
| (``mj_step`` and ``ctrl[]`` writes are not thread-safe for concurrent | ||
| mutation), but each policy owns a disjoint slice of ``data.ctrl[]`` so | ||
| two VLA arms can operate in the same scene without semantic conflict. | ||
|
|
||
| - ``start_policy("armA")`` + ``start_policy("armB")`` both succeed. | ||
| Second call no longer hits a global "policy already running" gate. | ||
| - ``start_policy`` on the *same* robot while its policy is active | ||
| still errors (unchanged). | ||
| - ``remove_robot("X")`` now gracefully stops X's own policy before | ||
| removing, instead of requiring a prior ``stop_policy("X")``. Still | ||
| errors if a *different* robot has an active policy (XML round-trip | ||
| invalidates cached IDs everywhere). | ||
| - New action ``list_policies_running`` returns the names of robots | ||
| with live policies. Prunes completed Futures as a side-effect. | ||
| - Completed policy Futures are no longer retained forever in | ||
| ``_policy_threads`` (GH #120 companion fix). | ||
|
|
||
| ### Error message consistency | ||
|
|
||
| - All "no world" paths return the same string: | ||
| *"No world. Call create_world (or load_scene) first."* | ||
| - Unknown-name errors use a uniform ``<Kind> 'X' not found.`` shape | ||
| (Robot / Object / Body / Geom / Joint / Sensor / Camera / Checkpoint). | ||
| - ``stop_recording``, ``stop_cameras_recording``, ``stop_policy``, | ||
| ``close_viewer`` are now **idempotent**: calling them when nothing | ||
| is running returns ``status="success"`` with a *"Was not ..."* message | ||
| so callers can invoke them unconditionally. | ||
| - ``get_recording_status`` returns success in every lifecycle state | ||
| (no world / not recording / recording). | ||
|
|
||
| ### Deprecations | ||
|
|
||
| - **add_robot name-as-registry fallback**: passing ``name="my_bot"`` | ||
| without ``urdf_path`` or ``data_config`` used to resolve ``my_bot`` in | ||
| the model registry. This now fires a ``DeprecationWarning``. Use | ||
| ``add_robot(name="...", data_config="<registry_key>")`` instead. Will | ||
| be removed next major release. | ||
|
|
||
| ### New / extended actions | ||
|
|
||
| - ``forward_kinematics(body_name="X")`` filters to a single body. | ||
| - ``get_features(robot_name="X")`` filters to a single robot's joints | ||
| and actuators. | ||
| - ``set_geom_properties(geom_name="X")`` accepts the bare object name | ||
| as an alias for the injected ``"{name}_geom"``. | ||
| - ``render_all`` flags cameras whose frame has near-zero pixel variance | ||
| (``"⚠️ camera 'X': image appears empty (variance < 1)"``). | ||
| - ``render_depth`` surfaces MuJoCo's one-time ``ARB_clip_control`` | ||
| warning in the response text on macOS, so the LLM knows when depth | ||
| accuracy is reduced. | ||
| - ``render`` / ``render_depth``: width/height validated up front; | ||
| oversized requests get a plain-English message naming the actual | ||
| framebuffer cap (``<global offwidth=...>``) instead of MuJoCo's raw | ||
| error. | ||
| - ``run_policy`` / ``start_policy``: accept optional ``n_steps`` | ||
| (primary) or legacy ``max_steps`` as an alternative to | ||
| ``duration``+``control_frequency``. ``duration = n_steps / | ||
| control_frequency`` when ``n_steps`` is set. | ||
| - **New ``list_policies_running``** action returns the names of robots | ||
| with a live policy — pairs with the new concurrent-policy support | ||
| (see *Concurrent per-robot policies* above). | ||
| - ``randomize(randomize_physics=True)`` now reports per-body mass scales | ||
| and per-geom friction scales in the response (not just range | ||
| endpoints). | ||
| - ``get_contacts`` resolves unnamed geoms to | ||
| ``"<body_name>/geom_<id>"`` so contact pairs are always human-readable. | ||
| - ``get_sensor_data(sensor_name="X")`` on a model with no sensors now | ||
| distinguishes *"Sensor 'X' not found. Model has no sensors."* from | ||
| the generic "no sensors in model" success. | ||
|
|
||
| ### Tests | ||
|
|
||
| - New: ``tests/simulation/mujoco/test_agenttool_contract.py`` — ~50 | ||
| tests that lock in router validation, tool_spec ↔ method parity, | ||
| unified error messages, idempotent stop family, ``mj_forward`` before | ||
| reads, render-dim validation, feature filters, camera duplicate | ||
| policy, plane auto-static, policy horizon unification, and more. | ||
| - New: ``tests/simulation/mujoco/test_renderer_hygiene.py`` — 4 tests | ||
| asserting TLS cache is emptied on ``destroy``, renderer reuse works | ||
| for identical ``(w,h)``, and ``create_world`` after ``destroy`` | ||
| rebuilds cleanly. | ||
| - New: ``tests/simulation/mujoco/test_recording_backends.py`` — 2 tests | ||
| (one skipped when ``lerobot`` IS installed) pinning the | ||
| MP4-without-lerobot backend. | ||
| - New: ``tests/simulation/mujoco/test_input_validation.py`` — 11 tests | ||
| for step/raycast/apply_force validation. | ||
| - New: ``tests_integ/test_resource_hygiene.py`` — 3 integration tests | ||
| (require ``psutil``): 50 create/destroy cycles grow RSS < 50 MB; 500 | ||
| renders at fixed dims grow RSS < 100 MB; TLS cache cleared on destroy. | ||
|
|
||
| Test count: **256 → 362** (+106 new regression tests), zero | ||
| regressions. ``hatch run lint`` (ruff + mypy) clean across 102 source | ||
| files. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.