strands-labs · cagataycali · Mar 31, 2026
diff --git a/.github/workflows/test-lint.yml b/.github/workflows/test-lint.yml
@@ -20,19 +20,30 @@ jobs:
           ref: ${{ inputs.ref }}
           persist-credentials: false
 
+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
+        with:
+          version: "latest"
+
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
           python-version: '3.12'
-          cache: 'pip'
 
-      - name: Install dependencies
+      - name: Install system dependencies (OpenGL for MuJoCo)
         run: |
-          pip install --no-cache-dir hatch
-          pip install --no-cache-dir -e ".[all,dev]"
+          sudo apt-get update
+          sudo apt-get install -y libosmesa6-dev
+
+      - name: Install dependencies
+        run: uv sync --extra all --extra dev
 
       - name: Run lint
-        run: hatch run lint
+        run: |
+          uv run ruff check .
+          uv run ruff format --check .
 
       - name: Run tests
-        run: hatch run test -x --strict-markers
+        env:
+          MUJOCO_GL: osmesa
+        run: uv run pytest tests/ -x --strict-markers
diff --git a/.gitignore b/.gitignore
@@ -8,4 +8,5 @@ camera_samples
 build
 dist
 .strands_robots
-.coverage
+.coverage
+MUJOCO_LOG.txt
diff --git a/.python-version b/.python-version
@@ -0,0 +1 @@
+3.12
diff --git a/AGENTS.md b/AGENTS.md
@@ -2,7 +2,7 @@
 
 ## Overview
 
-`strands-robots` is a robot control library for [Strands Agents](https://strandsagents.com). It provides policy inference, teleoperation, calibration, and simulation tools for physical robots.
+`strands-robots` is a robot control and simulation library for [Strands Agents](https://strandsagents.com). It provides policy inference, simulation (MuJoCo), dataset recording (LeRobot format), teleoperation, and calibration tools for 38+ robots.
 
 ## Project Dashboard
 
@@ -13,17 +13,20 @@
 > create GitHub issues and add them to this board with Status + Priority set.
 > Never track work only in local markdown — the board is the source of truth.
 
-
-All work is tracked here. When creating follow-up items, create GitHub issues and add them to the board with Status and Priority set.
-
 ## Repository Structure
 
 ```
 strands_robots/
+├── __init__.py            # Public API: Robot, list_robots, resolve_name
+├── factory.py             # Robot("so100") → sim or hardware dispatch
+├── robot.py               # HardwareRobot class (real robot control)
+├── _async_utils.py        # Coroutine resolution helpers
+├── utils.py               # require_optional(), shared utilities
+│
 ├── policies/              # Policy providers (pluggable via registry)
 │   ├── base.py            # Abstract Policy base class
 │   ├── factory.py         # create_policy() factory + registry
-│   ├── mock.py            # MockPolicy for testing
+│   ├── mock.py            # MockPolicy for testing (random actions)
 │   ├── groot/             # NVIDIA GR00T N1.5/N1.6 inference
 │   │   ├── policy.py      # Gr00tPolicy (ZMQ + HTTP modes)
 │   │   ├── client.py      # Gr00tInferenceClient
@@ -33,43 +36,210 @@ strands_robots/
 │       ├── policy.py      # LerobotLocalPolicy (RTC support)
 │       ├── processor.py   # ProcessorBridge (pre/post pipelines)
 │       └── resolution.py  # Policy class resolution (v0.4/v0.5)
-├── registry/              # JSON registry for policy discovery
-├── tools/                 # Strands @tool functions
-│   ├── gr00t_inference.py # GR00T inference tool
-│   ├── lerobot_calibrate.py
-│   ├── lerobot_camera.py
-│   ├── lerobot_teleoperate.py
-│   ├── pose_tool.py
-│   └── serial_tool.py
-├── robot.py               # Core Robot class
-└── utils.py               # Shared utilities (require_optional, etc.)
-
-tests/                     # Unit tests (run with: hatch run test)
-tests_integ/               # Integration tests (run with: hatch run test-integ)
+│
+├── simulation/            # Simulation backends
+│   ├── base.py            # SimulationBackend ABC
+│   ├── factory.py         # create_simulation() dispatch
+│   ├── models.py          # SimWorld, SimRobot, SimObject, SimCamera dataclasses
+│   ├── model_registry.py  # URDF/MJCF path resolution
+│   └── mujoco/            # MuJoCo backend (primary)
+│       ├── simulation.py  # Simulation(AgentTool) — 35 actions via NL
+│       ├── backend.py     # _ensure_mujoco() lazy loader
+│       ├── mjcf_builder.py # Procedural MJCF XML generation
+│       ├── policy_runner.py # run_policy, eval_policy, replay_episode
+│       ├── recording.py   # start/stop_recording → LeRobot dataset
+│       ├── rendering.py   # RGB + depth offscreen rendering
+│       ├── randomization.py # Domain randomization (colors, physics, lighting)
+│       ├── scene_ops.py   # Inject/eject objects & cameras into live scenes
+│       └── tool_spec.json # AgentTool JSON schema (35 actions)
+│
+├── assets/                # Robot asset manager
+│   ├── __init__.py        # resolve_model_path(), list_available_robots()
+│   └── download.py        # Auto-download from MuJoCo Menagerie
+│
+├── dataset_recorder.py    # DatasetRecorder → LeRobot v3 parquet + video
+│
+├── registry/              # JSON registry for robots + policies
+│   ├── robots.json        # 38 robots, 120+ aliases, asset paths
+│   ├── robots.py          # get_robot(), list_robots(), resolve_name()
+│   ├── policies.json      # Policy provider registry
+│   ├── policies.py        # get_policy_info(), list_policies()
+│   └── loader.py          # JSON loader utilities
+│
+└── tools/                 # Strands @tool functions (for Agent use)
+    ├── download_assets.py # Download robot meshes from Menagerie/GitHub
+    ├── gr00t_inference.py # GR00T inference tool
+    ├── lerobot_calibrate.py
+    ├── lerobot_camera.py
+    ├── lerobot_teleoperate.py
+    ├── pose_tool.py
+    └── serial_tool.py
+
+tests/                     # Unit tests
+tests_integ/               # Integration tests (GPU + model weights)
+```
+
+## Setup & Development
+
+### Using uv (recommended)
+
+```bash
+# Clone and enter
+git clone git@github.com:strands-labs/robots.git && cd robots
+
+# Create env + install dev deps (uses .python-version=3.12 + uv.lock)
+uv sync --extra dev
+
+# Install with simulation support
+uv sync --extra sim --extra dev
+
+# Install with everything (sim + lerobot + groot-service)
+uv sync --extra all --extra dev
+
+# Or one-shot editable install
+uv pip install -e ".[all,dev]"
 ```
 
-## Development
+> **Note**: Python >=3.12 is required (enforced by `requires-python` and `.python-version`).
+> `uv.lock` is committed — all contributors get identical dependency versions.
+
+### Optional extras
+
+| Extra | What it installs | When you need it |
+|-------|-----------------|------------------|
+| `sim` | `mujoco`, `robot_descriptions`, `opencv`, `Pillow` | Simulation (MuJoCo) |
+| `lerobot` | `lerobot>=0.5` | LeRobot policy inference + dataset recording |
+| `groot-service` | `pyzmq`, `msgpack` | NVIDIA GR00T inference |
+| `all` | All of the above | Full development |
+| `dev` | `sim` + `pytest`, `ruff`, `mypy` | Running tests + linting |
+
+## Testing
+
+### Run unit tests
 
 ```bash
-# Install with all optional deps
-pip install -e ".[all]"
+# All unit tests (34 tests, ~1s)
+uv run pytest tests/ -v
+
+# Specific test files
+uv run pytest tests/test_factory.py -v      # 22 tests — Robot factory, registry, mode detection
+uv run pytest tests/test_mujoco_e2e.py -v   # 12 tests — MuJoCo physics, rendering, policy loop
+uv run pytest tests/test_registry.py -v     # Registry resolution, aliases
+uv run pytest tests/test_policies.py -v     # Policy creation, mock policy
+uv run pytest tests/test_utils.py -v        # Utility functions
+```
+
+### Run integration tests (needs GPU + model weights)
+
+```bash
+uv run pytest tests_integ/ -v --timeout=300
+```
+
+### What the tests cover
+
+**`test_factory.py`** (22 tests):
+- Name resolution: canonical, alias, case-insensitive, hyphen-to-underscore
+- `list_robots(mode=)`: all, sim, real, both — verifies registry filtering
+- Robot registry: so100 exists, all aliases valid, robot count, descriptions
+- Auto-detect mode: defaults to sim, env override (`STRANDS_ROBOT_MODE`), case-insensitive
+- Robot factory: `Robot()` is callable (AgentTool), unknown backend raises, newton raises NotImplementedError
+- URDF path passthrough: `Robot("so100", urdf_path="/custom/path.xml")`
+- Top-level import: `from strands_robots import Robot`
+
+**`test_mujoco_e2e.py`** (12 tests):
+- Simulation ABC: all required methods exist on base class
+- Shared dataclasses: SimWorld, SimRobot, SimObject, SimCamera, TrajectoryStep
+- Physics: step advances time, position actuators move joints, contacts detected, reset zeros time
+- Rendering: RGB frames (H×W×3 uint8), depth frames (H×W float32)
+- Mock policy loop: generates actions, full observe→act loop, loop with rendering
+- Domain randomization: color randomization changes model properties
+
+### Manual E2E validation
+
+```bash
+# Quick smoke test — Robot → MuJoCo → physics
+uv run python3 -c "
+from strands_robots import Robot
+sim = Robot('unitree_g1')
+print(sim.get_state()['content'][0]['text'])
+sim.step(n_steps=100)
+sim.render(width=320, height=240)
+sim.destroy()
+print('✅ MuJoCo E2E works')
+"
+
+# Full Agent integration — natural language → simulation
+uv run python3 -c "
+from strands_robots import Robot
+from strands import Agent
+robot = Robot('so100')
+agent = Agent(tools=[robot])
+result = agent('Get the simulation state and run mock policy for 1 second in fast mode on so100')
+print(result)
+robot.destroy()
+"
+
+# Policy + video recording
+uv run python3 -c "
+from strands_robots import Robot
+sim = Robot('so100')
+result = sim.run_policy(
+    robot_name='so100',
+    policy_provider='mock',
+    instruction='pick up the red cube',
+    duration=2.0,
+    fast_mode=True,
+    record_video='/tmp/so100_demo.mp4',
+    video_fps=30,
+)
+print(result['content'][0]['text'])
+sim.destroy()
+"
+
+# Dataset recording (LeRobot v3 format)
+uv run python3 -c "
+from strands_robots import Robot
+sim = Robot('so100')
+sim.start_recording(repo_id='local/demo', task='pick cube', root='/tmp/demo_dataset')
+sim.run_policy(robot_name='so100', policy_provider='mock', instruction='pick cube', duration=2.0, fast_mode=True)
+sim.stop_recording()
+sim.destroy()
+# Verify: /tmp/demo_dataset/meta/info.json + data/chunk-000/file-000.parquet
+import json
+info = json.load(open('/tmp/demo_dataset/meta/info.json'))
+print(f'✅ Dataset: {info[\"total_frames\"]} frames, {info[\"total_episodes\"]} episodes')
+"
+```
+
+### Supported robots for simulation
+
+Any robot in `registry/robots.json` with an `asset` field works. Assets are auto-downloaded from [MuJoCo Menagerie](https://github.com/google-deepmind/mujoco_menagerie) on first use via `robot_descriptions`.
+
+```bash
+# List all robots
+uv run python3 -c "from strands_robots import list_robots; [print(r['name']) for r in list_robots(mode='sim')]"
+```
+
+Key robots tested: `so100`, `unitree_g1` (30 joints), `panda` (Franka), `unitree_h1`, `aloha`.
+
+## The 5-Line Promise
 
-# Run tests
-hatch run test              # unit tests
-hatch run test-integ        # integration tests (needs GPU + model weights)
+```python
+from strands_robots import Robot
+from strands import Agent
 
-# Lint & format
-hatch run lint              # ruff check, ruff format --check, mypy
-hatch run format            # ruff check --fix, ruff format
+robot = Robot("so100")            # → MuJoCo sim, auto-downloads assets
+agent = Agent(tools=[robot])      # → 35 simulation actions as AgentTool
+agent("pick up the red cube")     # → agent orchestrates sim via natural language
 ```
 
 ## Key Conventions
 
-1. **Python 3.10+** — `requires-python = ">=3.10"` (lerobot extras require 3.12+)
-2. **Dependency bounds** — `>=1.0` deps: cap major. `<1.0` deps: cap minor. E.g. `lerobot>=0.5.0,<0.6.0`
+1. **Python 3.12+** — `requires-python = ">=3.12"` (pinned in `.python-version`)
+2. **Dependency bounds** — `>=1.0`: cap major. `<1.0`: cap minor. E.g. `lerobot>=0.5.0,<0.6.0`
 3. **`__init__.py` must be thin** — exports only, no logic
 4. **Imports at file top** — unless lazy-loading heavy deps with documented reason
-5. **Raise on fatal errors** — never warn-and-continue if the system will behave unexpectedly
+5. **Raise on fatal errors** — never warn-and-continue if behavior will be wrong
 6. **No silent defaults on error** — returning zero-valued actions on failure is forbidden
 7. **Use `require_optional()`** — from `strands_robots/utils.py` for all optional deps
 8. **Integration tests required** — each policy needs `tests_integ/` tests with real inference
@@ -79,7 +249,7 @@ hatch run format            # ruff check --fix, ruff format
 ## PR Workflow
 
 1. Create feature branch from `main`
-2. Make changes, run `hatch run format && hatch run lint && hatch run test`
+2. Make changes, run `uv run ruff check . && uv run ruff format --check . && uv run pytest tests/ -v`
 3. All tests must pass, lint must be clean
 4. Open PR from your fork, address all review comments
 5. Track follow-up items as issues on the [project board](https://github.com/orgs/strands-labs/projects/2)
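
Review note on Key Convention 2 (dependency bounds): the rule "`>=1.0`: cap major. `<1.0`: cap minor." can be sketched as a small helper. This is a minimal illustration, not code from the repository. The function names `upper_bound` and `bounded_spec` are hypothetical, `examplepkg` is a placeholder package, and the `X.0.0` form of the major cap is an assumption, since the diff only shows the `<1.0` case (`lerobot>=0.5.0,<0.6.0`).

```python
def upper_bound(min_version: str) -> str:
    """Return the exclusive upper bound implied by the convention.

    Hypothetical helper: versions >= 1.0 are capped at the next major,
    versions < 1.0 are capped at the next minor.
    """
    parts = [int(p) for p in min_version.split(".")]
    major = parts[0]
    minor = parts[1] if len(parts) > 1 else 0
    if major >= 1:
        # Assumed output shape for the major cap (not shown in the diff).
        return f"{major + 1}.0.0"
    return f"0.{minor + 1}.0"


def bounded_spec(name: str, min_version: str) -> str:
    """Build a requirement string with both a lower and an upper bound."""
    return f"{name}>={min_version},<{upper_bound(min_version)}"


if __name__ == "__main__":
    # Matches the example given in Key Conventions.
    print(bounded_spec("lerobot", "0.5.0"))
    # Placeholder package illustrating the >=1.0 case.
    print(bounded_spec("examplepkg", "3.2.0"))
```

A check like this could run in CI against the dependency list in `pyproject.toml`, but whether the project wants that enforcement is left to the maintainers.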