Control robots with natural language through Strands Agents. Integrates NVIDIA Isaac GR00T for vision-language-action policies and LeRobot for universal robot support.
```mermaid
graph LR
A[Natural Language<br/>'Pick up the red block'] --> B[Strands Agent]
B --> C[Robot Tool]
C --> D[Policy Provider<br/>GR00T/Mock]
C --> E[LeRobot<br/>Hardware Abstraction]
D --> F[Action Chunk<br/>16 timesteps]
F --> E
E --> G[Robot Hardware<br/>SO-101/GR-1/G1]
classDef input fill:#2ea44f,stroke:#1b7735,color:#fff
classDef agent fill:#0969da,stroke:#044289,color:#fff
classDef policy fill:#8250df,stroke:#5a32a3,color:#fff
classDef hardware fill:#bf8700,stroke:#875e00,color:#fff
class A input
class B,C agent
class D,F policy
class E,G hardware
```
```mermaid
flowchart TB
subgraph Agent["🤖 Strands Agent"]
NL[Natural Language Input]
Tools[Tool Registry]
end
subgraph RobotTool["🦾 Robot Tool"]
direction TB
RT[Robot Class]
TM[Task Manager]
AS[Async Executor]
end
subgraph Policy["🧠 Policy Layer"]
direction TB
PA[Policy Abstraction]
GP[GR00T Policy]
MP[Mock Policy]
CP[Custom Policy]
end
subgraph Inference["⚡ Inference Service"]
direction TB
DC[Docker Container]
ZMQ[ZMQ Server :5555]
TRT[TensorRT Engine]
end
subgraph Hardware["🔧 Hardware Layer"]
direction TB
LR[LeRobot]
CAM[Cameras]
SERVO[Feetech Servos]
end
NL --> Tools
Tools --> RT
RT --> TM
TM --> AS
AS --> PA
PA --> GP
PA --> MP
PA --> CP
GP --> ZMQ
ZMQ --> TRT
TRT --> DC
AS --> LR
LR --> CAM
LR --> SERVO
classDef agentStyle fill:#0969da,stroke:#044289,color:#fff
classDef robotStyle fill:#2ea44f,stroke:#1b7735,color:#fff
classDef policyStyle fill:#8250df,stroke:#5a32a3,color:#fff
classDef infraStyle fill:#bf8700,stroke:#875e00,color:#fff
classDef hwStyle fill:#d73a49,stroke:#a72b3a,color:#fff
class NL,Tools agentStyle
class RT,TM,AS robotStyle
class PA,GP,MP,CP policyStyle
class DC,ZMQ,TRT infraStyle
class LR,CAM,SERVO hwStyle
```
```python
from strands import Agent
from strands_robots import Robot, gr00t_inference
# Create robot with cameras
robot = Robot(
    tool_name="my_arm",
    robot="so101_follower",
    cameras={
        "front": {"type": "opencv", "index_or_path": "/dev/video0", "fps": 30},
        "wrist": {"type": "opencv", "index_or_path": "/dev/video2", "fps": 30}
    },
    port="/dev/ttyACM0",
    data_config="so100_dualcam"
)
# Create agent with robot tool
agent = Agent(tools=[robot, gr00t_inference])
# Start GR00T inference service
agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/model",
    port=8000,
    data_config="so100_dualcam"
)
# Control robot with natural language
agent("Use my_arm to pick up the red block using GR00T policy on port 8000")pip install strands-robotsOr from source:
```bash
git clone https://github.com/cagataycali/strands-robots
cd strands-robots
pip install -e .
```

🐳 Jetson Container Setup (Required for GR00T)

GR00T inference requires the Isaac-GR00T Docker container on Jetson platforms:
```bash
# Clone jetson-containers
git clone https://github.com/dusty-nv/jetson-containers
cd jetson-containers
# Run Isaac GR00T container (background)
jetson-containers run $(autotag isaac-gr00t) &
# Container exposes inference service on port 5555 (ZMQ) or 8000 (HTTP)
```

Tested Hardware:
- NVIDIA Thor Dev Kit (JetPack 7.0)
- NVIDIA Jetson AGX Orin (JetPack 6.x)
See Jetson Deployment Guide for TensorRT optimization.
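Before pointing an agent at the service, it can help to confirm the container is actually listening. A minimal sketch using only the Python standard library; the host and port numbers are assumptions matching the defaults above:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Check the ZMQ (5555) and HTTP (8000) endpoints exposed by the container.
for p in (5555, 8000):
    print(f"port {p}: {'open' if port_open('localhost', p) else 'closed'}")
```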
```mermaid
sequenceDiagram
participant User
participant Agent as Strands Agent
participant Robot as Robot Tool
participant Policy as GR00T Policy
participant HW as Hardware
User->>Agent: "Pick up the red block"
Agent->>Robot: execute(instruction, policy_port)
loop Control Loop @ 50Hz
Robot->>HW: get_observation()
HW-->>Robot: {cameras, joint_states}
Robot->>Policy: get_actions(obs, instruction)
Policy-->>Robot: action_chunk[16]
loop Action Horizon
Robot->>HW: send_action(action)
Note over Robot,HW: 20ms sleep (50Hz)
end
end
Robot-->>Agent: Task completed
Agent-->>User: "✅ Picked up red block"
The Robot class is a Strands AgentTool that provides async robot control with real-time status reporting.
| Action | Parameters | Description | Example |
|---|---|---|---|
| `execute` | `instruction`, `policy_port`, `duration` | Blocking execution until complete | "Pick up the cube" |
| `start` | `instruction`, `policy_port`, `duration` | Non-blocking async start | "Wave your arm" |
| `status` | - | Get current task status | Check progress |
| `stop` | - | Interrupt running task | Emergency stop |
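The same actions can be invoked programmatically. A hedged sketch, assuming the tool is registered under the `tool_name` given at construction (here `my_arm`) and follows the `agent.tool.<name>(action=...)` calling pattern used elsewhere in this README:

```python
# Blocking execution until the task finishes
agent.tool.my_arm(
    action="execute",
    instruction="Pick up the red block",
    policy_port=8000,
    duration=30,
)

# Fire-and-forget start, then poll and stop
agent.tool.my_arm(action="start", instruction="Wave your arm", policy_port=8000)
agent.tool.my_arm(action="status")
agent.tool.my_arm(action="stop")
```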
Natural Language Examples:

```python
# Blocking execution (waits for completion)
agent("Use my_arm to pick up the red block using GR00T policy on port 8000")

# Async execution (returns immediately)
agent("Start my_arm waving using GR00T on port 8000, then check status")

# Stop running task
agent("Stop my_arm immediately")
```
Robot Constructor Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `tool_name` | `str` | required | Name for this robot tool |
| `robot` | `str` \| `RobotConfig` | required | Robot type or config |
| `cameras` | `Dict` | `None` | Camera configuration |
| `port` | `str` | `None` | Serial port for robot |
| `data_config` | `str` | `None` | GR00T data config name |
| `control_frequency` | `float` | `50.0` | Control loop Hz |
| `action_horizon` | `int` | `8` | Actions per inference |
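The timing parameters default to a 50 Hz control loop and an 8-step action horizon; both can be overridden at construction. A brief sketch reusing the constructor shown earlier (camera settings omitted for brevity):

```python
robot = Robot(
    tool_name="slow_arm",
    robot="so101_follower",
    port="/dev/ttyACM0",
    data_config="so100_dualcam",
    control_frequency=25.0,  # halve the control loop rate
    action_horizon=16,       # execute the full 16-step action chunk per inference
)
```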
Manages GR00T policy inference services running in Docker containers.
| Action | Parameters | Description | Example |
|---|---|---|---|
| `start` | `checkpoint_path`, `port`, `data_config` | Start inference service | "Start GR00T on port 8000" |
| `stop` | `port` | Stop service on port | "Stop GR00T on port 8000" |
| `status` | `port` | Check service status | "Is GR00T running?" |
| `list` | - | List all running services | "List inference services" |
| `find_containers` | - | Find GR00T containers | "Find available containers" |
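Besides natural language, the service can be managed with direct tool calls, using the actions and parameters from the table above:

```python
# Is anything already serving on port 8000?
agent.tool.gr00t_inference(action="status", port=8000)

# List running inference services and available GR00T containers
agent.tool.gr00t_inference(action="list")
agent.tool.gr00t_inference(action="find_containers")

# Shut the service down when done
agent.tool.gr00t_inference(action="stop", port=8000)
```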
TensorRT Acceleration:
```python
agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/model",
    port=8000,
    use_tensorrt=True,
    trt_engine_path="gr00t_engine",
    vit_dtype="fp8",     # ViT: fp16 or fp8
    llm_dtype="nvfp4",   # LLM: fp16, nvfp4, or fp8
    dit_dtype="fp8"      # DiT: fp16 or fp8
)
```

LeRobot-based camera management with OpenCV and RealSense support.
| Action | Parameters | Description | Example |
|---|---|---|---|
| `discover` | - | Find all cameras | "Discover cameras" |
| `capture` | `camera_id`, `save_path` | Single image capture | "Capture from /dev/video0" |
| `capture_batch` | `camera_ids`, `async_mode` | Multi-camera capture | "Capture from all cameras" |
| `record` | `camera_id`, `capture_duration` | Record video | "Record 10s video" |
| `preview` | `camera_id`, `preview_duration` | Live preview | "Preview camera 0" |
| `test` | `camera_id` | Performance test | "Test camera speed" |
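The camera actions can also be called directly. A hedged sketch, assuming the tool is registered as `lerobot_camera` (the import name used in the complete example below) and using parameters from the table; the save path is illustrative:

```python
# Enumerate connected cameras
agent.tool.lerobot_camera(action="discover")

# Grab a single frame and save it
agent.tool.lerobot_camera(
    action="capture",
    camera_id="/dev/video0",
    save_path="/tmp/front.jpg",
)

# Capture from several cameras at once
agent.tool.lerobot_camera(
    action="capture_batch",
    camera_ids=["/dev/video0", "/dev/video2"],
    async_mode=True,
)
```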
Natural Language Examples:
```python
# Discover available cameras
agent("Find all connected cameras")

# Capture from multiple cameras
agent("Capture images from front and wrist cameras")

# Record demonstration
agent("Record 30 seconds of video from the front camera")
```

Low-level serial communication for Feetech servos and custom protocols.
| Action | Parameters | Description | Example |
|---|---|---|---|
| `list_ports` | - | Discover serial ports | "List serial ports" |
| `feetech_position` | `port`, `motor_id`, `position` | Move servo | "Move motor 1 to center" |
| `feetech_ping` | `port`, `motor_id` | Ping servo | "Ping motor 1" |
| `send` | `port`, `data`/`hex_data` | Send raw data | "Send FF FF to robot" |
| `monitor` | `port` | Monitor serial data | "Monitor /dev/ttyACM0" |
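A hedged sketch of direct calls, assuming the tool is registered as `serial_tool` (matching `tools/serial_tool.py`); the position value is an assumption based on the Feetech 0-4095 range, where 2048 is roughly centered:

```python
# Find candidate serial ports
agent.tool.serial_tool(action="list_ports")

# Ping servo 1, then move it to (approximately) center
agent.tool.serial_tool(action="feetech_ping", port="/dev/ttyACM0", motor_id=1)
agent.tool.serial_tool(
    action="feetech_position",
    port="/dev/ttyACM0",
    motor_id=1,
    position=2048,  # assumed mid-point of the 0-4095 Feetech range
)

# Send raw bytes
agent.tool.serial_tool(action="send", port="/dev/ttyACM0", hex_data="FF FF")
```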
Record demonstrations for imitation learning with LeRobot.
| Action | Parameters | Description | Example |
|---|---|---|---|
| `start` | `robot_type`, `teleop_type` | Start teleoperation | "Start teleoperation" |
| `stop` | `session_name` | Stop session | "Stop recording" |
| `list` | - | List active sessions | "List teleop sessions" |
| `replay` | `dataset_repo_id`, `replay_episode` | Replay episode | "Replay episode 5" |
Recording Demonstrations:
```python
agent.tool.lerobot_teleoperate(
    action="start",
    robot_type="so101_follower",
    robot_port="/dev/ttyACM0",
    teleop_type="so101_leader",
    teleop_port="/dev/ttyACM1",
    dataset_repo_id="my_user/cube_picking",
    dataset_single_task="Pick up the red cube",
    dataset_num_episodes=50
)
```

Store, retrieve, and execute named robot poses.
| Action | Parameters | Description | Example |
|---|---|---|---|
| `store_pose` | `pose_name` | Save current position | "Save as 'home'" |
| `load_pose` | `pose_name` | Move to saved pose | "Go to home pose" |
| `list_poses` | - | List all poses | "List saved poses" |
| `move_motor` | `motor_name`, `position` | Move single motor | "Move gripper to 50%" |
| `incremental_move` | `motor_name`, `delta` | Small movement | "Move elbow +5°" |
| `reset_to_home` | - | Safe home position | "Reset to home" |
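The pose actions map directly onto tool calls. A short sketch, assuming the tool is registered as `pose_tool` (the import name used in the complete example below); `elbow_flex` is an assumed SO-101 joint name:

```python
# Save the robot's current joint configuration under a name
agent.tool.pose_tool(action="store_pose", pose_name="home")

# Return to it later
agent.tool.pose_tool(action="load_pose", pose_name="home")

# Nudge one joint, then list everything that has been stored
agent.tool.pose_tool(action="incremental_move", motor_name="elbow_flex", delta=5)
agent.tool.pose_tool(action="list_poses")
```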
| Robot | Config | Cameras | Description |
|---|---|---|---|
| SO-100/SO-101 | `so100`, `so100_dualcam`, `so100_4cam` | 1-4 | Single arm desktop robot |
| Fourier GR-1 | `fourier_gr1_arms_only` | 1 | Bimanual humanoid arms |
| Bimanual Panda | `bimanual_panda_gripper` | 3 | Dual Franka Emika arms |
| Unitree G1 | `unitree_g1` | 1 | Humanoid robot platform |
GR00T Data Configurations
| Config | Video Keys | State Keys | Description |
|---|---|---|---|
| `so100` | `video.webcam` | `state.single_arm`, `state.gripper` | Single camera |
| `so100_dualcam` | `video.front`, `video.wrist` | `state.single_arm`, `state.gripper` | Front + wrist |
| `so100_4cam` | `video.front`, `video.wrist`, `video.top`, `video.side` | `state.single_arm`, `state.gripper` | Quad camera |
| `fourier_gr1_arms_only` | `video.ego_view` | `state.left_arm`, `state.right_arm`, `state.left_hand`, `state.right_hand` | Humanoid arms |
| `bimanual_panda_gripper` | `video.right_wrist_view`, `video.left_wrist_view`, `video.front_view` | EEF pos/quat + gripper | Dual arm EEF |
| `unitree_g1` | `video.rs_view` | `state.left_arm`, `state.right_arm`, `state.left_hand`, `state.right_hand` | G1 humanoid |
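The video/state keys above define what a policy expects in an observation. A purely illustrative sketch for `so100_dualcam`; the array shapes and the direct `get_actions` call are assumptions, not a documented API:

```python
import numpy as np

# Assumed observation layout for the so100_dualcam config:
# two camera frames plus the single-arm joint state and gripper state.
observation = {
    "video.front": np.zeros((480, 640, 3), dtype=np.uint8),  # front camera frame
    "video.wrist": np.zeros((480, 640, 3), dtype=np.uint8),  # wrist camera frame
    "state.single_arm": np.zeros(5, dtype=np.float32),       # arm joints (assumed 5-DoF)
    "state.gripper": np.zeros(1, dtype=np.float32),          # gripper position
}

# A policy created via create_policy() (see below) consumes this dict plus an
# instruction and returns an action chunk, e.g. 16 timesteps:
# actions = policy.get_actions(observation, "Pick up the red block")
```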
```mermaid
classDiagram
class Policy {
<<abstract>>
+get_actions(observation, instruction)
+set_robot_state_keys(keys)
+provider_name
}
class Gr00tPolicy {
+data_config
+policy_client: ZMQ
+get_actions()
}
class MockPolicy {
+get_actions()
Returns random actions
}
class CustomPolicy {
+get_actions()
Your implementation
}
Policy <|-- Gr00tPolicy
Policy <|-- MockPolicy
Policy <|-- CustomPolicy
```
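To plug in your own model, subclass the abstract `Policy` shown above. A hedged sketch only: the base-class import path and method signatures are assumptions inferred from the class diagram and the `create_policy` factory below:

```python
import numpy as np
from strands_robots.policies import Policy  # assumed import path for the Policy ABC

class RandomWalkPolicy(Policy):
    """Toy policy: returns small random joint deltas regardless of the instruction."""

    provider_name = "random_walk"

    def __init__(self, action_dim: int = 6, horizon: int = 16):
        self.action_dim = action_dim
        self.horizon = horizon
        self._state_keys = None

    def set_robot_state_keys(self, keys):
        # Remember which observation keys carry the robot state.
        self._state_keys = keys

    def get_actions(self, observation, instruction):
        # Return an action chunk shaped (horizon, action_dim).
        return 0.01 * np.random.randn(self.horizon, self.action_dim)
```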
```python
from strands_robots import create_policy

# GR00T policy (requires container)
policy = create_policy(
    provider="groot",
    data_config="so100_dualcam",
    host="localhost",
    port=8000
)

# Mock policy (for testing)
policy = create_policy(provider="mock")
```

```
strands-robots/
├── strands_robots/
│   ├── __init__.py                # Package exports
│   ├── robot.py                   # Universal Robot class (AgentTool)
│   ├── policies/
│   │   ├── __init__.py            # Policy ABC + factory
│   │   └── groot/
│   │       ├── __init__.py        # Gr00tPolicy implementation
│   │       ├── client.py          # ZMQ inference client
│   │       └── data_config.py     # 6 embodiment configurations
│   └── tools/
│       ├── gr00t_inference.py     # Docker service manager
│       ├── lerobot_camera.py      # Camera operations
│       ├── lerobot_calibrate.py   # Calibration management
│       ├── lerobot_teleoperate.py # Recording/replay
│       ├── pose_tool.py           # Pose management
│       └── serial_tool.py         # Serial communication
├── test.py                        # Integration example
└── pyproject.toml                 # Package configuration
```
```python
#!/usr/bin/env python3
from strands import Agent
from strands_robots import Robot, gr00t_inference, lerobot_camera, pose_tool
# 1. Create robot with dual cameras
robot = Robot(
    tool_name="orange_arm",
    robot="so101_follower",
    cameras={
        "wrist": {"type": "opencv", "index_or_path": "/dev/video0", "fps": 15},
        "front": {"type": "opencv", "index_or_path": "/dev/video2", "fps": 15},
    },
    port="/dev/ttyACM0",
    data_config="so100_dualcam",
)
# 2. Create agent with all robot tools
agent = Agent(
    tools=[robot, gr00t_inference, lerobot_camera, pose_tool]
)
# 3. Start inference service
agent.tool.gr00t_inference(
    action="start",
    checkpoint_path="/data/checkpoints/gr00t-wave/checkpoint-300000",
    port=8000,
    data_config="so100_dualcam",
)
# 4. Interactive control loop
while True:
    user_input = input("\n🤖 > ")
    if user_input.lower() in ["exit", "quit"]:
        break
    agent(user_input)
# 5. Cleanup
agent.tool.gr00t_inference(action="stop", port=8000)We welcome contributions! Please see:
- GitHub Issues for bug reports
- Pull Requests for contributions
Apache-2.0 - see LICENSE file.