portal-cornell/X-Sim

X-Sim: Cross-Embodiment Learning
via Real-to-Sim-to-Real

Oral @ CoRL 2025

Prithwish Dan*, Kushal Kedia*, Angela Chao, Edward W. Duan, Maximus A. Pace,
Wei-Chiu Ma, Sanjiban Choudhury

* Equal Contribution
Cornell University



X-Sim Overview


πŸ“ Project Structure

X-Sim/
├── real_to_sim/
│   ├── FoundationPose/           # Object Tracking
│   └── collect_human_demo.py     # Human Data Collection Script
├── simulation/
│   ├── ManiSkill/                # ManiSkill simulation environment
│   └── scripts/                  # RL training and data generation scripts
├── diffusion_policy/
│   ├── scripts/                  # Diffusion policy training and evaluation
│   ├── cfgs/                     # Configuration files
│   └── utils/                    # Shared utilities for diffusion policy
├── run_pipeline.py               # Automated pipeline execution script
├── setup.sh                      # Installation script
└── README.md                     # Project documentation

🛠 Installation

Environment Setup

bash setup.sh
# Creates the conda environment, installs packages, and downloads assets

🔄 Pipeline Overview

X-Sim's pipeline consists of three main phases:

Phase 1: Real-to-Sim

  • Real-to-Sim: Construct photorealistic simulation and track object poses from human videos

Phase 2: RL Training in Sim

  • RL Training: Learn robot policies with object-centric rewards

Phase 3: Sim-to-Real

  • Synthetic Data Collection: Generate RGB demonstration trajectories using trained state-based policies
  • Diffusion Policy Training: Train image-conditioned policies on synthetic data
  • Auto-Calibration:
    • Auto-Calibration Data: Deploy policy on real robot and obtain paired sim rollouts
    • Training with Auxiliary Loss: Fine-tune with calibration auxiliary loss

🚀 Quick Start

Real-to-Sim Setup

For detailed instructions on environment scanning, object tracking, and human demo collection, see our Real-to-Sim Pipeline Documentation.

Full Pipeline

Run the complete X-Sim pipeline for any task with a single command:

python run_pipeline.py --env_id "Mustard-Place"

What this does:

  1. RL Training: Trains policies with object-centric rewards
  2. Synthetic Data Generation: Collects demonstration trajectories
  3. Image-Conditioned Diffusion Policy: Trains on synthetic data
  4. Auto-Calibration Data: Converts real trajectories into corresponding sim trajectories (Requires real robot deployment ⚠️)
  5. Calibrated Training: Trains with auxiliary loss using paired real-to-sim data

Output: All results saved to experiments/pipeline/<task_name>/
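The five stages and the shared output directory can be sketched as a simple sequential driver. This is an illustrative stub, not the actual `run_pipeline.py` internals; the stage names and the `experiments/pipeline/<task_name>/` layout follow the README, while the directory-per-stage convention is an assumption.

```python
from pathlib import Path

def run_pipeline(env_id: str, output_root: str = "experiments/pipeline") -> Path:
    """Illustrative driver mirroring the five pipeline stages (stubs only)."""
    out_dir = Path(output_root) / env_id
    out_dir.mkdir(parents=True, exist_ok=True)

    stages = [
        "rl_training",            # 1. RL with object-centric rewards
        "synthetic_data",         # 2. RGB demonstration trajectories
        "diffusion_policy",       # 3. image-conditioned policy training
        "auto_calibration_data",  # 4. paired real/sim rollouts (needs real robot)
        "calibrated_training",    # 5. fine-tuning with the auxiliary loss
    ]
    for stage in stages:
        # Each stage writes its artifacts under the shared task directory.
        (out_dir / stage).mkdir(exist_ok=True)
    return out_dir
```

For example, `run_pipeline("Mustard-Place")` would lay out results under `experiments/pipeline/Mustard-Place/`.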

🎯 Available Tasks

X-Sim supports the following manipulation tasks:

Task Name      | Environment ID | Description
---------------|----------------|--------------------------------------
Mustard Place  | Mustard-Place  | Place mustard on left side of kitchen
Corn in Basket | Corn-in-Basket | Place corn into basket
Letter Arrange | Letter-Arrange | Arrange letters next to each other
Shoe on Rack   | Shoe-on-Rack   | Place shoe onto shoe rack
Mug Insert     | Mug-Insert     | Insert mug into holder
To add your own tasks, refer to the files in simulation/ManiSkill/mani_skill/envs/tasks/xsim_envs.

📖 Detailed Usage

Step 1: Real-to-Sim Pipeline

Before training, you need to capture and process real-world data. See our Real-to-Sim Pipeline Documentation for:

  • Environment scanning with 2D Gaussian Splatting
  • Object mesh creation with Polycam
  • Human demonstration collection with ZED camera
  • Object pose tracking with FoundationPose
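The pose-tracking step produces a per-frame stream of object positions and orientations. As a minimal sketch of turning such a stream into simulator-ready records, assuming a simple `(x, y, z, qw, qx, qy, qz)` tuple per frame (the tracker's actual export format may differ):

```python
from dataclasses import dataclass

@dataclass
class ObjectPose:
    position: tuple    # (x, y, z) in meters
    quaternion: tuple  # (qw, qx, qy, qz), unit norm

def load_pose_trajectory(rows):
    """Convert raw 7-tuples into ObjectPose records.

    `rows` stands in for whatever the tracker exports per frame; this
    record layout is an assumption for illustration.
    """
    traj = []
    for x, y, z, qw, qx, qy, qz in rows:
        norm = (qw*qw + qx*qx + qy*qy + qz*qz) ** 0.5
        # Re-normalize the quaternion to guard against drift in tracker output.
        traj.append(ObjectPose((x, y, z), (qw/norm, qx/norm, qy/norm, qz/norm)))
    return traj
```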

Step 2: RL Training

Train reinforcement learning policies with object-centric rewards:

cd simulation
python -m scripts.rl_training \
    --env_id="<TASK_NAME>" \
    --exp-name="<EXPERIMENT_NAME>" \
    --num_envs=1024 \
    --seed=0 \
    --total_timesteps=<TIMESTEPS> \
    --num_steps=<STEPS> \
    --num_eval_steps=<EVAL_STEPS>
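For reference, the flag set above maps onto a standard argparse-style interface. This is a hedged sketch of how such flags are typically parsed, not the actual `scripts.rl_training` implementation; the help strings are illustrative.

```python
import argparse

def build_rl_parser() -> argparse.ArgumentParser:
    """Sketch of a CLI matching the RL training flags shown above."""
    p = argparse.ArgumentParser("rl_training")
    p.add_argument("--env_id", required=True, help="task environment ID, e.g. Mustard-Place")
    p.add_argument("--exp-name", dest="exp_name", required=True, help="experiment name")
    p.add_argument("--num_envs", type=int, default=1024, help="parallel sim environments")
    p.add_argument("--seed", type=int, default=0)
    p.add_argument("--total_timesteps", type=int, required=True)
    p.add_argument("--num_steps", type=int, required=True, help="rollout length per update")
    p.add_argument("--num_eval_steps", type=int, required=True)
    return p
```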

Step 3: Synthetic Data Collection

Generate demonstration trajectories using the trained RL policies:

cd simulation
python -m scripts.data_generation_rgb \
    --evaluate \
    --num_trajectories=<NUM_TRAJ> \
    --trajectory_length=<TRAJ_LENGTH> \
    --randomize_init_config \
    --checkpoint="<PATH_TO_RL_CHECKPOINT>" \
    --env_id="<TASK_NAME>-Eval" \
    --randomize_camera
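Conceptually, this step rolls out the frozen RL policy under randomized initial configurations and camera poses, recording observation-action pairs. A sketch of that collection loop, where the `env` and `policy` interfaces are stand-ins for the ManiSkill environment and the trained checkpoint, not the real API:

```python
def collect_trajectories(env, policy, num_trajectories, trajectory_length,
                         randomize_init_config=True, randomize_camera=True):
    """Illustrative data-collection loop (interfaces are assumptions)."""
    dataset = []
    for _ in range(num_trajectories):
        obs = env.reset(randomize_init=randomize_init_config,
                        randomize_camera=randomize_camera)
        traj = []
        for _ in range(trajectory_length):
            action = policy(obs)
            obs = env.step(action)
            traj.append((obs, action))  # RGB observation paired with action taken
        dataset.append(traj)
    return dataset
```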

Step 4: Image-Conditioned Diffusion Policy Training

Train diffusion policies on the synthetic demonstration data:

cd diffusion_policy
python -m scripts.dp_training_rgb \
    --config_path=cfgs/sim2real.yaml \
    --dp.use_aux_loss=0 \
    --save_dir=<SAVE_DIRECTORY> \
    --dataset.paths=["<PATH_TO_SYNTHETIC_DATA>"] \
    --eval.env_id="<TASK_NAME>-Eval" \
    --eval_freq=5 \
    --eval.num_episodes=10 \
    --num_epoch=60 \
    --epoch_len=10000
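The dotted flags (`--dp.use_aux_loss`, `--dataset.paths`, `--eval.env_id`) suggest nested overrides applied on top of `cfgs/sim2real.yaml`. A minimal sketch of that override pattern; the real config loader may work differently, and the key names here are only those visible in the command above:

```python
def apply_overrides(config: dict, overrides: dict) -> dict:
    """Apply dotted-key overrides (e.g. 'dp.use_aux_loss': 0) to a nested config."""
    for dotted_key, value in overrides.items():
        node = config
        *parents, leaf = dotted_key.split(".")
        for key in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(key, {})
        node[leaf] = value
    return config
```

For example, `apply_overrides(base_cfg, {"dp.use_aux_loss": 0})` would switch off the auxiliary loss without editing the YAML file.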

Step 5: Auto-Calibration Data Generation

Create a paired real-sim RGB dataset by replaying real rollout data in simulation:

cd diffusion_policy
python -m scripts.auto_calibration \
    --input_dir="<PATH_TO_REAL_ROLLOUTS>" \
    --env_id="<TASK_NAME>-Eval"

Note: You should adapt diffusion_policy/scripts/eval_dp.py to your robot hardware for real-world deployment.
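Conceptually, auto-calibration replays each real rollout's recorded actions in simulation so that real and sim frames end up paired index-by-index. A sketch under that assumption, with stub interfaces standing in for the real rollout format and the ManiSkill environment:

```python
def build_paired_dataset(real_rollouts, sim_env):
    """Replay recorded real actions in sim to get frame-aligned real/sim pairs.

    `real_rollouts` is assumed to be a list of (real_frame, action) sequences;
    the environment interface is illustrative, not the actual API.
    """
    pairs = []
    for rollout in real_rollouts:
        sim_env.reset()
        for real_frame, action in rollout:
            sim_obs = sim_env.step(action)  # replay the real action in sim
            pairs.append((real_frame, sim_obs))
    return pairs
```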

Step 6: Calibrated Policy - Training with Auxiliary Loss

Fine-tune the policy with calibration auxiliary loss:

cd diffusion_policy
python -m scripts.dp_training_rgb \
    --config_path=cfgs/sim2real.yaml \
    --dp.use_aux_loss=1 \
    --dp.aux_loss_weight=0.1 \
    --dp.distance_type="contrastive_cosine" \
    --save_dir=<SAVE_DIRECTORY> \
    --dataset.paths=["<PATH_TO_SYNTHETIC_DATA>"] \
    --dataset.real_pairing="<PATH_TO_REAL_DATA>" \
    --dataset.sim_pairing="<PATH_TO_SIM_PAIRING>" \
    --eval.env_id="<TASK_NAME>-Eval" \
    --eval_freq=5 \
    --eval.num_episodes=10 \
    --epoch_len=10000 \
    --num_epoch=60
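The auxiliary loss encourages the policy's image encoder to embed paired real and sim frames close together. Below is a minimal sketch of one plausible "contrastive_cosine"-style distance on embedding vectors, combined with the behavior-cloning loss using the `aux_loss_weight` from the command above; the actual loss in the codebase may differ, and the contrastive negatives are omitted here for brevity.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def total_loss(bc_loss, real_embeddings, sim_embeddings, aux_loss_weight=0.1):
    """Combine the BC loss with a mean real/sim embedding-alignment term."""
    aux = sum(cosine_distance(r, s)
              for r, s in zip(real_embeddings, sim_embeddings)) / len(real_embeddings)
    return bc_loss + aux_loss_weight * aux
```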

Evaluation

Evaluate trained diffusion policies:

cd diffusion_policy
python -m scripts.eval_dp \
    --checkpoint_path="<PATH_TO_DP_CHECKPOINT>" \
    --env_id="<TASK_NAME>-Eval" \
    --save-videos
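Evaluation amounts to rolling out the checkpointed policy for a number of episodes and reporting the success rate. A stub-interface sketch of that loop (not the actual `eval_dp` internals; the per-step success flag is an assumption):

```python
def evaluate_policy(env, policy, num_episodes=10, max_steps=200):
    """Roll out the policy and report the fraction of successful episodes.

    `env` is assumed to expose reset()/step() with a per-step success flag;
    this interface is illustrative, not the real ManiSkill API.
    """
    successes = 0
    for _ in range(num_episodes):
        obs = env.reset()
        for _ in range(max_steps):
            obs, success = env.step(policy(obs))
            if success:
                successes += 1
                break
    return successes / num_episodes
```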

📚 Citation

If you find this work useful, please cite:

@article{dan2025xsim,
    title={X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real}, 
    author={Prithwish Dan and Kushal Kedia and Angela Chao and Edward Weiyi Duan and Maximus Adrian Pace and Wei-Chiu Ma and Sanjiban Choudhury},
    year={2025},
    eprint={2505.07096},
    archivePrefix={arXiv},
    primaryClass={cs.RO},
    url={https://arxiv.org/abs/2505.07096}
}

For more information, visit our project page.
