dexbotic/docs/RL.md at main · learningLogisticsLab/dexbotic

Dexbotic extends Vision-Language-Action (VLA) models with the SimpleVLA-RL algorithm for reinforcement learning (RL) post-training.

Installation

🐳 Docker (Recommended)

We strongly recommend Docker as a unified, reproducible environment for both training and deployment. It keeps workflows consistent and avoids most problems caused by CUDA version differences and Python dependency conflicts.

See dockerfile/Dockerfile.RL for more details.

Prerequisites

Ubuntu 20.04 or 22.04
NVIDIA GPU: H20 (8 GPUs recommended for training; 1 GPU for deployment)
NVIDIA Docker installed
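
The prerequisites above can be checked on the host before continuing. The following is a minimal sketch, not part of the Dexbotic repository: the helper names `is_supported_ubuntu` and `command_status` are hypothetical, and the script only reports what it finds rather than enforcing anything.

```shell
#!/bin/sh
# Hypothetical preflight helper; not shipped with Dexbotic.

# Return 0 if the given release string is one of the Ubuntu versions listed above.
is_supported_ubuntu() {
  case "$1" in
    20.04|22.04) return 0 ;;
    *) return 1 ;;
  esac
}

# Print "ok" if a command is on PATH, "missing" otherwise.
command_status() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "missing"
  fi
}

# Read VERSION_ID from /etc/os-release when the file is readable.
release=""
if [ -r /etc/os-release ]; then
  release="$(. /etc/os-release && echo "${VERSION_ID:-}")"
fi

if is_supported_ubuntu "$release"; then
  echo "OS release $release: supported"
else
  echo "OS release '${release:-unknown}': not 20.04 or 22.04"
fi
echo "docker:     $(command_status docker)"
echo "nvidia-smi: $(command_status nvidia-smi)"
```

A "missing" entry for `docker` or `nvidia-smi` suggests the Docker or NVIDIA driver setup is incomplete. The script does not verify that Docker itself can access the GPUs.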

Step 1: Clone the Repository

git clone git@gitlab.dexmal.com:robotics/dexbotic.git

Step 2: Start Docker

docker run -it --rm --gpus all \
  -v /path/to/dexbotic:/dexbotic \
  dexmal/dexbotic:rl \
  bash
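
Once inside the container, it can help to confirm that the GPUs passed through with `--gpus all` are actually visible. The sketch below assumes `nvidia-smi` is available in the image and relies on `nvidia-smi -L` printing one `GPU <index>: ...` line per device. The helper name `count_gpu_lines` is hypothetical.

```shell
#!/bin/sh
# Hypothetical check to run inside the container; not part of Dexbotic.

# Count stdin lines of the form "GPU <index>: ...", as printed by `nvidia-smi -L`.
# grep -c exits non-zero when the count is 0, so mask that with `|| true`.
count_gpu_lines() {
  grep -c '^GPU [0-9][0-9]*:' || true
}

if command -v nvidia-smi >/dev/null 2>&1; then
  n="$(nvidia-smi -L 2>/dev/null | count_gpu_lines)"
  echo "GPUs visible in container: $n"
  # The prerequisites recommend 8 GPUs for training.
  if [ "$n" -lt 8 ]; then
    echo "note: fewer than the 8 GPUs recommended for training"
  fi
else
  echo "nvidia-smi not found; the container may not have GPU access"
fi
```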

Step 3: Activate Dexbotic Environment

cd /dexbotic
conda activate dexbotic-rl
pip install -e .
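
A quick way to confirm that the environment from the step above is active is to inspect `CONDA_DEFAULT_ENV`, which conda sets on activation. This is a small sketch with a hypothetical helper name, `env_is_active`. It checks only the environment name, not whether the `pip install -e .` step succeeded.

```shell
#!/bin/sh
# Hypothetical helper; not part of Dexbotic.

# Return 0 if the currently active conda environment has the expected name.
env_is_active() {
  [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if env_is_active dexbotic-rl; then
  echo "conda env dexbotic-rl is active"
else
  echo "active conda env is '${CONDA_DEFAULT_ENV:-none}', expected dexbotic-rl"
fi
```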

Launch RL Post-Training

deepspeed playground/benchmarks/libero/libero_simplevla_rl.py \
    --task=train \
    --sft_model_path=/path/to/sft-checkpoint \
    --dataset_name=libero_10
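
Because a run occupies the GPUs for a long time, it may be worth validating the checkpoint path before launching. The sketch below is a dry run under stated assumptions: `print_launch_cmd`, `checkpoint_exists`, and the `SFT_MODEL_PATH` environment variable are hypothetical names, and the script only prints the command shown above with the path filled in. It does not start training.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper; not part of Dexbotic.

# Print the launch command for a checkpoint path and dataset name without running it.
# Only the flags shown in this document are used.
print_launch_cmd() {
  sft_path="$1"
  dataset="${2:-libero_10}"
  printf 'deepspeed playground/benchmarks/libero/libero_simplevla_rl.py --task=train --sft_model_path=%s --dataset_name=%s\n' \
    "$sft_path" "$dataset"
}

# Return 0 only if the checkpoint path exists (file or directory).
checkpoint_exists() {
  [ -e "$1" ]
}

# Placeholder default; replace with the real SFT checkpoint location.
SFT_MODEL_PATH="${SFT_MODEL_PATH:-/path/to/sft-checkpoint}"

if checkpoint_exists "$SFT_MODEL_PATH"; then
  print_launch_cmd "$SFT_MODEL_PATH"
else
  echo "checkpoint not found: $SFT_MODEL_PATH" >&2
fi
```

Printing the command first makes it easy to review the resolved path. The printed line can then be copied and run from the repository root.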

Note: Rollouts in RL post-training must collect enough trajectories for each update step, which can take some time. Please be patient.
