BeingBeyond/Rethink_VLA
Code Repository for the paper "Rethinking Visual-Language-Action Model Scaling: Alignment, Mixture, and Regularization"

1. Pre-trained Models

You can download the pre-trained checkpoints from Hugging Face:
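The Hugging Face link above is the source of truth for the checkpoint location. As a sketch, checkpoints hosted on the Hub can typically be fetched with the `huggingface-cli` tool; the repo id below is a placeholder, not the actual model id:

```shell
# Hypothetical: fill in the real repo id from the Hugging Face link above.
REPO_ID="<org>/<checkpoint-repo>"
LOCAL_DIR="checkpoints"
# Requires: pip install -U "huggingface_hub[cli]"
DOWNLOAD_CMD="huggingface-cli download $REPO_ID --local-dir $LOCAL_DIR"
# Printed rather than executed here, since REPO_ID is a placeholder.
echo "$DOWNLOAD_CMD"
```

Once `REPO_ID` is filled in, running the printed command downloads the checkpoint files into `checkpoints/`.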

2. Pre-training

To run the pre-training stage, use the following scripts:

# Pre-training
bash shell/pretrain-M1-240k.sh

# Stage 2 Pre-training
bash shell/pretrain-M1-240k-stage2.sh

3. Post-training

LIBERO Benchmark

First, download the required LIBERO datasets:

Preprocess the data:

python src/data_postprocessor/libero.py

Run the post-training script:

bash shell/relative-post-libero-full-eef_relative-5shot.sh
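The LIBERO steps above can be chained in a small driver script. This is a sketch only: the command strings are taken from this README, and actual execution is left commented out until the LIBERO datasets have been downloaded.

```shell
set -e
# LIBERO post-training pipeline, in order (commands from this README).
steps=(
  "python src/data_postprocessor/libero.py"                     # preprocess
  "bash shell/relative-post-libero-full-eef_relative-5shot.sh"  # post-train
)
for step in "${steps[@]}"; do
  echo "step: $step"
  # eval "$step"   # uncomment once the LIBERO datasets are in place
done
```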

RoboCasa Benchmark

First, download the RoboCasa dataset:

Preprocess the data:

python src/data_postprocessor/robocasa_human.py

Run the post-training script:

bash shell/relative-post-robocasa-full-eef_relative.sh

4. Evaluation

LIBERO Evaluation

Ensure the LIBERO environment is installed.

Run the evaluation script:

bash shell/eval-libero-relative.sh

RoboCasa Evaluation

Ensure the RoboCasa environment is installed.

Run the evaluation script:

bash shell/eval-robocasa-relative.sh
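Both evaluations follow the same pattern, so they can be looped over. A sketch, with script names taken from this README; actual execution is commented out because it requires the corresponding simulator environment to be installed:

```shell
# Evaluation scripts for each benchmark (names from this README).
eval_scripts=(
  "shell/eval-libero-relative.sh"    # requires the LIBERO environment
  "shell/eval-robocasa-relative.sh"  # requires the RoboCasa environment
)
for script in "${eval_scripts[@]}"; do
  echo "would run: bash $script"
  # bash "$script"   # uncomment once the matching environment is installed
done
```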

Acknowledgments

We thank the authors of the following projects for their contributions to the robotics and machine learning communities:

  • BeingH0.5: VLA framework
  • InternVL: Vision-Language model backbone
  • Bagel: Training framework
  • Qwen: Language model
  • LIBERO: Benchmark for lifelong robot learning
  • RoboCasa: Large-scale simulation benchmark for everyday tasks
