Fine-tuning loss convergence differs markedly across some datasets #20

@Ziyang-Zhang-6657

Description

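Hello, while trying to reproduce your work I ran into a problem. I carefully compared the training and evaluation scripts but could not find the cause:
When training on the Spatial suite, the loss converges well and the test accuracy is normal.
The training command is: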
torchrun \
  --standalone \
  --nnodes 1 \
  --nproc-per-node 1 \
  vla-scripts/finetune.py \
  --vla_path "/opt/data/private/VLA/openvla-7b-oft-finetuned-libero-spatial" \
  --data_root_dir /opt/data/private/EmbodiedAI/VLA/VLA-Adapter/data/libero \
  --dataset_name libero_spatial_no_noops \
  --use_l1_regression True \
  --use_diffusion False \
  --use_film False \
  --num_images_in_input 2 \
  --use_proprio True \
  --batch_size 4 \
  --learning_rate 5e-4 \
  --num_steps_before_decay 30000 \
  --max_steps 40005 \
  --save_freq 20000 \
  --save_latest_checkpoint_only False \
  --image_aug True \
  --lora_rank 32

Training loss curve:

Image

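On the Object, Goal, and 10 (Long) suites, however, the training loss converges noticeably worse, and the test accuracy is 0 on all three.
The training commands are: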
torchrun \
  --standalone \
  --nnodes 1 \
  --nproc-per-node 1 \
  vla-scripts/finetune.py \
  --vla_path "/opt/data/private/VLA/openvla-7b-oft-finetuned-libero-object" \
  --data_root_dir /opt/data/private/EmbodiedAI/VLA/VLA-Adapter/data/libero \
  --dataset_name libero_object_no_noops \
  --use_l1_regression True \
  --use_diffusion False \
  --use_film False \
  --num_images_in_input 2 \
  --use_proprio True \
  --batch_size 4 \
  --learning_rate 5e-4 \
  --num_steps_before_decay 30000 \
  --max_steps 40005 \
  --save_freq 20000 \
  --save_latest_checkpoint_only False \
  --image_aug True \
  --lora_rank 32

torchrun \
  --standalone \
  --nnodes 1 \
  --nproc-per-node 1 \
  vla-scripts/finetune.py \
  --vla_path "/opt/data/private/VLA/openvla-7b-oft-finetuned-libero-goal" \
  --data_root_dir /opt/data/private/EmbodiedAI/VLA/VLA-Adapter/data/libero \
  --dataset_name libero_goal_no_noops \
  --use_l1_regression True \
  --use_diffusion False \
  --use_film False \
  --num_images_in_input 2 \
  --use_proprio True \
  --batch_size 4 \
  --learning_rate 5e-4 \
  --num_steps_before_decay 30000 \
  --max_steps 40005 \
  --save_freq 20000 \
  --save_latest_checkpoint_only False \
  --image_aug True \
  --lora_rank 32

torchrun \
  --standalone \
  --nnodes 1 \
  --nproc-per-node 1 \
  vla-scripts/finetune.py \
  --vla_path "/opt/data/private/VLA/openvla-7b-oft-finetuned-libero-10" \
  --data_root_dir /opt/data/private/EmbodiedAI/VLA/VLA-Adapter/data/libero \
  --dataset_name libero_10_no_noops \
  --use_l1_regression True \
  --use_diffusion False \
  --use_film False \
  --num_images_in_input 2 \
  --use_proprio True \
  --batch_size 4 \
  --learning_rate 5e-4 \
  --num_steps_before_decay 30000 \
  --max_steps 40005 \
  --save_freq 20000 \
  --save_latest_checkpoint_only False \
  --image_aug True \
  --lora_rank 32

Training loss curves:

Image Image Image
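
To make the comparison between the working and failing runs easier to check, here is a minimal sketch that rebuilds the four commands from a single template and lists the arguments that differ between any two of them. The helper names (`build_command`, `differing_args`) are hypothetical and not part of the repository; the paths and flags are copied verbatim from the commands above. It only constructs and prints the argument lists; it does not launch training.

```python
from __future__ import annotations

import shlex

# Suite suffixes used in the four commands above.
SUITES = ["spatial", "object", "goal", "10"]

# Flags that are identical in all four runs (copied from the commands above).
COMMON_FLAGS = [
    "--use_l1_regression", "True",
    "--use_diffusion", "False",
    "--use_film", "False",
    "--num_images_in_input", "2",
    "--use_proprio", "True",
    "--batch_size", "4",
    "--learning_rate", "5e-4",
    "--num_steps_before_decay", "30000",
    "--max_steps", "40005",
    "--save_freq", "20000",
    "--save_latest_checkpoint_only", "False",
    "--image_aug", "True",
    "--lora_rank", "32",
]


def build_command(suite: str) -> list[str]:
    """Hypothetical helper: return the torchrun argv for one LIBERO suite."""
    return [
        "torchrun", "--standalone", "--nnodes", "1", "--nproc-per-node", "1",
        "vla-scripts/finetune.py",
        "--vla_path",
        f"/opt/data/private/VLA/openvla-7b-oft-finetuned-libero-{suite}",
        "--data_root_dir",
        "/opt/data/private/EmbodiedAI/VLA/VLA-Adapter/data/libero",
        "--dataset_name", f"libero_{suite}_no_noops",
        *COMMON_FLAGS,
    ]


def differing_args(a: list[str], b: list[str]) -> list[tuple[str, str]]:
    """Return the position-wise pairs of arguments that differ between two commands."""
    return [(x, y) for x, y in zip(a, b) if x != y]


if __name__ == "__main__":
    for suite in SUITES:
        print(shlex.join(build_command(suite)))
    # Compare the working run (spatial) against each failing run.
    for suite in SUITES[1:]:
        print(suite, differing_args(build_command("spatial"), build_command(suite)))
```

Under this template, the runs differ only in `--vla_path` and `--dataset_name`, which matches what I found when comparing the scripts by hand.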
