Description
Hello, when I resume training the teacher policy from a checkpoint that had already reached a 60% success rate, the new checkpoints I get score a 0% success rate in evaluation. Is this normal? Shouldn't a policy get better the longer it is trained?
The training command I used:
${ISAACLAB_PATH:?}/isaaclab.sh -p scripts/rsl_rl/train_teacher_policy.py \
    --num_envs 4096 \
    --headless \
    --reference_motion_path third_party/human2humanoid/data/h1/amass_all.pkl \
    --teacher_policy.resume_path logs/teacher/25_09_15_19-32-23 \
    --teacher_policy.checkpoint model_76500.pt \
    --teacher_policy.resume
The evaluation command is as follows:
${ISAACLAB_PATH}/isaaclab.sh -p scripts/rsl_rl/eval.py \
    --num_envs 1024 \
    --headless \
    --teacher_policy.resume_path logs/teacher/25_09_15_19-32-23 \
    --teacher_policy.checkpoint model_25000.pt
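
Note that training resumed from model_76500.pt while the evaluation loads model_25000.pt, so it may be worth confirming which checkpoint in the run directory is actually the most recent. Below is a minimal sketch of such a check. It assumes checkpoints are saved as model_<iteration>.pt, as in the commands above; the `latest_checkpoint` helper is hypothetical and not part of the repository.

```shell
# Hypothetical helper: print the checkpoint file with the highest iteration
# number in a run directory. Assumes the model_<iteration>.pt naming scheme.
latest_checkpoint() {
    ls "$1" \
        | grep -E '^model_[0-9]+\.pt$' \
        | sed -E 's/^model_([0-9]+)\.pt$/\1 &/' \
        | sort -n \
        | tail -n 1 \
        | cut -d' ' -f2
}

# Example usage:
#   latest_checkpoint logs/teacher/25_09_15_19-32-23
```

If the file printed differs from the one passed to `--teacher_policy.checkpoint` in the evaluation command, the evaluation may not be measuring the resumed run's newest weights.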