resume train for teacher policy

Hello, when I resume training the teacher policy from a checkpoint that had already achieved a 60% success rate, the new checkpoints I get achieve a 0% success rate when I evaluate them. Is this normal? Shouldn't a policy get better the longer it is trained?
The training command I used:
${ISAACLAB_PATH:?}/isaaclab.sh -p scripts/rsl_rl/train_teacher_policy.py \
    --num_envs 4096 \
    --headless \
    --reference_motion_path third_party/human2humanoid/data/h1/amass_all.pkl \
    --teacher_policy.resume_path logs/teacher/25_09_15_19-32-23 \
    --teacher_policy.checkpoint model_76500.pt \
    --headless --teacher_policy.resume

The evaluation command is as follows:
${ISAACLAB_PATH}/isaaclab.sh -p scripts/rsl_rl/eval.py \
    --num_envs 1024 \
    --headless \
    --teacher_policy.resume_path logs/teacher/25_09_15_19-32-23 \
    --teacher_policy.checkpoint model_25000.pt
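
One thing that may be worth checking before comparing success rates: training resumes from model_76500.pt, while the evaluation command loads model_25000.pt from the same run directory. The sketch below is a hypothetical helper, not part of the repository. It assumes checkpoints are named `model_<N>.pt` (as in the commands above) and that `<N>` is the training iteration, and that numbering continues after a resume rather than restarting from zero. Neither assumption is confirmed here, and the resumed run may write its checkpoints to a different directory.

```python
import re
from pathlib import Path

# Assumed naming scheme, taken from the file names in the commands above.
CKPT_PATTERN = re.compile(r"^model_(\d+)\.pt$")


def checkpoint_iteration(name):
    """Return the integer N from a `model_<N>.pt` file name, or None if it does not match."""
    match = CKPT_PATTERN.match(name)
    return int(match.group(1)) if match else None


def list_checkpoints(run_dir):
    """Return (iteration, file name) pairs found directly in run_dir, sorted by iteration."""
    root = Path(run_dir)
    if not root.is_dir():
        return []
    found = []
    for path in root.glob("model_*.pt"):
        iteration = checkpoint_iteration(path.name)
        if iteration is not None:
            found.append((iteration, path.name))
    return sorted(found)


def is_after_resume(resumed_from, evaluated):
    """True if the evaluated checkpoint has a higher number than the one training resumed from."""
    start = checkpoint_iteration(resumed_from)
    target = checkpoint_iteration(evaluated)
    if start is None or target is None:
        raise ValueError("expected file names of the form model_<N>.pt")
    return target > start


if __name__ == "__main__":
    # Run directory and file names copied from the commands above.
    for iteration, name in list_checkpoints("logs/teacher/25_09_15_19-32-23"):
        print(iteration, name)
    print(is_after_resume("model_76500.pt", "model_25000.pt"))
```

For the two file names in the commands above, the final check prints `False`, since 25000 is lower than 76500. If the numbering assumption holds, that would mean the evaluated file is not one of the checkpoints produced after resuming.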
