Hello!
Thanks so much for sharing the code!
I have a question about sac_discrete_per.py: why are the importance-sampling weights not used when updating the Q-functions and the policy?
weight_update = [min(l1.item(), l2.item()) for l1, l2 in zip(q_value_loss1, q_value_loss2)]
I have seen other SAC-PER repositories that use the weights when computing q_value_loss and policy_loss.
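For concreteness, this is roughly what I mean by using the weights. It is only a plain-Python sketch of the arithmetic, not code from this repository: the function name `per_weighted_q_update` and its arguments are hypothetical, and I am assuming a squared TD error per sample and a mean over the batch. The priority update mirrors the `weight_update` line in sac_discrete_per.py.

```python
def per_weighted_q_update(q1_pred, q2_pred, q_target, is_weights):
    """Hypothetical helper: IS-weighted critic losses plus new priorities for one batch."""
    # Per-sample squared TD errors for each critic (no reduction yet).
    q_value_loss1 = [(q - t) ** 2 for q, t in zip(q1_pred, q_target)]
    q_value_loss2 = [(q - t) ** 2 for q, t in zip(q2_pred, q_target)]

    n = len(is_weights)
    # Importance-sampling correction: scale each sample's loss by its
    # weight before averaging over the batch.
    weighted_loss1 = sum(w * l for w, l in zip(is_weights, q_value_loss1)) / n
    weighted_loss2 = sum(w * l for w, l in zip(is_weights, q_value_loss2)) / n

    # New priorities are taken from the unweighted per-sample losses.
    weight_update = [min(l1, l2) for l1, l2 in zip(q_value_loss1, q_value_loss2)]
    return weighted_loss1, weighted_loss2, weight_update


# Toy batch of three transitions (made-up numbers, for illustration only).
loss1, loss2, priorities = per_weighted_q_update(
    q1_pred=[1.0, 2.0, 3.0],
    q2_pred=[1.5, 2.0, 2.0],
    q_target=[1.0, 1.0, 1.0],
    is_weights=[1.0, 0.5, 0.25],
)
```

With tensors the same idea would be an elementwise product of the weights and the unreduced loss, followed by a mean.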
Thanks.
Code reference: Popular-RL-Algorithms/sac_discrete_per.py, line 162 in 5926245