Can you explain why this reward function (`-cos(theta)-0.1*sin(theta) ...`) is used for the pendulum environment? https://github.com/WilsonWangTHU/POPLIN/blob/edd8dba50f9049c6164eda774602bef0c299cb51/dmbrl/config/gym_pendulum.py#L104 And why does it need to differ from the original reward function in OpenAI Gym?
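For comparison, here is a minimal sketch of the original Gym pendulum reward as I understand it (the negative of a quadratic cost on angle, angular velocity, and torque). The function names `angle_normalize` and `gym_pendulum_reward` are my own labels for illustration, and the sketch does not try to reproduce the POPLIN reward, which is only partially quoted above.

```python
import math


def angle_normalize(theta):
    # Wrap the angle into [-pi, pi) so that upright (theta = 0) has zero cost.
    return ((theta + math.pi) % (2 * math.pi)) - math.pi


def gym_pendulum_reward(theta, theta_dot, torque):
    # Hypothetical helper name; mirrors the cost used by Gym's classic
    # Pendulum environment as far as I can tell:
    # angle^2 + 0.1 * angular_velocity^2 + 0.001 * torque^2
    cost = (
        angle_normalize(theta) ** 2
        + 0.1 * theta_dot ** 2
        + 0.001 * torque ** 2
    )
    return -cost
```

With this form the reward is at most zero and is maximized when the pendulum is upright, still, and no torque is applied, which is why the trigonometric form in the linked config looks different to me.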