Hello,

For the moment, I have to do each of them manually in my own package with this kind of function:

```julia
function Flux.testmode!(lh::LearnedHeuristic, mode = true)
    lh.agent.policy.explorer.is_training = !mode   # freeze the explorer's epsilon value
    lh.trainMode = !mode                           # stop filling the trajectory with evaluation samples
    Flux.testmode!(lh.agent, mode)                 # stop updating the weights and biases
end
```

I am certainly doing something wrong, can someone help me?
Unfortunately, no. And this is by design. Because in RL, the `testmode!` is kind of vague. For example, in some cases we may still want to explore the action space with a small epsilon value, not simply set it to zero.

Assuming you've read the tutorial, you'll see an `Agent` is a wrapper of an `AbstractPolicy`, and it is naturally in training mode. So to test the `Agent`, I usually extract the inner policy and use it to interact with an environment (of course, I still need to do some extra work here, like modifying the exploration rate and setting the model to `testmode!`).

For example:

ReinforcementLearning.jl/src/ReinforcementLearningExperiments/deps/experiments/experiments/DQN/Dopamine_DQN_Atari.jl

Note that the second line created another instance, though I reused the symbol. So back to your question, I would remove the constraint of … Let me know if you are still unsure how to do it.
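
To make the "extract the inner policy" suggestion concrete, here is a minimal sketch of how an evaluation run could look. This is not the code from the linked experiment; the field path `policy.learner.approximator.model`, the `EpsilonGreedyExplorer(0.01)` constructor, the use of `Setfield.@set`, and the `agent`/`env` variables are assumptions about a typical `QBasedPolicy` setup:

```julia
using ReinforcementLearning   # Agent, QBasedPolicy, EpsilonGreedyExplorer, run, ...
using Flux
using Setfield                # @set rebuilds a struct with one field replaced

# `agent` is the trained Agent; using only its inner policy means the
# trajectory is no longer filled during evaluation.
policy = agent.policy

# Swap in a (nearly) greedy explorer instead of mutating the training one.
# This creates a new policy instance; the original agent is left untouched.
policy = @set policy.explorer = EpsilonGreedyExplorer(0.01)

# Put the underlying Flux model into test mode
# (assumed field path: learner -> approximator -> model).
Flux.testmode!(policy.learner.approximator.model)

# Interact with the environment using only the inner policy.
run(policy, env, StopAfterEpisode(10), TotalRewardPerEpisode())
```

Rebuilding the policy with `@set`, rather than flipping flags like `explorer.is_training` in place, keeps the training agent untouched, which is the point of extracting the inner policy for evaluation.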