Commit a8ab515

Update Training-PPO.md
1 parent 4fc3039 commit a8ab515

1 file changed: +1 -0 lines changed

Documents/Training-PPO.md

Lines changed: 1 addition & 0 deletions
@@ -67,6 +67,7 @@ If you already know some policy that is better than random policy, you might giv
 2. In your trainer parameters, set `useHeuristicChance` to a value larger than 0.
 3. Use [TrainerParamOverride](TrainerParamOverride.md) to decrease `useHeuristicChance` over time during training.

+Note that your AgentDependentDecision is only used in training mode. At each step, the chance that an agent with the script attached uses it depends on `useHeuristicChance`.



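To illustrate the behaviour the added note describes, here is a minimal Python sketch. It is not the project's code: it assumes `useHeuristicChance` acts directly as a per-step probability, and the names `choose_action` and `linear_decay` are hypothetical. The linear schedule is only one possible way to decrease the value over time; the actual mechanism of TrainerParamOverride is not shown in this commit.

```python
import random


def choose_action(policy_action, heuristic_action, use_heuristic_chance, rng):
    """Pick the heuristic action with probability use_heuristic_chance.

    Assumption: the chance is a plain per-step probability in [0, 1].
    """
    if rng.random() < use_heuristic_chance:
        return heuristic_action
    return policy_action


def linear_decay(start, end, step, total_steps):
    """Hypothetical schedule: move linearly from start to end over total_steps."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


if __name__ == "__main__":
    rng = random.Random(0)
    total_steps = 1000
    for step in (0, 500, 1000):
        chance = linear_decay(0.5, 0.0, step, total_steps)
        action = choose_action("policy", "heuristic", chance, rng)
        print(step, chance, action)
```

With a chance of 0 the heuristic is never used, and with a chance of 1 it is used at every step, which matches the idea of starting with heavy heuristic guidance and phasing it out as training progresses.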
