Description
Hi,
while working on a PyTorch DQN agent for BSuite experiments, I noticed quite bad results on the mnist and mountain car experiments. I see that a similar question was addressed here, but the thread was closed.
To investigate further, I created a new conda environment, downloaded and installed a fresh copy of BSuite, and ran the DQN agent from the baselines. The only settings I changed were "bsuite_id" (set to "SWEEP") and the save path.
When you compare the results from both agents (my PyTorch DQN and the baseline DQN) with the bar plot on page 16 of the BSuite manuscript, you notice that both agents perform worse on mnist and mountain car and better on catch.

Were there any changes to the environments that I missed? The DQN agent from the manuscript did use the default parameters from the baselines directory, correct?
Thanks,
Peter