Add Efficient Numpy Replay Buffer #17
Conversation
Hm, this is very similar to the replay buffer that I have in the original DrQ (https://github.com/denisyarats/drq/blob/master/replay_buffer.py).
The replay buffer in the original DrQ is actually quite different and should use around 6x the amount of RAM (since each observation there is 9 x 84 x 84 and is saved in both the observation and next_observation np arrays, while my implementation saves each 3 x 84 x 84 frame only once). I have tested my implementation extensively on both my home and lab machines and I have not experienced any slowdown whatsoever ^^
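To illustrate the idea, here is a minimal, hypothetical sketch (not the code from this PR) of the single-frame storage scheme: each 3 x 84 x 84 frame is written once, and the stacked observation and next observation are rebuilt at sample time. Episode boundaries and continuous action vectors are ignored for brevity.

```python
import numpy as np


class FrameReplayBuffer:
    """Hypothetical sketch: store one frame per step, stack frames at sample time."""

    def __init__(self, capacity, frame_shape=(3, 84, 84), frame_stack=3):
        self.capacity = capacity
        self.frame_stack = frame_stack
        # One uint8 frame per environment step, instead of two 9x84x84
        # stacked observations per transition as in the dataloader buffer.
        self.frames = np.empty((capacity, *frame_shape), dtype=np.uint8)
        self.actions = np.empty((capacity,), dtype=np.float32)
        self.rewards = np.empty((capacity,), dtype=np.float32)
        self.idx = 0
        self.full = False

    def add(self, frame, action, reward):
        self.frames[self.idx] = frame
        self.actions[self.idx] = action
        self.rewards[self.idx] = reward
        self.idx = (self.idx + 1) % self.capacity
        self.full = self.full or self.idx == 0

    def _stacked(self, indices):
        # Concatenate the frame_stack most recent frames along the channel axis,
        # producing a (batch, 9, 84, 84) array for the default settings.
        frames = [self.frames[indices - offset]
                  for offset in reversed(range(self.frame_stack))]
        return np.concatenate(frames, axis=1)

    def sample(self, batch_size):
        high = self.capacity if self.full else self.idx
        # Leave a margin so both the stacked obs and next_obs lie in stored history.
        indices = np.random.randint(self.frame_stack - 1, high - 1, size=batch_size)
        obs = self._stacked(indices)
        next_obs = self._stacked(indices + 1)
        return obs, self.actions[indices], self.rewards[indices], next_obs
```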
I've experienced crashes multiple times. The only clue points to the dataloader, but I cannot locate the error. Thanks for your contribution, I'd like to try your implementation!!!
The default replay buffer requires very high RAM and results in frequent crashes due to PyTorch's dataloader memory-leak issue. Thus, I have re-implemented DrQv2's replay buffer efficiently and entirely in NumPy, taking only about 20 GB of RAM to store all 1,000,000 transitions. Moreover, with this implementation there is no need to wait for a trajectory to be completed before adding a new transition to the memory used for sampling.
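As a quick back-of-the-envelope check on the ~20 GB figure (assuming uint8 pixels and counting only the frame storage, which dominates):

```python
import numpy as np

capacity = 1_000_000                      # transitions stored
frame_bytes = int(np.prod((3, 84, 84)))   # 21,168 bytes per uint8 frame
total_gb = capacity * frame_bytes / 1e9
print(f"~{total_gb:.1f} GB")              # ~21.2 GB for the pixel frames alone
```

Actions, rewards, and discounts add comparatively little on top of this.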
FPS with this NumPy implementation appears to be identical (or perhaps very slightly higher) on all machines I have tested it on. Potentially, this could also lead to (very minimal) performance gains, since the agent can now sample replay transitions from its latest trajectory.
I have kept the original dataloader replay buffer as the default. The new replay buffer can be used by running train.py with the replay_buffer=numpy option.
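For reference, an invocation would look roughly like the following; only the replay_buffer=numpy override is stated in this PR, while the task name and general Hydra-style syntax are illustrative assumptions:

```
# default dataloader-based replay buffer
python train.py task=cartpole_swingup

# memory-efficient NumPy replay buffer
python train.py task=cartpole_swingup replay_buffer=numpy
```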