A few reinforcement learning (RL) implementations built on my falling-blocks game.
Imitation learning algorithm trained on my own gameplay (used as a benchmark).
Basic REINFORCE algorithm with an average-reward baseline, giving an unbiased but high-variance agent.
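
As a rough illustration of the REINFORCE item, here is a minimal NumPy sketch of the gradient estimate with an average baseline. It is not the code from this repo: the linear softmax policy, the function names (`discounted_returns`, `reinforce_gradient`), the episode format, and the use of the batch-mean return-to-go as the baseline are all assumptions made for the example.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()


def discounted_returns(rewards, gamma=0.99):
    """Return-to-go G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_gradient(theta, episodes, gamma=0.99):
    """Estimate the policy gradient for a linear softmax policy (assumed form).

    theta:    array of shape (n_features, n_actions)
    episodes: list of (states, actions, rewards) tuples, one per episode
    Returns (gradient averaged over episodes, baseline value).
    """
    all_returns = [discounted_returns(r, gamma) for _, _, r in episodes]
    # Assumed baseline: the mean return-to-go over the whole batch.
    baseline = np.mean(np.concatenate(all_returns))

    grad = np.zeros_like(theta, dtype=float)
    for (states, actions, _), returns in zip(episodes, all_returns):
        for s, a, g in zip(np.asarray(states, dtype=float), actions, returns):
            probs = softmax(s @ theta)
            # d log pi(a|s) / d theta[:, k] = s * (1[k == a] - probs[k])
            dlog = -np.outer(s, probs)
            dlog[:, a] += s
            grad += dlog * (g - baseline)
    return grad / len(episodes), baseline
```

A gradient-ascent step would then be `theta += learning_rate * grad`, where `learning_rate` is a hyperparameter you choose. Subtracting the baseline only rescales each step's contribution around the batch average, which is what reduces variance relative to using raw returns.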