Change model to use recurrent features

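Recurrent neural networks (RNNs) carry a hidden state across time steps, so the policy can condition on past information. This is important for Coup, and may be a more suitable approach than just including the last k actions in the state, since a fixed window discards anything older than k steps.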
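
For comparison, here is a minimal sketch of the "last k actions in the state" approach. The class name `ActionHistory`, the integer action encoding, and the padding value are assumptions for illustration, not code from this repo.

```python
from collections import deque


class ActionHistory:
    """Fixed-size window over the most recent k actions (hypothetical helper)."""

    def __init__(self, k, pad_action=-1):
        self.k = k
        self.pad_action = pad_action  # assumed placeholder for "no action yet"
        self._buf = deque([pad_action] * k, maxlen=k)

    def push(self, action):
        # Appending to a full deque silently drops the oldest entry.
        self._buf.append(action)

    def as_features(self):
        # Oldest action first, most recent action last.
        return list(self._buf)

    def reset(self):
        # Call at the start of each episode.
        self._buf = deque([self.pad_action] * self.k, maxlen=self.k)


history = ActionHistory(k=3)
for a in [0, 1, 2, 3]:
    history.push(a)
window = history.as_features()  # action 0 is no longer visible to the policy
```

Anything that happened more than k steps ago is invisible to the policy here, which is the limitation a recurrent model is meant to avoid.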

Additionally, Ray's RLlib seems to have built-in settings for this:
https://docs.ray.io/en/latest/rllib/rllib-models.html
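
A rough sketch of what the model settings might look like, assuming RLlib's legacy model config keys (`use_lstm`, `lstm_cell_size`, `max_seq_len`, `lstm_use_prev_action`, `lstm_use_prev_reward`). The values are placeholders, and the exact keys should be checked against the Ray version we are on.

```python
# Model settings that ask RLlib to wrap the default network in an LSTM.
# All values below are untuned placeholders.
model_config = {
    "use_lstm": True,               # wrap the base model with an LSTM
    "lstm_cell_size": 128,          # size of the LSTM hidden state
    "max_seq_len": 20,              # max sequence length used when training
    "lstm_use_prev_action": True,   # feed the previous action into the LSTM
    "lstm_use_prev_reward": False,  # do not feed the previous reward
}

# Assumption: this dict is passed as the "model" entry of the algorithm config,
# for example PPOConfig().training(model=model_config) on the older API stack.
training_kwargs = {"model": model_config}
```

With this enabled, the manual last-k-actions features could probably be dropped from the observation, or kept and compared against the recurrent version.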