vanilla-policy-gradient

A PyTorch implementation of vanilla policy gradient with a learned value-function baseline. The code is based on Lecture 3 of Foundations of Deep RL by Pieter Abbeel.
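For readers new to the method, the core quantities can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code from this repository: the function names `discounted_returns` and `advantages` are hypothetical, and it assumes reward-to-go returns with the value estimate subtracted as the baseline.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float) -> List[float]:
    """Reward-to-go: G_t = r_t + gamma * G_{t+1}, computed backwards over one episode."""
    returns: List[float] = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns: List[float], values: List[float]) -> List[float]:
    """Baseline-subtracted returns: A_t = G_t - V(s_t)."""
    return [g - v for g, v in zip(returns, values)]


# Example episode with three steps of reward 1 and gamma = 0.5.
episode_returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
episode_advantages = advantages(episode_returns, values=[1.0, 1.0, 1.0])
```

In a policy gradient update, each advantage weights the log-probability of the action taken at that step, so actions that did better than the baseline predicted are made more likely.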

Installation

conda env create --name <name> -f environment.yml

conda activate <name>

Usage

To train the agent and visualize the results, run the following command:

python train.py

or with custom arguments:

python train.py --env CartPole-v1 --gamma 0.999 --learning_rate 0.001 --episodes 2000 --state_dim 4 --action_dim 2 --hidden_dim 64 --num_hidden_layers 0 --activation relu --shared False

The agent module (policyGradientAgent.py) can also be used independently.

Command Line Arguments

--env: Name of the environment (default: CartPole-v1)
--gamma, -g: Discount factor (default: 0.999)
--learning_rate, -lr: Learning rate (default: 0.001)
--episodes, -e: Number of episodes to train (default: 2000)
--state_dim, -sd: Dimension of the state space (default: 4)
--action_dim, -ad: Dimension of the action space (default: 2)
--hidden_dim, -hd: Dimension of the hidden layer (default: 64)
--num_hidden_layers, -hl: Number of hidden layers (default: 0)
--activation, -a: Activation function (default: relu) (options: relu, tanh)
--shared, -s: Use shared weights for policy and value function (default: False)
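The argument list above could be expressed with `argparse` roughly as follows. This is a sketch under assumptions, not the parser in train.py: `build_parser` and `str2bool` are hypothetical names, and the boolean helper is included because `--shared` is passed a value (`--shared False`), which a plain `type=bool` would parse as true.

```python
import argparse


def str2bool(value: str) -> bool:
    """Hypothetical helper: parse 'True'/'False' style strings into a bool."""
    lowered = value.lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags and defaults listed in this README."""
    parser = argparse.ArgumentParser(description="Train a vanilla policy gradient agent")
    parser.add_argument("--env", type=str, default="CartPole-v1")
    parser.add_argument("--gamma", "-g", type=float, default=0.999)
    parser.add_argument("--learning_rate", "-lr", type=float, default=0.001)
    parser.add_argument("--episodes", "-e", type=int, default=2000)
    parser.add_argument("--state_dim", "-sd", type=int, default=4)
    parser.add_argument("--action_dim", "-ad", type=int, default=2)
    parser.add_argument("--hidden_dim", "-hd", type=int, default=64)
    parser.add_argument("--num_hidden_layers", "-hl", type=int, default=0)
    parser.add_argument("--activation", "-a", choices=["relu", "tanh"], default="relu")
    parser.add_argument("--shared", "-s", type=str2bool, default=False)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
default_args = build_parser().parse_args([])
custom_args = build_parser().parse_args(["-lr", "0.01", "-hd", "128", "--shared", "False"])
```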

Notes

The code only runs on CPU for now. GPU support can be added by moving the model and tensors to the GPU.
The networks only support discrete action spaces for now.
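As a rough illustration of the GPU note, the sketch below picks a device, moves a small network to it, and samples a discrete action. It assumes PyTorch is installed; the `policy` network here is a hypothetical stand-in sized for CartPole-v1 (4 state dimensions, 2 actions), not the model defined in this repository.

```python
import torch
import torch.nn as nn

# Fall back to CPU when no CUDA device is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in policy network; moving it with .to(device) moves its parameters.
policy = nn.Sequential(
    nn.Linear(4, 64),
    nn.ReLU(),
    nn.Linear(64, 2),
).to(device)

# Input tensors must live on the same device as the model.
state = torch.zeros(1, 4, device=device)
logits = policy(state)

# Discrete action space: sample from a categorical distribution over the logits.
dist = torch.distributions.Categorical(logits=logits)
action = dist.sample()
log_prob = dist.log_prob(action)
```

Environment observations typically arrive as NumPy arrays on the CPU, so each one would need converting to a tensor on `device` before the forward pass, and the sampled action moved back with `action.item()` before stepping the environment.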

rob-pitkin/vanilla-policy-gradient

Folders and files

Latest commit

History

Repository files navigation

vanilla-policy-gradient

Installation

Usage

Command Line Arguments

Notes

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages