Guided Policy Optimization

This repository provides an implementation of Guided Policy Optimization (GPO), as introduced in our paper: Guided Policy Optimization under Partial Observability. Our implementation is built upon Brax (https://github.com/google/brax), utilizing its environment and single-file PPO implementation as a foundation.

📁 Repository Structure

Structure of the repository

envs/: Partially observable environments with injected noise, modified from Brax.
scripts/: Training scripts for each environment, using hyperparameters from the GPO paper.
GPO.py: Implementation of Guided Policy Optimization, including both GPO-Penalty and GPO-Clip variants.
config.py: Configuration files for training.
wrappers.py: Environment wrappers for partial observability and preprocessing.

🛠️ Installation

1. Create a new conda environment

conda create -n your_env_name python=3.10
conda activate your_env_name

2. Install Brax

Clone the Brax repository, navigate to the root directory, and run:

pip install --upgrade pip
pip install -e .

3. Install Jax JAX installation depends on your hardware (CPU, GPU, or TPU). Please refer to the official JAX installation guide and follow the instructions that match your setup.

Note: Although a requirements.txt file is provided, it may contain redundant packages. For best results, we recommend following the official installation guides for Brax and JAX.

🚀 Training

To train an agent (e.g., on the Ant environment), run:

cd scripts
chmod +x ./train_ant.sh
./train_ant.sh

Training logs and results will be saved in the directory structure: results/env_name/seed/. You can view progress using TensorBoard.

📖 Citation

If you find this repository or GPO useful in your research, please consider citing our paper:

@article{li2025guided,
  title={Guided Policy Optimization under Partial Observability},
  author={Li, Yueheng and Xie, Guangming and Lu, Zongqing},
  journal={arXiv preprint arXiv:2505.15418},
  year={2025}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Guided Policy Optimization

📁 Repository Structure

🛠️ Installation

🚀 Training

📖 Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
envs		envs
scripts		scripts
GPO.py		GPO.py
LICENSE		LICENSE
README.md		README.md
config.py		config.py
requirements.txt		requirements.txt
wrappers.py		wrappers.py

Folders and files

Latest commit

History

Repository files navigation

Guided Policy Optimization

📁 Repository Structure

🛠️ Installation

🚀 Training

📖 Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages