This repository provides the official implementation of Multi-Agent Guided Policy Optimization (MAGPO), as introduced in our paper.
Our implementation is based on Mava, and follows its concise single-file JAX implementation style. Please refer to the original Mava repository for general infrastructure details and design philosophy.
- Release MAGPO code
- Add support for HAPPO
- Release the CoordSum environment used in the paper
- Release all experimental configurations and results
The installation process is the same as in Mava. We recommend using uv for dependency management.
# Clone the repository
git clone https://github.com/instadeepai/Mava.git
cd Mava
# Create a virtual environment and install all dependencies
uv sync
# Activate the virtual environment
source .venv/bin/activate
To install a GPU- or TPU-aware version of JAX:
uv sync --extra cuda12 # GPU aware JAX
uv sync --extra tpu # TPU aware JAX
Alternatively, with pip, create a virtual environment and then run:
pip install -e ".[cuda12]" # GPU aware JAX (leave out the [cuda12] if you don't have a GPU or are on Mac)
For more detailed installation options, including Docker builds, please refer to Mava's detailed installation guide.
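After installing, it can be useful to confirm which backend JAX actually picked up. The snippet below is a minimal sketch, not part of this repository: `describe_backend` is a hypothetical helper name, and it relies only on the public `jax.devices()` and `jax.default_backend()` calls.

```python
def describe_backend():
    """Return a one-line summary of the JAX backend, or a hint if JAX is missing."""
    try:
        import jax  # imported lazily so the check itself works without JAX
    except ImportError:
        return "JAX is not importable; check that the virtual environment is active."
    # jax.default_backend() reports the platform name (e.g. cpu, gpu, or tpu);
    # jax.devices() lists the devices visible on that backend.
    devices = [str(device) for device in jax.devices()]
    return f"backend={jax.default_backend()} devices={devices}"


if __name__ == "__main__":
    print(describe_backend())
```

If the reported backend is `cpu` on a machine with an accelerator, the GPU- or TPU-aware extra was most likely not installed into the active environment.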
To train a multi-agent system with MAGPO, run one of the system files. For example:
python mava/systems/gpo/anakin/rec_magpo.py
We use Hydra for config management.
Default configurations can be found in the mava/configs/ directory.
To run on a specific environment, use command-line overrides. For example, to train on Level-based Foraging:
python mava/systems/gpo/anakin/rec_magpo.py env=lbf
Training on RWARE with a specific scenario:
python mava/systems/gpo/anakin/rec_magpo.py env=rware env/scenario=tiny-4ag
More examples can be found in Mava's Quickstart notebook.
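The command-line overrides shown above can also be assembled programmatically, which helps when launching several runs in a row. The following is a minimal sketch under stated assumptions: `build_command` and `main` are hypothetical helpers that are not part of this repository, and only the `env` and `env/scenario` overrides from the examples above are used. By default it only prints the commands (a dry run) instead of starting training.

```python
import shlex
import subprocess
import sys

# Path to the system file, as used in the examples above.
SCRIPT = "mava/systems/gpo/anakin/rec_magpo.py"


def build_command(overrides):
    """Return the argv list for one run with the given Hydra overrides."""
    args = [f"{key}={value}" for key, value in overrides.items()]
    return [sys.executable, SCRIPT, *args]


def main(dry_run=True):
    # Each dict is one run; keys and values mirror the overrides shown above.
    runs = [
        {"env": "lbf"},
        {"env": "rware", "env/scenario": "tiny-4ag"},
    ]
    for overrides in runs:
        cmd = build_command(overrides)
        print(shlex.join(cmd))
        if not dry_run:
            # Launch training and stop the sweep if a run fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main(dry_run=True)
```

Calling `main(dry_run=False)` from the repository root would execute the runs sequentially. Any other override key should be checked against the files in mava/configs/ before it is added.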
If you find this repository or GPO useful in your research, please consider citing our paper:
@misc{li2025multiagentguidedpolicyoptimization,
title={Multi-Agent Guided Policy Optimization},
author={Yueheng Li and Guangming Xie and Zongqing Lu},
year={2025},
eprint={2507.18059},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2507.18059},
}