MAGPO

This repository provides the official implementation of Multi-Agent Guided Policy Optimization (MAGPO), as introduced in our paper.

Our implementation is based on Mava and follows its concise single-file JAX implementation style. Please refer to the original Mava repository for general infrastructure details and design philosophy.

📌 TODO

Release MAGPO code
Add support for HAPPO
Release the CoordSum environment used in the paper
Release all experimental configurations and results

🛠️ Installation

The installation process is the same as in Mava. We recommend using uv for dependency management.

# Clone the repository
git clone https://github.com/instadeepai/Mava.git
cd Mava
# Create a virtual environment and install all dependencies
uv sync
# Activate the virtual environment
source .venv/bin/activate

To install a GPU- or TPU-aware version of JAX:

uv sync --extra cuda12  # GPU aware JAX
uv sync --extra tpu  # TPU aware JAX

Alternatively, with pip, create a virtual environment and then run:

pip install -e ".[cuda12]"  # GPU aware JAX (leave out the [cuda12] if you don't have a GPU or are on Mac)

For more installation options, including Docker builds, please refer to Mava's detailed installation guide.

🚀 Training

To train a multi-agent system with MAGPO, run one of the system files. For example:

python mava/systems/gpo/anakin/rec_magpo.py

We use Hydra for config management. Default configurations can be found in the mava/configs/ directory. To run on a specific environment, use command-line overrides. For example, to train on Level-based Foraging:

python mava/systems/gpo/anakin/rec_magpo.py env=lbf

Training on RWARE with a specific scenario:

python mava/systems/gpo/anakin/rec_magpo.py env=rware env/scenario=tiny-4ag

More examples can be found in Mava's Quickstart notebook.
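
If you want to queue several of the runs above one after another, a small launcher script is one option. The sketch below is not part of this repository: the names build_command, launch_all, RUNS, and dry_run are hypothetical and chosen for illustration. It only reuses the script path and the Hydra overrides shown above. By default it prints the commands without running them.

```python
import shlex
import subprocess
import sys

# Script path copied from the training commands above.
SCRIPT = "mava/systems/gpo/anakin/rec_magpo.py"

# Hydra override sets copied from the examples above (one list per run).
RUNS = [
    ["env=lbf"],
    ["env=rware", "env/scenario=tiny-4ag"],
]


def build_command(overrides):
    """Return the argv list for one training run with the given Hydra overrides."""
    return [sys.executable, SCRIPT, *overrides]


def launch_all(runs, dry_run=True):
    """Print each command; run them sequentially only when dry_run is False."""
    commands = [build_command(overrides) for overrides in runs]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the queue at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    launch_all(RUNS, dry_run=True)
```

Run it from the repository root with the virtual environment activated, and switch dry_run to False once the printed commands look right.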

📖 Citation

If you find this repository or GPO useful in your research, please consider citing our paper:

@misc{li2025multiagentguidedpolicyoptimization,
      title={Multi-Agent Guided Policy Optimization}, 
      author={Yueheng Li and Guangming Xie and Zongqing Lu},
      year={2025},
      eprint={2507.18059},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2507.18059}, 
}
