Bi-Level Policy Optimization with Nyström Hypergradients

This repository implements Bi-Level Policy Optimization (BLPO) using Nyström hypergradients. Experiments cover both discrete (gymnax) and continuous (brax) environments.
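For readers new to the term, the following is a minimal sketch of the general Nyström idea for inverse-Hessian-vector products, which are the expensive step in implicit-differentiation hypergradients. It is not taken from this repository's code. The function name `nystrom_inverse_vp`, the explicit dense matrix `H`, the sampled column indices `idx`, and the damping value `rho` are all illustrative assumptions. NumPy is used only to keep the linear algebra visible.

```python
import numpy as np


def nystrom_inverse_vp(H, v, idx, rho):
    """Approximate (H + rho * I)^{-1} v using only the columns of H in idx.

    Hypothetical helper for illustration. H is assumed symmetric positive
    semi-definite, and the k x k block H[idx, idx] is assumed invertible.
    """
    C = H[:, idx]             # n x k block of sampled columns
    W = H[np.ix_(idx, idx)]   # k x k core block
    # Nystrom approximation: H is replaced by C W^{-1} C^T.
    # The Woodbury identity then gives
    #   (rho*I + C W^{-1} C^T)^{-1}
    #     = I/rho - C (W + C^T C / rho)^{-1} C^T / rho^2,
    # so only a k x k linear system has to be solved.
    small = W + (C.T @ C) / rho
    return v / rho - C @ np.linalg.solve(small, C.T @ v) / rho**2


# Toy usage on a random symmetric PSD matrix standing in for a Hessian.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6))
H = A @ A.T
v = rng.normal(size=6)
idx = np.array([0, 2, 4])     # assumed choice of sampled columns
approx = nystrom_inverse_vp(H, v, idx, rho=0.1)
```

In a large model the columns of the Hessian would normally be obtained through Hessian-vector products rather than from a dense matrix; the dense form here is only a sketch of the algebra.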

Prerequisites

Python 3.11 (tested)
JAX: Follow the JAX Installation Guide to set up the appropriate version for your hardware (CPU/GPU).

Installation

Install JAX
Visit the JAX installation guide for detailed instructions on installing JAX for your system.

Install Required Dependencies
Install the necessary packages using pip:

pip install flax==0.10.1 numpy==2.1.3 optax==0.2.3 distrax==0.1.5 gymnax==0.0.8 wandb==0.19.1 brax==0.12.1
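As an optional sanity check after installation, a short script along these lines can compare the installed versions against the pins in the command above. It is a sketch that uses only the Python standard library; the helper name `check_pins` and the `PINS` dictionary are illustrative and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the pip command in this README.
PINS = {
    "flax": "0.10.1",
    "numpy": "2.1.3",
    "optax": "0.2.3",
    "distrax": "0.1.5",
    "gymnax": "0.0.8",
    "wandb": "0.19.1",
    "brax": "0.12.1",
}


def check_pins(pins):
    """Return {name: (expected, installed_or_None, matches)} for each pin."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (expected, installed, installed == expected)
    return report


for name, (expected, installed, ok) in check_pins(PINS).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name}: expected {expected}, found {installed} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only flags a difference from the versions listed here.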

To Run BLPO:

export PYTHONPATH=$(pwd):$PYTHONPATH
python BiLevel_RL/continuous/nystrom_ppo.py
