Bi-Level Policy Optimization with Nyström Hypergradients

Arnie-He/BLPO


This repository implements Bi-Level Policy Optimization (BLPO) using Nyström hypergradients. Experiments cover both discrete (gymnax) and continuous (brax) environments.
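For readers new to the term, the following is a minimal JAX sketch of the general Nyström idea behind such hypergradients: approximate a damped inverse-Hessian-vector product from a few sampled Hessian columns, then invert the low-rank approximation with the Woodbury identity. It is not taken from this repository's code. The function name `nystrom_ihvp`, the damping parameter `rho`, the index argument `idx`, and the toy quadratic loss are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp


def nystrom_ihvp(loss_fn, params, v, idx, rho=0.1):
    """Approximate (H + rho * I)^{-1} v, with H the Hessian of loss_fn at params.

    Hypothetical helper: params and v are flat vectors of length p,
    idx holds the k sampled column indices.
    """
    grad_fn = jax.grad(loss_fn)

    def hvp(u):
        # Hessian-vector product via forward-over-reverse differentiation.
        return jax.jvp(grad_fn, (params,), (u,))[1]

    # Hessian columns H[:, idx], obtained from HVPs with one-hot vectors.
    basis = jax.nn.one_hot(idx, params.shape[0], dtype=params.dtype)  # (k, p)
    C = jax.vmap(hvp)(basis).T  # (p, k)
    W = C[idx, :]  # (k, k) block H[idx, idx]

    # Woodbury identity applied to rho * I + C W^{-1} C^T.
    inner = W + (C.T @ C) / rho
    return v / rho - C @ jnp.linalg.solve(inner, C.T @ v) / rho**2


# Toy check (assumed example): a quadratic loss whose Hessian is the matrix A.
A = jnp.array(
    [
        [2.0, 0.5, 0.0, 0.0],
        [0.5, 3.0, 0.5, 0.0],
        [0.0, 0.5, 4.0, 0.5],
        [0.0, 0.0, 0.5, 5.0],
    ]
)
v = jnp.array([1.0, -1.0, 0.5, 2.0])


def quadratic(x):
    return 0.5 * x @ A @ x


approx = nystrom_ihvp(quadratic, jnp.zeros(4), v, jnp.arange(4), rho=0.1)
exact = jnp.linalg.solve(A + 0.1 * jnp.eye(4), v)
```

In this toy check every column is sampled (k = p), so the approximation matches the exact damped solve up to floating-point error. In practice k would be much smaller than p, trading accuracy for cost.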

Prerequisites

  • Python 3.11 (tested)
  • JAX: Follow the JAX Installation Guide to set up the appropriate version for your hardware (CPU/GPU).
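Once JAX is installed, a quick check like the one below can confirm which backend it picked up. This is a generic sanity check and not part of the repository; the variable names are arbitrary.

```python
import jax

# Report the installed JAX version, the default backend, and visible devices.
backend = jax.default_backend()
devices = jax.devices()

print("JAX version:", jax.__version__)
print("Default backend:", backend)
print("Devices:", devices)
```

If the backend reported is the CPU on a machine with a GPU, the accelerator-enabled JAX build is probably not installed.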

Installation

  1. Install JAX
    Visit the JAX installation guide for detailed instructions on installing JAX for your system.

  2. Install Required Dependencies
    Install the necessary packages using pip:

    pip install flax==0.10.1 numpy==2.1.3 optax==0.2.3 distrax==0.1.5 gymnax==0.0.8 wandb==0.19.1 brax==0.12.1
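Because the versions above are pinned, it can help to confirm the environment matches them before running experiments. The sketch below uses only the standard library; the `PINNED` dictionary simply mirrors the pip command, and `find_mismatches` is a hypothetical helper name, not something provided by this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip command above.
PINNED = {
    "flax": "0.10.1",
    "numpy": "2.1.3",
    "optax": "0.2.3",
    "distrax": "0.1.5",
    "gymnax": "0.0.8",
    "wandb": "0.19.1",
    "brax": "0.12.1",
}


def find_mismatches(pins):
    """Return {package: installed_version_or_None} for every pin not satisfied."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {installed}")
```

An empty result means every pinned package is present at the listed version.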
    
    
    

To run BLPO, execute the following from the repository root:

export PYTHONPATH=$(pwd):$PYTHONPATH
python BiLevel_RL/continuous/nystrom_ppo.py
