
User guide

yonkshi edited this page Nov 25, 2021 · 4 revisions

Basic example (dedo.demo)

We provide a simple way to quickly visualize every task and deformable object that comes with DEDO. dedo.demo is a basic example that allows one to load any task with a basic preset policy (drag the object forward). Here is an example that visualizes the HangBag task. To see the full list of arguments, please refer to Arguments Reference.

python -m dedo.demo --env=HangBag-v1 --viz --debug

DEDO Architecture

assets/imgs/header.jpg

Reinforcement Learning Runner Examples

DEDO provides two examples of integrating and training with popular reinforcement learning libraries.

Stable Baselines 3 (dedo.run_rl_sb3)

dedo.run_rl_sb3 is an example of using a DEDO environment with Stable Baselines 3 to train with various popular RL algorithms. Example:

python -m dedo.run_rl_sb3 --env=HangGarment-v0 \
    --logdir=/tmp/dedo --num_play_runs=3 --num_envs=8 --rl_algo=PPO --viz --debug

This initiates training on HangGarment-v0 with Stable Baselines 3's PPO implementation using 8 vectorized environments.

For more configuration options, please refer to Arguments Reference. For a list of tasks, refer to Tasks Overview.

Ray RLLib (dedo.run_rllib)

dedo/run_rllib.py is an example of using a DEDO environment with RLLib. To see the full list of arguments, please refer to Arguments Reference.

python -m dedo.run_rllib --env=HangGarment-v0 \
    --logdir=/tmp/dedo --num_play_runs=3 --rl_algo=PPO --viz --debug

This initiates training on HangGarment-v0 with RLLib's PPO implementation. For a list of supported algorithms and additional configuration options, please refer to Arguments Reference. For a list of tasks, refer to Tasks Overview.

Visualize Results

One can visualize the training results with TensorBoard:

tensorboard --logdir=/tmp/dedo --bind_all --port 6006 \
  --samples_per_plugin images=1000

Alternatively, one can train with the --enable_wandb flag enabled, and the logs will also be uploaded to Weights & Biases (wandb.ai).

Advanced usage

DEDO uses the standard OpenAI Gym interface, so it can easily be adapted to any reinforcement learning implementation. dedo/demo.py provides a basic way to initialize a DEDO gym environment. Additionally, dedo/utils/train_utils.py provides basic helper functions for logging.
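Because only the standard Gym interface is assumed, any agent's interaction loop follows the usual reset/step pattern. The sketch below illustrates that loop; TinyEnv is a hypothetical stand-in for a DEDO task (the real environments require PyBullet and are created with gym.make after importing dedo):

```python
import numpy as np

class TinyEnv:
    """Stand-in implementing the minimal Gym API that DEDO envs expose."""
    def reset(self):
        self.t = 0
        return np.zeros(3, dtype=np.float32)  # initial observation

    def step(self, action):
        self.t += 1
        obs = np.full(3, self.t, dtype=np.float32)
        reward = -float(np.linalg.norm(action))  # dummy reward
        done = self.t >= 5                       # dummy episode length
        return obs, reward, done, {}

# The same loop works for a real DEDO env, e.g.
# gym.make('HangGarment-v0') after `import dedo`.
env = TinyEnv()
obs, done, total_reward = env.reset(), False, 0.0
while not done:
    action = np.zeros(2, dtype=np.float32)  # replace with your policy(obs)
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(total_reward)
```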

VAE Runner Example (dedo.run_svae)

DEDO provides an example for training a few different flavors of variational autoencoder networks (VAE, SVAE, PRED, and DSA). This example is useful for users who would like to collect samples for self-supervised or representation learning.

python -m dedo.run_svae --env=HangGarment-v0 \
    --logdir=/tmp/dedo --num_play_runs=3 --unsup_algo=SVAE --viz --debug

For a list of supported algorithms and additional configuration options, please refer to Arguments Reference. For a list of tasks, refer to Tasks Overview.
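All of these VAE variants rely on the standard reparameterization trick and a KL regularizer, which can be sketched in a few lines of numpy. The latent size and batch shape here are hypothetical and this is only an illustration of the general technique, not DEDO's implementation:

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z ~ N(mu, sigma^2) via z = mu + sigma * eps, eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=-1)

rng = np.random.default_rng(0)
# Encoder outputs for a batch of 4 observations, latent size 8 (hypothetical).
mu = np.zeros((4, 8))
log_var = np.zeros((4, 8))
z = reparameterize(mu, log_var, rng)
print(z.shape, kl_divergence(mu, log_var))  # KL is 0 for a standard normal
```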

Preset demonstrations (dedo.demo_preset)

DEDO provides a set of hardcoded trajectories that demonstrate the completion of each task. These help users visualize the objective of each task, and can also be useful for expert-demonstration-based learning algorithms.

For example:

images/gifs/HangGarment-v1.gif

python -m dedo.demo_preset  --env=HangBag-v1 --viz --debug

Since the trajectories are hardcoded, not all objects have a preset. Detailed trajectory information can be found in dedo/utils/preset_info.py. For a list of supported algorithms and additional configuration options, please refer to Arguments Reference. For a list of tasks, refer to Tasks Overview.

Data Collection (dedo.datacollect)

There is also a data collection script that collects observation sequences and saves them as numpy arrays. This is useful for preparing a static dataset for training unsupervised learning algorithms.

python -m dedo.datacollect --cam_resolution=400 --env=ProcHangCloth-v0 --max_episode_len=999 --logdir=/tmp/  --dtype='float16'

The above command collects trajectories with episodes of up to 999 steps and stores them as np.float16 arrays in the /tmp/ directory.

In addition to the standard arguments, one can also specify the output file datatype (e.g. float16, uint8) with --dtype, and use --bundle_size to specify how many parallel trajectories are bundled into one npy file. Bundling helps reduce I/O overhead by grouping batches into a single file.
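The bundling idea can be sketched with plain numpy. The file name, shapes, and directory below are hypothetical; see dedo/datacollect.py for the actual on-disk format:

```python
import os
import tempfile
import numpy as np

bundle_size = 4           # trajectories per .npy file (cf. --bundle_size)
episode_len, obs_dim = 16, 32

# Collect a bundle of trajectories and down-cast to float16 (cf. --dtype).
bundle = np.stack([
    np.random.rand(episode_len, obs_dim).astype(np.float16)
    for _ in range(bundle_size)
])

out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, 'bundle_000.npy')  # hypothetical file name
np.save(path, bundle)  # one file per bundle, instead of one per trajectory

loaded = np.load(path)
print(loaded.shape, loaded.dtype)  # (4, 16, 32) float16
```

Reading the whole bundle back is then a single np.load call instead of bundle_size separate file reads.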