Releases: leggedrobotics/rsl_rl
v2.3.2
Overview
Full Changelog: v2.3.1...v2.3.2
Added
- Adds gradient cap for teacher-student distillation by @alessandroassirelli98 in #91
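As an illustration of what a gradient cap typically does, here is a minimal sketch of clipping by global L2 norm in plain Python. It assumes the cap is a norm-based clip; the function name `clip_by_global_norm` and its arguments are hypothetical and are not taken from the library.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale gradient values so that their global L2 norm is at most max_norm.

    Returns the (possibly scaled) gradients and the norm before clipping.
    Hypothetical helper for illustration only, not the rsl_rl API.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already within the cap: leave the gradients untouched.
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


# A gradient with norm 5 is scaled down to norm 1; its direction is preserved.
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
# A small gradient passes through unchanged.
untouched, _ = clip_by_global_norm([0.1, 0.2], max_norm=1.0)
```

Capping the norm rather than each component keeps the update direction intact, which is usually the reason to prefer it over element-wise clamping.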
Fixed
- Fixes unexpected keyword argument learning_rate within RandomNetworkDistillation by @ozhanozen in #87
New Contributors
- @alessandroassirelli98 made their first contribution in #91
- @ozhanozen made their first contribution in #87
v2.3.1
Overview
Full Changelog: v2.3.0...v2.3.1
Added
- Changes ETA to hh:mm:ss format by @renezurbruegg in #75
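For reference, a duration in seconds can be rendered as hh:mm:ss along these lines. This is a minimal sketch; `format_eta` is a hypothetical helper name and not necessarily how the runner implements it.

```python
def format_eta(seconds):
    """Format a non-negative duration in seconds as hh:mm:ss.

    Fractional seconds are truncated. Hours are not wrapped at 24,
    so long runs show e.g. 25:01:01 rather than rolling over to days.
    """
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


short_eta = format_eta(3725)      # 1 h, 2 min, 5 s
long_eta = format_eta(90061.7)    # more than one day
```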
Fixed
- Fixes git repository code storage function by @Mayankm96 in #83
- Fixes padding shape in split_and_pad_trajectories to support arbitrary additional dimensions by @bikcrum in #77
- Disables distribution mean gradient propagation into action noise std for StudentTeacher by @flferretti in #82
New Contributors
- @renezurbruegg made their first contribution in #75
- @flferretti made their first contribution in #82
v2.3.0
Overview
RSL RL now supports distributed training. Additionally, a new distillation algorithm allows for student-teacher training.
Full Changelog: v2.2.4...v2.3.0
Added
- Adds Student-Teacher Distillation by @ClemensSchwarke in bbce4ef
- Adds Distillation for recurrent networks by @ClemensSchwarke in d3dbcc3
- Adds Multi-GPU training for PPO and Distillation by @Mayankm96 in 6f8460a
Fixed
- Changes WandB runner name to the log directory name by @Mayankm96 in b9f9e69
- Renames rnn_hidden_size to rnn_hidden_dim for naming consistency by @ClemensSchwarke
Breaking Changes
- Renamed actor_critic to policy to be more general and align with other architectures and algorithms by @ClemensSchwarke in bbce4ef
v2.2.4
Overview
Full Changelog: v2.2.3...v2.2.4
Fixed
- Accounts for start_iter when computing ETA by @PeterMitrano in #29
- Fixes parsing when the rnd and symmetry configs are not available by @pascal-roth in #72
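To see why start_iter matters for the ETA, consider a run resumed from a checkpoint: the elapsed time covers only the iterations done since the resume, so the per-iteration time must be computed from that count rather than from the absolute iteration index. The sketch below illustrates the idea; the function and argument names are assumptions and do not mirror the runner's code.

```python
def estimate_eta_seconds(current_iter, start_iter, end_iter, elapsed_seconds):
    """Estimate the remaining time for a (possibly resumed) training run.

    current_iter:    index of the iteration that just finished
    start_iter:      first iteration index of this run (non-zero when resuming)
    end_iter:        one past the last iteration index of this run
    elapsed_seconds: wall-clock time spent in this run so far
    """
    completed = current_iter - start_iter + 1   # iterations done in this run
    if completed <= 0:
        raise ValueError("current_iter must be >= start_iter")
    remaining = end_iter - (current_iter + 1)   # iterations still to run
    return elapsed_seconds / completed * remaining


# Resumed at iteration 1000, training until 1500; 100 iterations took 200 s.
eta = estimate_eta_seconds(current_iter=1099, start_iter=1000,
                           end_iter=1500, elapsed_seconds=200.0)

# Ignoring start_iter divides by 1100 instead of 100 and underestimates badly.
naive_eta = 200.0 / (1099 + 1) * (1500 - 1100)
```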
New Contributors
- @PeterMitrano made their first contribution in #29
- @pascal-roth made their first contribution in #72
v2.2.3
Overview
This release adds new PPO parameters that help stabilize training.
Full Changelog: v2.2.2...v2.2.3
Added
- Adds flag for per-batch advantage normalization by @Mayankm96 in #68
- Adds support for log-std parameter in ActorCritic by @Mayankm96 in #67
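The difference between normalizing advantages once over the whole rollout and normalizing them per mini-batch can be sketched as follows. This is a plain-Python illustration under stated assumptions: the names `normalize_advantages`, `iterate_mini_batches`, and `per_batch` are hypothetical, and the population standard deviation is used for simplicity.

```python
import statistics


def normalize_advantages(advantages, eps=1e-8):
    """Shift to zero mean and scale to (approximately) unit standard deviation."""
    mean = statistics.fmean(advantages)
    std = statistics.pstdev(advantages)
    return [(a - mean) / (std + eps) for a in advantages]


def iterate_mini_batches(advantages, batch_size, per_batch):
    """Yield mini-batches of advantages, normalized globally or per batch."""
    if not per_batch:
        # Global variant: normalize once using statistics of the full rollout.
        advantages = normalize_advantages(advantages)
    for start in range(0, len(advantages), batch_size):
        batch = advantages[start:start + batch_size]
        # Per-batch variant: each mini-batch gets its own mean and std.
        yield normalize_advantages(batch) if per_batch else batch


rollout = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
per_batch_batches = list(iterate_mini_batches(rollout, batch_size=3, per_batch=True))
global_batches = list(iterate_mini_batches(rollout, batch_size=3, per_batch=False))
```

With per-batch normalization every mini-batch is centered on zero, whereas with global normalization an individual mini-batch can be systematically positive or negative.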
Fixed
- Fixes mean_entropy logging by dividing by num_updates by @bikcrum in #65
- Corrects disabling of arguments when creating Normal distribution by @Mayankm96 in #69
v2.2.2
Overview
Full Changelog: v2.2.1...v2.2.2
Fixed
- Fixes bug in ActorCriticRecurrent hidden state reset by @jnskkmhr in #50
- Stops gradient propagation through ActorCritic std-dev by @Mayankm96 in #66
- Removes unused attributes from VecEnv in 8818338
- Fixes weight schedule dict for RND in 6909a47
v2.2.0
Overview
This release adds the following additional features to our PPO implementation based on our research:
- Random Network Distillation (RND) - Encourages exploration by adding a curiosity-driven intrinsic reward.
- Symmetry-based Augmentation - Makes the learned behaviors more symmetrical.
We thank the authors of these works for helping to add these contributions to the library.
Full Changelog: v2.1.2...v2.2.0
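One common way to realize symmetry-based augmentation is to append a mirrored copy of every collected sample to the training batch, so the policy sees each situation together with its left-right counterpart. The sketch below assumes that approach; the function names and the toy two-element layout are hypothetical and do not reflect the library's configuration or API.

```python
def augment_with_mirror(observations, actions, mirror_obs, mirror_act):
    """Return the batch extended with mirrored copies of every sample.

    mirror_obs and mirror_act are user-supplied functions that map an
    observation or action to its mirrored counterpart.
    """
    aug_obs = list(observations) + [mirror_obs(o) for o in observations]
    aug_act = list(actions) + [mirror_act(a) for a in actions]
    return aug_obs, aug_act


def swap_left_right(vector):
    """Toy mirror for a hypothetical robot whose vectors are [left, right]."""
    return [vector[1], vector[0]]


obs_batch, act_batch = augment_with_mirror(
    observations=[[1.0, 2.0]],
    actions=[[0.5, -0.5]],
    mirror_obs=swap_left_right,
    mirror_act=swap_left_right,
)
```

For a real robot the mirror functions would also need to flip the sign of lateral quantities, which depends on the specific observation and action layout.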
v2.1.2
Overview
A patch fix for local installation of the library. Previously, it was missing setup.py or setup.cfg, which prevented it from being installed locally in editable mode. We have now added a dummy setup.py to fix this issue.
Full Changelog: v2.1.1...v2.1.2
v2.1.1
Overview
We’re excited to announce that the rsl-rl library is now available on PyPI! You can install it easily with:
pip install rsl-rl-lib
Full Changelog: v2.0.1...v2.1.1
Added
- Adds support for pip packaging by @kellyguo11 in #54
- Updates license in all files and maintainers by @ClemensSchwarke in #56
Fixed
- Saves internal count of EmpiricalNormalization for resuming training by @tasdep in #30
- Fixes error caused by non-UTF-8 characters in git diff by @fan-ziqi in #31
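The sample count matters when resuming because a running normalizer weights each new batch against the number of samples already seen. If only the mean is restored, the first batch after the resume can overwrite the accumulated statistics. The following is a simplified sketch that tracks only a scalar mean; `RunningNormalizer` and its methods are hypothetical stand-ins, not the EmpiricalNormalization class itself.

```python
class RunningNormalizer:
    """Tracks a running mean over batches of scalar samples."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, batch):
        n = len(batch)
        batch_mean = sum(batch) / n
        self.count += n
        # The new batch is weighted by its share of all samples seen so far.
        self.mean += (batch_mean - self.mean) * n / self.count

    def state_dict(self, include_count=True):
        state = {"mean": self.mean}
        if include_count:
            state["count"] = self.count
        return state

    def load_state_dict(self, state):
        self.mean = state["mean"]
        # Without a saved count, the normalizer believes it has seen nothing.
        self.count = state.get("count", 0)


trained = RunningNormalizer()
trained.update([10.0] * 100)

resumed_with_count = RunningNormalizer()
resumed_with_count.load_state_dict(trained.state_dict(include_count=True))
resumed_with_count.update([0.0] * 100)    # old and new data weighted equally

resumed_without_count = RunningNormalizer()
resumed_without_count.load_state_dict(trained.state_dict(include_count=False))
resumed_without_count.update([0.0] * 100)  # old statistics are discarded
```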
New Contributors
- @kellyguo11 made their first contribution in #54
- @ClemensSchwarke made their first contribution in #56
- @tasdep made their first contribution in #30
- @fan-ziqi made their first contribution in #31
v2.0.1
Overview
Full Changelog: v2.0.0...v2.0.1
Fixed
- Fixes RL device setting in the on-policy runner
- Fixes issue with splitting and padding of trajectories for recurrent network architecture training
- Updates wandb and neptune logging by @Mayankm96 in #18