Learning to Swim: Reinforcement Learning for 6-DOF Control of Thruster-driven Autonomous Underwater Vehicles

Note: you are recommended to not be in a conda environment when setting up and running Isaac Sim. If you are in an environment, you can run:

conda deactivate

To install, requires IsaacSim v4.5.0 and IsaacLab v2.2.0:

Install IsaacSim v4.5.0 (https://docs.isaacsim.omniverse.nvidia.com/4.5.0/installation/download.html)
- Download and unzip the archived binaries into a new folder (i.e. "IsaacSim")

Before installing Isaac Lab, check your operating system version (https://isaac-sim.github.io/IsaacLab/main/source/setup/installation/index.html)

Install IsaacLab v2.2.0, for Ubuntu 20.04 (https://isaac-sim.github.io/IsaacLab/main/source/setup/installation/binaries_installation.html#installing-isaac-lab)
This installation README has been validated for commit 0d520b2.
```
git clone https://github.com/isaac-sim/IsaacLab.git
```

Soft link Isaac lab and Isaac sim

cd <IsaacLab_Path> 
ln -s <IsaacSim_Path> _isaac_sim

Install dependencies (assuming Ubuntu here):

# these dependency are needed by robomimic which is not available on Windows
sudo apt install cmake build-essential

Finish installing Isaac Lab:

<IsaacSim_Path>/kit/python/bin/python3 -m pip install --upgrade pip
./isaaclab.sh --install # or "./isaaclab.sh -i"

Verify the installation worked!

# Option 1: Using the isaaclab.sh executable
# note: this works for both the bundled python and the virtual environment
./isaaclab.sh -p scripts/tutorials/00_sim/create_empty.py

Clone this repository:

If using docker container:

cd <IsaacLab_Path>/source/isaaclab_tasks/isaaclab_tasks/direct/isaac-warpauv-env
git clone https://github.com/warplab/isaac-auv-env.git

If using workstation install:

git clone https://github.com/warplab/isaac-auv-env.git
cd <IsaacLab_Path>/source/isaaclab_tasks/isaaclab_tasks/direct/
ln -s <isaac-auv-env_Path> isaac-auv-env

(Note: if using a workstation install, you can follow the docker instructions as well, but the soft link seems cleaner for local development. Docker is painful when working with links)

To run training:

./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-WarpAUV-Direct-v1 --num_envs 2048

Additional notes:

To import a URDF file into USD format for IsaacLab, you can first export a ROS xacro file into URDF, and then import that URDF file into the IsaacSim URDFImporter Workflow.

rosrun xacro xacro --inorder -o <output.urdf> <input.xacro>
./isaaclab.sh -p scripts/tools/convert_urdf.py  <input_urdf> <output_usd> --merge-joints --make-instance

Generally converges in about 400 iterations with 2048 environments and achieves mean total reward ~95-100. Lowering action penalty often helps if there are issues with convergence.

To cite:

@inproceedings{caiLearningSwimReinforcement2025,
  title = {Learning to {{Swim}}: {{Reinforcement Learning}} for 6-{{DOF Control}} of {{Thruster-driven Autonomous Underwater Vehicles}}},
  booktitle = {2025 {{IEEE International Conference}} on {{Robotics}} and {{Automation}} ({{ICRA}})},
  author = {Cai, Levi and Chang, Kevin and Girdhar, Yogesh},
  date = {2025},
  url = {https://arxiv.org/abs/2410.00120},
  eventtitle = {2025 {{IEEE International Conference}} on {{Robotics}} and {{Automation}} ({{ICRA}})}
}

Migration from 4.0 to 4.5 notes
First you must rename the API references (https://isaac-sim.github.io/IsaacLab/main/source/refs/migration.html). There is a python script linked on that site that will do this for you; set the correct directory within the script!
Then, I needed to explicitly configure the obervation_space, action_space, and state_space variables:

observation_space: gym.spaces.Space = gym.spaces.Box(low=-np.inf, high=np.inf, shape=(17,), dtype=np.float64)
action_space: gym.spaces.Space = gym.spaces.Box(low=-1.0, high=1.0, shape=(6,), dtype=np.float64)
state_space: gym.spaces.Space = gym.spaces.Box(low=-np.inf, high=np.inf, shape=(17,), dtype=np.float64)

Finally, I had to set articulation_enabled=False in assets/warpauv.py:

import isaaclab.sim as sim_utils

from isaaclab.assets import RigidObjectCfg

import os
USD_PATH = os.path.join(os.path.dirname(__file__), "../data/warpauv/warpauv.usd")

WARPAUV_CFG = RigidObjectCfg(
    prim_path="{ENV_REGEX_NS}/Robot",
    spawn=sim_utils.UsdFileCfg(
        usd_path=USD_PATH,
        rigid_props=sim_utils.RigidBodyPropertiesCfg(
            disable_gravity=False,
            max_depenetration_velocity=10.0,
            enable_gyroscopic_forces=True,
        ),
        articulation_props=sim_utils.ArticulationRootPropertiesCfg(
            articulation_enabled=False,
        ),

        copy_from_source=False,
    ),
    init_state=RigidObjectCfg.InitialStateCfg(
        pos=(0.0, 0.0, 5),
    )
)
"""Configuration for the WarpAUV."""

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
agents		agents
assets		assets
custom_workflows		custom_workflows
data/warpauv		data/warpauv
imgs		imgs
weights		weights
LICENSE		LICENSE
README.md		README.md
__init__.py		__init__.py
asymmetric_noise_cfg.py		asymmetric_noise_cfg.py
rigid_body_hydrodynamics.py		rigid_body_hydrodynamics.py
thruster_dynamics.py		thruster_dynamics.py
warpauv_env.py		warpauv_env.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Learning to Swim: Reinforcement Learning for 6-DOF Control of Thruster-driven Autonomous Underwater Vehicles

About

Uh oh!

Releases 1

Packages

Contributors 4

Uh oh!

Languages

License

warplab/isaac-auv-env

Folders and files

Latest commit

History

Repository files navigation

Learning to Swim: Reinforcement Learning for 6-DOF Control of Thruster-driven Autonomous Underwater Vehicles

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 4

Uh oh!

Languages

Packages