ContactExplorer is an exploration method for dexterous manipulation. It defines contact as the intersection between object surface points and hand keypoints, and maintains a hash-conditioned counter of which fingers touch which object regions.
- Coverage reward (count-based) rewards novel contact patterns.
- Reaching reward (energy-based) steers the hand toward under-explored regions.
- Results: faster training and higher success on singulation, retrieval, in-hand reorientation, and bimanual tasks, with sim-to-real transfer. See the paper.
- Overview
- Installation
- Training and Evaluation
- Repository Structure
- CCGE Reward Architecture
- Citation
- License
```bash
conda create -n ccge python=3.8  # mamba also works
conda activate ccge
```

Download IsaacGym and extract:

```bash
wget https://developer.nvidia.com/isaac-gym-preview-4
tar -xvzf isaac-gym-preview-4
```

Install the IsaacGym Python API:

```bash
pip install -e isaacgym/python
```

Test the installation:

```bash
python 1080_balls_of_solitude.py  # or
python joint_monkey.py
```

If you hit a libpython error:
- Check the conda path:

  ```bash
  conda info -e
  ```

- Set `LD_LIBRARY_PATH`:

  ```bash
  export LD_LIBRARY_PATH=</path/to/conda/envs/your_env/lib>:$LD_LIBRARY_PATH
  ```
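If IsaacGym still fails to load after the fix above, a quick sanity check is to import the Python bindings directly (nothing project-specific is needed):

```python
# Minimal check that the IsaacGym Python bindings import and initialize.
from isaacgym import gymapi

gym = gymapi.acquire_gym()
print("IsaacGym API acquired:", gym is not None)
```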
Install IsaacGymEnvs and the following dependencies:

```bash
pip install --no-build-isolation -r requirements.txt
```

Two types of dexterous hands are provided (LEAP and Allegro). You may choose one of them to train.
| Task | Training Script |
|---|---|
| Singulation | train_<hand_type>_singulation.sh |
| Table Top | train_<hand_type>_table_top.sh |
| Inhand | train_<hand_type>_inhand.sh |
| Retrieval | train_<hand_type>_cube_in_box.sh |
| Bimanual | train_bimanual.sh |
Set mode=eval and point to a trained run directory. Start from the corresponding train_*.sh and append the eval flags:
```bash
python src/train.py \
    mode=eval \
    task=<TaskName> \
    train=<TrainCfgName> \
    ... \
    --model_dir=logs/PPO/<run_dir> \
    --resume_iter=<checkpoint_iter> \
    --eval_times=5 \
    --vis_env_num=0
```

Notes:
- `--model_dir` is required for evaluation and should contain `model_*.pt`.
- `--resume_iter` is optional (defaults to the latest checkpoint).
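If you are unsure which iterations a run directory contains, a small helper like the one below can list them. This is a convenience sketch that assumes checkpoints are named `model_<iter>.pt`; adjust the pattern if your runs use a different naming scheme.

```python
# Sketch: list available checkpoint iterations in a trained run directory.
import glob
import os
import re

run_dir = "logs/PPO/<run_dir>"  # replace with your trained run directory

iters = []
for path in glob.glob(os.path.join(run_dir, "model_*.pt")):
    match = re.search(r"model_(\d+)\.pt$", os.path.basename(path))
    if match:
        iters.append(int(match.group(1)))

print("valid --resume_iter values:", sorted(iters))
```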
Observation and action spaces are set via Hydra overrides in launch scripts:
```bash
obs_space="['allegro_hand_dof_position']"
action_space="['wrist_translation','wrist_rotation','hand_rotation']"
```

Available keys are task-specific; see the corresponding file in src/tasks/.
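For reference, the sketch below shows how a list of observation keys could be turned into a flat observation tensor. It is purely illustrative; the real key-to-tensor mapping is defined by the task classes in src/tasks/, and the dictionary entries here are placeholders.

```python
# Illustrative only: concatenating the requested observation keys into one tensor.
# The actual mapping from keys to tensors lives in the task code under src/tasks/.
import torch

def build_observation(obs_space, obs_dict):
    """Concatenate the requested observation keys along the last dimension."""
    return torch.cat([obs_dict[key] for key in obs_space], dim=-1)

num_envs = 4
obs_dict = {
    "allegro_hand_dof_position": torch.zeros(num_envs, 16),  # 16-DoF Allegro hand (example)
    "object_root_positions": torch.zeros(num_envs, 3),       # placeholder values
}
obs = build_observation(["allegro_hand_dof_position", "object_root_positions"], obs_dict)
print(obs.shape)  # torch.Size([4, 19])
```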
Training scripts pass a reward_type string to src/train.py, e.g.:
reward_type="target+bonus+success+reach+energy_reach+contact_coverage"
The CCGE exploration signal consists of energy_reach and contact_coverage. To ablate exploration, remove them:
reward_type="target+bonus+success+reach"
- `src/`: core library code
  - `tasks/`: Isaac Gym task environments, reward logic, and curiosity modules
  - `algorithms/`: PPO and intrinsic-reward components
  - `utils/`: config loading, logging, helpers
  - Entry points: `train.py`
- `cfg/`: Hydra configs
  - `task/`: task/environment configs
  - `train/`: training configs
```mermaid
graph TD
    subgraph Inputs
        KP["Hand Keypoints<br/><i>(L points from URDF)</i>"]
        PC["Canonical Object<br/>Point Cloud + Normals<br/><i>(M points)</i>"]
        SFB["State Feature Bank<br/><i>LearnedHashStateBank /<br/>PushBox2DStateBank</i>"]
    end

    PC -->|K-means + FPS| CL["Surface Clusters<br/><i>(K clusters)</i>"]
    PC --> CRM
    CL --> CRM
    KP --> CRM
    SFB -->|state ID| CRM

    subgraph CRM ["CuriosityRewardManager"]
        POT["Energy-based Reaching Reward Φ<br/><i>novelty-weighted kernel</i>"]
        CB["Contact Coverage Reward<br/><i>cluster novelty</i>"]
        RM["Running-Max Tracker<br/><i>per state × keypoint</i>"]
    end

    CRM --> REW["<b>CCGE Reward = Energy-based Reaching Reward + Contact Coverage Reward</b>"]
```
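To make the diagram concrete, here is a simplified, self-contained sketch of the two exploration terms: a count-based coverage bonus over (state, surface cluster, keypoint) contact events, and a novelty-weighted kernel that rewards keypoints for being near under-explored clusters. This is not the CuriosityRewardManager implementation; the counter layout, kernel width, and novelty definition are assumptions, and the running-max bookkeeping from the diagram is omitted.

```python
# Simplified sketch of the CCGE exploration terms (not the actual
# CuriosityRewardManager code; shapes and constants are assumptions).
import torch

class CoverageCounter:
    """Hash-conditioned visit counter over (state ID, surface cluster, keypoint)."""

    def __init__(self, num_states, num_clusters, num_keypoints):
        self.counts = torch.zeros(num_states, num_clusters, num_keypoints)
        self.kp_idx = torch.arange(num_keypoints)

    def coverage_reward(self, state_ids, cluster_ids, contact_mask):
        # state_ids: (N,), cluster_ids: (N, L) cluster hit by each keypoint,
        # contact_mask: (N, L) bool. Count-based bonus ~ 1 / sqrt(visits + 1).
        n = self.counts[state_ids.unsqueeze(1), cluster_ids, self.kp_idx]
        bonus = contact_mask.float() / torch.sqrt(n + 1.0)
        self.counts[state_ids.unsqueeze(1), cluster_ids, self.kp_idx] += contact_mask.float()
        return bonus.sum(dim=1)                               # (N,)

def reaching_reward(keypoints, cluster_centers, cluster_novelty, sigma=0.05):
    # keypoints: (N, L, 3), cluster_centers: (K, 3), cluster_novelty: (K,)
    # Novelty-weighted RBF kernel: being close to under-explored clusters pays more.
    centers = cluster_centers.unsqueeze(0).expand(keypoints.shape[0], -1, -1)
    dists = torch.cdist(keypoints, centers)                   # (N, L, K)
    kernel = torch.exp(-dists ** 2 / (2 * sigma ** 2))
    return (kernel * cluster_novelty).sum(dim=(1, 2))         # (N,)

# CCGE exploration reward = reaching term + coverage term (task weights omitted).
```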
Each task must supply these tensors per step (N = num envs, L = keypoints, M = object points):
| Tensor | Shape | How to get it |
|---|---|---|
| `keypoint_positions_with_offset` | (N, L, 3) | Index rigid-body states by keypoint link indices, apply local offsets via `quat_apply` |
| `keypoint_contact_mask` | (N, L) bool | `(dist_to_surface < threshold) & (contact_force > threshold)` |
| `object_root_positions` | (N, 3) | From `root_states` |
| `object_root_orientations` | (N, 4) | From `root_states` (xyzw quaternion) |
| Canonical point cloud | (M, 3) | Loaded from dataset (object frame) |
| Canonical normals | (M, 3) | Loaded from dataset |
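As a starting point, the sketch below shows one way to produce the first two tensors inside a task's post-physics step. It assumes the standard Isaac Gym rigid-body state layout (position in components 0:3, xyzw quaternion in 3:7) and uses hypothetical names for the link indices, offsets, and thresholds; adapt them to your task class.

```python
# Sketch only: keypoint positions with local offsets and the contact mask.
# Names like keypoint_link_indices / keypoint_offsets are hypothetical; adapt
# them to your task. Assumes rigid_body_states has shape (N, num_bodies, 13)
# with position in [0:3] and an xyzw quaternion in [3:7].
import torch
from isaacgym.torch_utils import quat_apply

def compute_keypoint_tensors(rigid_body_states, keypoint_link_indices, keypoint_offsets,
                             dist_to_surface, contact_force,
                             dist_threshold=0.01, force_threshold=0.1):
    kp_states = rigid_body_states[:, keypoint_link_indices, :]   # (N, L, 13)
    kp_pos = kp_states[..., 0:3]                                 # (N, L, 3) link origins
    kp_rot = kp_states[..., 3:7]                                 # (N, L, 4) xyzw quaternions

    # Rotate each local offset into the world frame and add it to the link origin.
    offsets = keypoint_offsets.unsqueeze(0).expand_as(kp_pos)    # (N, L, 3)
    keypoint_positions_with_offset = kp_pos + quat_apply(kp_rot, offsets)

    # A keypoint counts as "in contact" when it is near the object surface and
    # its link registers a non-trivial contact force.
    keypoint_contact_mask = (dist_to_surface < dist_threshold) & (contact_force > force_threshold)
    return keypoint_positions_with_offset, keypoint_contact_mask
```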
For the full step-by-step integration guide (keypoint setup, contact mask, CuriosityRewardManager init, reward computation, reset handling, and config), see src/tasks/README.md.
This repository builds upon or incorporates code from the following open-source projects:
- UniDexFPM for hand-arm task environments.
- IsaacGymEnvs for base task environments and Isaac Gym utilities.
- WoCoCo for reference implementations of intrinsic-reward baselines.
- ARCTIC and ContactDB for simulation assets.
Please refer to the respective repositories and their licenses for more details.
If you find our work useful, please consider citing us!
```bibtex
@article{liu2026contactcoverageguidedexplorationgeneralpurpose,
  title={Contact Coverage-Guided Exploration for General-Purpose Dexterous Manipulation},
  author={Zixuan Liu and Ruoyi Qiao and Chenrui Tie and Xuanwei Liu and Yunfan Lou and Chongkai Gao and Zhixuan Xu and Lin Shao},
  year={2026},
  journal={arXiv preprint arXiv:2603.10971},
}
```
This project is licensed under the MIT License - see the LICENSE file for details.



