# An Orchestration Learning Framework for Ultrasound Imaging: Prompt-Guided Hyper-Perception and Attention-Matching Downstream Synchronization
This repository provides the official PyTorch implementation for our work published in Medical Image Analysis, 2025. The framework introduces:
- Prompt-Guided Hyper-Perception for incorporating prior domain knowledge via learnable prompts.
- Attention-Matching Downstream Synchronization to seamlessly transfer knowledge across segmentation and classification tasks.
- Support for diverse ultrasound datasets with both segmentation and classification annotations (the $M^2$-US dataset).
- Distributed training and inference pipelines based on the Swin Transformer backbone.
For more details, please refer to the paper (temporary free link, expires on July 17, 2025) and the Project Page:
An orchestration learning framework for ultrasound imaging: Prompt-guided hyper-perception and attention-matching Downstream Synchronization
Zehui Lin, Shuo Li, Shanshan Wang, Zhifan Gao, Yue Sun, Chan-Tong Lam, Xindi Hu, Xin Yang, Dong Ni, and Tao Tan. Medical Image Analysis, 2025.
## Installation

- Clone the repository

```bash
git clone https://github.com/Zehui-Lin/PerceptGuide
cd PerceptGuide
```

- Create a conda environment

```bash
conda create -n PerceptGuide python=3.10
conda activate PerceptGuide
```

- Install the dependencies

```bash
pip install -r requirements.txt
```

## Data Preparation

Organize your data directory with classification and segmentation sub-folders. Each sub-folder should contain a `config.yaml` file and train/val/test lists:
```
data
├── classification
│   └── DatasetA
│       ├── 0
│       ├── 1
│       ├── config.yaml
│       ├── train.txt
│       ├── val.txt
│       └── test.txt
└── segmentation
    └── DatasetB
        ├── imgs
        ├── masks
        ├── config.yaml
        ├── train.txt
        ├── val.txt
        └── test.txt
```
Use the examples provided in the codebase as a reference when preparing new datasets.
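If it helps, the following minimal sketch (not part of the repository's tooling) checks a `data/` tree against the layout above. The folder and file names simply mirror the example tree; the class sub-folders of classification datasets (`0`, `1`, ...) are assumed to vary per dataset, so only the shared files are verified:

```python
from pathlib import Path

REQUIRED_FILES = ["config.yaml", "train.txt", "val.txt", "test.txt"]

def check_dataset_layout(root: Path) -> None:
    """Report whether each dataset folder contains the expected files."""
    for task, extra_dirs in [("classification", []), ("segmentation", ["imgs", "masks"])]:
        task_dir = root / task
        if not task_dir.is_dir():
            print(f"[{task}] folder not found under {root}")
            continue
        for dataset in sorted(p for p in task_dir.iterdir() if p.is_dir()):
            # Every dataset needs a config file and the three split lists.
            missing = [f for f in REQUIRED_FILES if not (dataset / f).is_file()]
            # Segmentation datasets additionally need image and mask folders.
            missing += [d for d in extra_dirs if not (dataset / d).is_dir()]
            status = "OK" if not missing else "missing: " + ", ".join(missing)
            print(f"[{task}] {dataset.name}: {status}")

if __name__ == "__main__":
    check_dataset_layout(Path("data"))
```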
## Datasets

The repository bundles several ultrasound datasets; their licenses and redistribution conditions are listed below. The preprocessed datasets whose licenses permit redistribution can be downloaded from here.
| Dataset | License | Redistribution | Access |
|---|---|---|---|
| Appendix | CC BY-NC 4.0 | Included in repo | link |
| BUS-BRA | CC BY 4.0 | Included in repo | link |
| BUSIS | CC BY 4.0 | Included in repo | link |
| UDIAT | Private License | Not redistributable | link |
| CCAU | CC BY 4.0 | Included in repo | link |
| CUBS | CC BY 4.0 | Included in repo | link |
| DDTI | Unspecified License | License unclear | link |
| TN3K | Unspecified License | License unclear | link |
| EchoNet-Dynamic | Private License | Not redistributable | link |
| Fatty-Liver | CC BY 4.0 | Included in repo | link |
| Fetal_HC | CC BY 4.0 | Included in repo | link |
| MMOTU | CC BY 4.0 | Included in repo | link |
| kidneyUS | CC BY-NC-SA | Included in repo | link |
| BUSI | CC0 Public Domain | Included in repo | link |
| HMC-QU | CC BY 4.0 | Included in repo | link |
| TG3K | Unspecified License | License unclear | link |
**Notes**
- Private-license datasets (UDIAT, EchoNet-Dynamic) cannot be redistributed here; please request access through the provided links.
- Unspecified/unclear-license datasets (TN3K, TG3K, DDTI) may have redistribution restrictions. Download them directly from the source or contact the data owners for permission.
**Critical Update (2025-11-10):** The v1.0.0 data bundle contained significant issues in the BUS-BRA (incorrect patient-level splits) and Appendix (labeling errors) datasets. A corrected version, v1.0.1, has been released; we strongly advise all users to download the latest version. The corrected and preprocessed datasets are available from our latest release.
## Training

We employ `torch.distributed` for multi-GPU training (a single GPU is also supported); set `--nproc_per_node` to the number of available GPUs:

```bash
python -m torch.distributed.launch --nproc_per_node=1 --master_port=1234 omni_train.py --output_dir exp_out/trial_1 --prompt
```

## Testing

For evaluation, run:

```bash
python -m torch.distributed.launch --nproc_per_node=1 --master_port=1234 omni_test.py --output_dir exp_out/trial_1 --prompt
```

## Pretrained Backbone

Download the Swin Transformer backbone and place it in `pretrained_ckpt/`:
The folder structure should look like:
```
pretrained_ckpt
└── swin_tiny_patch4_window7_224.pth
```
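As an optional sanity check that the downloaded backbone loads correctly (again, a sketch rather than part of the repository's scripts; the top-level key names depend on the released file):

```python
import torch

# Load on the CPU so the check does not require a GPU. On newer PyTorch
# versions you may need to pass weights_only=False if the file contains
# non-tensor objects.
ckpt = torch.load("pretrained_ckpt/swin_tiny_patch4_window7_224.pth", map_location="cpu")

# The released Swin checkpoints are typically dictionaries; printing the
# top-level entries is a quick way to confirm the file is not corrupted.
if isinstance(ckpt, dict):
    print("Top-level keys:", list(ckpt.keys()))
else:
    print("Loaded object of type:", type(ckpt).__name__)
```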
## Notice on Pre-trained Model Availability
To ensure full compliance with the Research Use Agreement of the EchoNet-Dynamic dataset, which was used in training, the pre-trained model checkpoint previously available for download has been removed.
The dataset's license agreement places restrictions on the distribution of derivative works. We apologize for any inconvenience this may cause and encourage users to train the model from scratch using the provided source code.
## Citation

If you find this project helpful, please consider citing:
```bibtex
@article{lin2025orchestration,
  title={An orchestration learning framework for ultrasound imaging: Prompt-guided hyper-perception and attention-matching Downstream Synchronization},
  author={Lin, Zehui and Li, Shuo and Wang, Shanshan and Gao, Zhifan and Sun, Yue and Lam, Chan-Tong and Hu, Xindi and Yang, Xin and Ni, Dong and Tan, Tao},
  journal={Medical Image Analysis},
  pages={103639},
  year={2025},
  publisher={Elsevier}
}
```

## Acknowledgements

This repository is built upon the Swin-Unet codebase. We thank the authors for making their work publicly available.