🎉 This is the official implementation of our IEEE TASLP paper:
UL-UNAS: Ultra-Lightweight U-Nets for Real-Time Speech Enhancement via Network Architecture Search
- [2026-2-13] Added streaming inference code
ulunas_onnximplemented by Kailai Shen. - [2026-2-4] The manuscipt is accepted by IEEE TASLP.
- [2026-2-3] The updated paper is uploaded to arxiv.
- [2026-2-1] The pre-trained checkpoint is released.
- [2026-1-28] The model implementation is released.
To run inference on audio files, use:
python inference --input_dir <input_dir> --output_dir <output_dir> [options]| Argument | Requirement / Default | Description |
|---|---|---|
--input_dir |
required | Path to the input directory containing audio files. |
--output_dir |
required | Path to the output directory where enhanced files will be saved. |
--device |
default: cuda:0 |
Torch device to run inference on, e.g., cuda:0, cuda:1, or cpu. |
--extension |
default: .wav |
Audio file extension to process. |
The training script can refer to the SEtrain repository.
@ARTICLE{ulunas,
author={Rong, Xiaobin and Yang, Leyan and Wang, Dahan and Hu, Yuxiang and Zhu, Changbao and Chen, Kai and Lu, Jing},
journal={IEEE Transactions on Audio, Speech and Language Processing},
title={UL-UNAS: Ultra-Lightweight U-Nets for Real-Time Speech Enhancement via Network Architecture Search},
year={2026},
volume={34},
number={},
pages={1085-1096},
keywords={Computational modeling;Computer architecture;Convolution;Speech enhancement;Computational efficiency;Performance evaluation;Computational complexity;Artificial intelligence;Time-frequency analysis;Time-domain analysis;Speech enhancement;ultra-lightweight;neural architecture search;computational complexity},
doi={10.1109/TASLPRO.2026.3661271}}Xiaobin Rong: xiaobin.rong@smail.nju.edu.cn