An efficient end-to-end workflow for deploying DNNs on SoC/FPGA devices, integrating hyperparameter tuning through Bayesian optimization with an ensemble of compression techniques (quantization, pruning, and knowledge distillation).
The workflow comprises three stages: DNN training and compression, integration with a hardware synthesis tool for ML, and hardware assessment.
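To make the quantization component of the first stage concrete, here is a minimal sketch of symmetric uniform ("fake") quantization in pure Python. This is illustrative only and not the repository's implementation; the bit widths and weight values are made up for the example.

```python
def quantize_uniform(weights, n_bits):
    """Uniformly quantize a list of floats to a signed n-bit grid.

    Symmetric quantization: the scale maps the largest-magnitude weight
    onto the largest representable integer, 2**(n_bits - 1) - 1.
    Returns the dequantized ("fake-quantized") weights and the scale.
    """
    qmax = 2 ** (n_bits - 1) - 1          # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights), 1.0
    scale = max_abs / qmax
    # round each weight to the nearest grid point, then map back to float
    quantized = [round(w / scale) * scale for w in weights]
    return quantized, scale

weights = [0.51, -0.32, 0.08, -1.27]
q8, scale8 = quantize_uniform(weights, 8)   # nearly lossless at 8 bits
q3, scale3 = quantize_uniform(weights, 3)   # coarse 3-bit grid
```

At 8 bits the rounding error is at most half a grid step (`scale8 / 2`), while at 3 bits small weights collapse onto the same grid point, which is the accuracy/resource trade-off the workflow's tuning stage explores.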
Check the requirements.txt file inside the 00-environment folder.
- Vivado Design Suite - HLx Editions 2019.1, 2019.2, 2022.2
Repository tree: the workflow comprises the following folders:
- 00-environment
- 01-compressionAndTraining
- 02-hls4mlIntegration
- 03-assessmentFramework
- 04-integrationPYNQ*
- *The integrationPYNQ folder is for those who want to integrate the ML IP core into the PYNQ framework. It is available upon reasonable request.
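As an illustration of the pruning technique handled in the training-and-compression stage (01-compressionAndTraining), here is a minimal magnitude-pruning sketch in pure Python. It is not the repository's implementation; the target sparsity and weights are example values.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out (approximately) the smallest-magnitude fraction
    `sparsity` of the weights; ties at the threshold are also pruned."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # threshold = magnitude of the n_prune-th smallest |w|
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.9, -0.05, 0.4, 0.02, -0.7, 0.1]
pruned = prune_by_magnitude(w, 0.5)   # half the weights set to zero
```

In practice, pruning like this is interleaved with retraining so the surviving weights can compensate for the removed ones.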
| Branch | Purpose |
|---|---|
| main | Clean refactor of original code |
| backup_original | Legacy structure linked to initial publication |
| full_compression | Combined pipeline for QAT + KD + pruning in a single loop |
| coming soon... | Separate branches for isolated quantization, pruning, and KD |
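The full_compression branch combines QAT, pruning, and knowledge distillation in one loop. As a sketch of the distillation term alone, here is the standard Hinton-style KD loss in pure Python (illustrative; the `alpha` and `temperature` values are assumptions, not the repository's settings):

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=4.0):
    """Weighted sum of hard-label cross-entropy and temperature-softened
    KL divergence to the teacher (the soft term is scaled by T**2)."""
    # hard term: cross-entropy against the true label
    hard_loss = -math.log(softmax(student_logits)[label])
    # soft term: KL(teacher || student) at temperature T
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    soft_loss = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s))
    return alpha * hard_loss + (1 - alpha) * (temperature ** 2) * soft_loss
```

When the student matches the teacher exactly, the soft term vanishes and only the hard cross-entropy remains, which is a quick sanity check on any KD implementation.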
This codebase is linked to a publication; if you use it, please cite:
@ARTICLE{10360204,
author={Molina, Romina Soledad and Morales, Iván René and Crespo, Maria Liz and Costa, Veronica Gil and Carrato, Sergio and Ramponi, Giovanni},
journal={IEEE Embedded Systems Letters},
title={An End-to-End Workflow to Efficiently Compress and Deploy DNN Classifiers on SoC/FPGA},
year={2024},
volume={16},
number={3},
pages={255-258},
doi={10.1109/LES.2023.3343030}}
Built on top of:
- hls4ml
Have fun!! And remember: this is a methodology to facilitate the training and compression process when targeting resource-constrained devices; it is not (yet ;) ) an automatic process.


