Machine learning accelerators have been used extensively to compute models with high performance and low power. Unfortunately, the development pace of ML models is much faster than the accelerator design cycle, leading to frequent changes in the hardware architecture requirements, rendering many accelerators obsolete. Existing design tools and frameworks can provide quick accelerator prototyping, but only for a limited range of models that fit into a single hardware device. With the emergence of large language models such as GPT-3, there is an increased need for hardware prototyping of large models within a many-accelerator system to ensure the hardware can scale with ever-growing model sizes.
MASE provides an efficient and scalable approach for exploring accelerator systems that compute large ML models, by directly mapping models onto an efficient streaming accelerator system. Over a set of ML models, MASE achieves better energy efficiency than GPUs when computing inference for recent transformer models.
- Fast Prototyping Next-Generation Accelerators for New ML Models using MASE: ML Accelerator System Exploration, link

```bibtex
@article{cheng2023fast,
  title={Fast prototyping next-generation accelerators for new ml models using mase: Ml accelerator system exploration},
  author={Cheng, Jianyi and Zhang, Cheng and Yu, Zhewen and Montgomerie-Corcoran, Alex and Xiao, Can and Bouganis, Christos-Savvas and Zhao, Yiren},
  journal={arXiv preprint arXiv:2307.15517},
  year={2023}
}
```

- MASE: An Efficient Representation for Software-Defined ML Hardware System Exploration, link

```bibtex
@article{zhangmase,
  title={MASE: An Efficient Representation for Software-Defined ML Hardware System Exploration},
  author={Zhang, Cheng and Cheng, Jianyi and Yu, Zhewen and Zhao, Yiren}
}
```
This repo contains the following directories:
- `components` - Internal hardware library
- `scripts` - Installation scripts
- `machop` - MASE's software stack
- `hls` - HLS component of MASE
- `mlir-air` - MLIR AIR for ACAP devices
- `docs` - Documentation
- `Docker` - Docker container configurations
First, make sure the repo is up to date:
```shell
make sync
```

Then start the Docker container by running the following command from the repository root:
```shell
make shell
```

It may take a long time to build the Docker container for the first time. Once it finishes, you should be inside the container. To build the tool, run the following command:
```shell
cd /workspace
make build
```

This may also take a long time to finish.
If you would like to contribute, please check the wiki for more information.
First, make sure the repo is up to date:
```shell
make sync
```

Install Conda by following the instructions here, then build and activate the environment as follows:
```shell
conda env create -f machop/environment.yml
conda activate mase
```

Optionally, you can verify that the CLI utility is working:
```shell
./machop/ch --version
```

In this example, we'll use the command-line interface (CLI) to train a toy model on the Jet Substructure (JSC) dataset, quantize it to integer arithmetic, and evaluate the quantized model on the test split of the dataset.
First, train the toy model over 10 epochs by running the following command.
```shell
./machop/ch train jsc-tiny jsc --max-epochs 10 --batch-size 256
```

Now, transform the model to integer arithmetic and save the model checkpoint. The TOML configuration file specifies the required arguments for the quantization flow.
```shell
./machop/ch transform jsc-tiny jsc --config ./machop/configs/examples/jsc_toy_by_type.toml
```

Finally, evaluate the performance of the quantized model.
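As a rough illustration of what a quantization configuration for this flow can contain, a sketch is shown below. The key names here are assumptions based on common fixed-point quantization settings, not the verified schema; always consult the actual `machop/configs/examples/jsc_toy_by_type.toml` file in the repository.

```toml
# Hypothetical sketch of a by-type integer quantization config.
# All key names below are illustrative assumptions.
[passes.quantize]
by = "type"                  # group modules for quantization by layer type

[passes.quantize.default.config]
name = "integer"             # fixed-point (integer) arithmetic
data_in_width = 8            # total bits for activations
data_in_frac_width = 4       # fractional bits for activations
weight_width = 8             # total bits for weights
weight_frac_width = 4        # fractional bits for weights
bias_width = 8               # total bits for biases
bias_frac_width = 4          # fractional bits for biases
```

Splitting each tensor's precision into a total width and a fractional width is a common way to describe fixed-point formats: the fractional width fixes where the binary point sits within the total bit budget.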
```shell
./machop/ch test jsc-tiny jsc --load <path/to/checkpoint>
```

See the Machop README for a more detailed introduction.
- Subscribe to the Mase Weekly Dev Meeting (Wednesdays, 4:30pm UK time). Everyone is welcome!
- Direct Google Meet link
- Join the Mase Slack
- If you want to discuss anything at future meetings, please add it as a comment in the meeting agenda so we can review and include it.
If you find MASE helpful, please consider donating to support our work; we appreciate your support!

