l1cacheDell/cache-oblivious-gemm

Cache-Oblivious SGEMM

Prerequisites

# general preparation
python -m pip install -U pip setuptools wheel
python -m pip install -U pybind11 numpy
sudo apt install -y build-essential

# install package
python setup.py install

Running the benchmark

Before running the benchmark, configure OpenMP so that each thread is bound to its own physical CPU core.

To do this, export the following environment variables:

# env
export OMP_NUM_THREADS=<number of physical CPU cores> # 8 recommended
export OMP_PLACES=cores                               # mapping to physical cores
export OMP_PROC_BIND=spread                           # spread to different places
export OMP_DYNAMIC=FALSE                              # disable dynamic adjustment of the thread count
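
Because a missing or mistyped variable silently changes thread placement, it can help to verify the environment before starting a run. The following is a minimal sketch, not part of this repository: the helper name `check_omp_env` is hypothetical, and it assumes `OMP_NUM_THREADS` is a single positive integer (OpenMP also accepts a comma-separated list for nested parallelism, which this sketch would flag).

```python
import os

# Expected values copied from the export lines above. OMP_NUM_THREADS is
# machine-specific, so it is only checked for being a positive integer.
EXPECTED = {
    "OMP_PLACES": "cores",
    "OMP_PROC_BIND": "spread",
    "OMP_DYNAMIC": "FALSE",
}


def check_omp_env(env=None):
    """Return a list of problems found; an empty list means no mismatch.

    `env` is any mapping of variable names to strings and defaults to
    the current process environment. (Hypothetical helper.)
    """
    env = os.environ if env is None else env
    problems = []

    threads = env.get("OMP_NUM_THREADS")
    try:
        valid = threads is not None and int(threads) >= 1
    except ValueError:
        valid = False
    if not valid:
        problems.append(
            f"OMP_NUM_THREADS should be a positive integer, got {threads!r}"
        )

    # Compare case-insensitively, since OpenMP treats these values that way.
    for name, want in EXPECTED.items():
        got = env.get(name)
        if got is None or got.lower() != want.lower():
            problems.append(f"{name} should be {want!r}, got {got!r}")
    return problems


if __name__ == "__main__":
    issues = check_omp_env()
    for line in issues or ["OpenMP environment matches the settings above."]:
        print(line)
```

Running this script in the same shell as the exports, just before `python main.py`, prints one line per mismatched variable.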

Then run the benchmark:

python main.py

Performance

Example image

For more details, see: