SIMTization for Force Calculation of Lennard-Jones Potential

This project is forked from lj_simd.

Usage

NVIDIA GPU (CUDA implementations)

    $ cd cuda
    $ make

NVIDIA GPU (OpenACC implementations)

    $ cd openacc
    $ make

AMD GPU (OpenCL implementations)

    $ cd opencl
    $ make

Performance comparison of OpenACC with CUDA

OpenACC (tuned): The optimal Verlet list data layout for GPUs is used.
OpenACC (naive): Pragma directives are simply added to the original CPU source code.

@ Xeon E5-2680 v3
icpc version 18.0.0 20170811

    implementation     time [s]
    Reference          1.431335
    AVX2 SIMD          0.877171

@ Tesla K40t
CUDA version 7.5
PGI compiler version 16.10

    implementation     time [s]
    CUDA               0.049346
    OpenACC (tuned)    0.168751
    OpenACC (naive)    0.305789

@ Tesla P100
CUDA version 8.0
PGI compiler version 17.1

    implementation     time [s]
    CUDA               0.017529
    OpenACC (tuned)    0.027165
    OpenACC (naive)    0.092830
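
Example: a neighbor-list force loop with an OpenACC directive

The following is a minimal sketch of what "adding a pragma directive to CPU code" can look like for a Lennard-Jones force loop. It is not taken from this repository. The names `Vec3`, `NeighborList`, `build_neighbor_list` and `compute_forces` are hypothetical, reduced units (epsilon = sigma = 1) are assumed, and the flat offset/list neighbor layout is only one possible choice, not necessarily the layout used by the tuned implementation. Each particle accumulates its own force from a full neighbor list, so no two loop iterations write to the same element. The directive is only compiled when the compiler defines `_OPENACC`, so the code also builds as plain C++.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical types for illustration; not the repository's data structures.
struct Vec3 { double x, y, z; };

// Full neighbor list in a flat layout (an assumption of this sketch):
// the neighbors of particle i are list[offset[i]] .. list[offset[i+1]-1].
struct NeighborList {
    std::vector<int> offset;  // size N + 1
    std::vector<int> list;    // concatenated neighbor indices
};

// Brute-force O(N^2) construction; every pair appears twice (i->j and j->i).
NeighborList build_neighbor_list(const std::vector<Vec3>& q, double cutoff) {
    const int n = static_cast<int>(q.size());
    const double rc2 = cutoff * cutoff;
    NeighborList nl;
    nl.offset.assign(n + 1, 0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            const double dx = q[j].x - q[i].x;
            const double dy = q[j].y - q[i].y;
            const double dz = q[j].z - q[i].z;
            if (dx * dx + dy * dy + dz * dz < rc2) nl.list.push_back(j);
        }
        nl.offset[i + 1] = static_cast<int>(nl.list.size());
    }
    return nl;
}

// U(r) = 4 (r^-12 - r^-6), so the force on i due to j is
// F_i = (24 r^-8 - 48 r^-14) (q_j - q_i) = (24 r^6 - 48) / r^14 * (q_j - q_i).
void compute_forces(const std::vector<Vec3>& q, const NeighborList& nl,
                    double cutoff, std::vector<Vec3>& f) {
    assert(nl.offset.size() == q.size() + 1);
    const int n = static_cast<int>(q.size());
    const int m = static_cast<int>(nl.list.size());
    (void)m;  // only referenced by the OpenACC data clause below
    const double rc2 = cutoff * cutoff;
    f.assign(n, Vec3{0.0, 0.0, 0.0});

    // Raw pointers so that array sections can be named in the data clauses.
    const Vec3* qp = q.data();
    const int* offset = nl.offset.data();
    const int* list = nl.list.data();
    Vec3* fp = f.data();

#ifdef _OPENACC
#pragma acc parallel loop copyin(qp[0:n], offset[0:n + 1], list[0:m]) copyout(fp[0:n])
#endif
    for (int i = 0; i < n; ++i) {
        const Vec3 qi = qp[i];
        double fx = 0.0, fy = 0.0, fz = 0.0;
        for (int k = offset[i]; k < offset[i + 1]; ++k) {
            const int j = list[k];
            const double dx = qp[j].x - qi.x;
            const double dy = qp[j].y - qi.y;
            const double dz = qp[j].z - qi.z;
            const double r2 = dx * dx + dy * dy + dz * dz;
            if (r2 >= rc2) continue;  // outside the cutoff
            const double r6 = r2 * r2 * r2;
            const double c = (24.0 * r6 - 48.0) / (r6 * r6 * r2);
            fx += c * dx;
            fy += c * dy;
            fz += c * dz;
        }
        fp[i] = Vec3{fx, fy, fz};  // each i writes only its own entry
    }
}
```

To use the sketch, build the list once with `build_neighbor_list(q, cutoff)` and then call `compute_forces(q, nl, cutoff, f)`. A full list does twice the arithmetic of a half list that exploits Newton's third law, but it avoids concurrent updates of the same particle, which is why it is a common starting point on GPUs.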