We need tests that validate both performance and accuracy on GPUs. The current implementation uses KernelAbstractions.jl. I also recently heard about [JACC.jl](https://juliagpu.github.io/JACC.jl/stable/), which may be worth a look. We also need multi-node tests for an extended scaling run.