forked from lammps/lammps
Description
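Runs are about 2.5x slower when a triclinic simulation box is used instead of a cubic one: with the input below, 200 steps take 29.58 s (6.761 timesteps/s) in the triclinic box versus 11.96 s (16.716 timesteps/s) in the cubic box. The two cases differ only in which "region box" line is commented out.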
Command:
lmp -in lammps.in -suffix kk -k on g 1
LAMMPS input:
package kokkos newton on neigh half
units metal
boundary p p p
# create the simulation system without reading external data file
atom_style atomic/kk
lattice fcc 3.6
region box block 0 4 0 4 0 4 # cubic
# region box prism 0 4 0 4 0 4 2.0 0.0 0.0 units lattice #triclinic
create_box 85 box
mass 1 1.008
mass 2 4.002602
mass 3 6.94
mass 4 9.0121831
mass 5 10.81
mass 6 12.011
mass 7 14.007
mass 8 15.999
mass 9 18.998403163
mass 10 20.1797
mass 11 22.98976928
mass 12 24.305
mass 13 26.9815385
mass 14 28.085
mass 15 30.973761998
mass 16 32.06
mass 17 35.45
mass 18 39.948
mass 19 39.0983
mass 20 40.078
mass 21 44.955908
mass 22 47.867
mass 23 50.9415
mass 24 51.9961
mass 25 54.938044
mass 26 55.845
mass 27 58.933194
mass 28 58.6934
mass 29 63.546
mass 30 65.38
mass 31 69.723
mass 32 72.63
mass 33 74.921595
mass 34 78.971
mass 35 79.904
mass 36 83.798
mass 37 85.4678
mass 38 87.62
mass 39 88.90584
mass 40 91.224
mass 41 92.90637
mass 42 95.95
mass 43 97.90721
mass 44 101.07
mass 45 102.9055
mass 46 106.42
mass 47 107.8682
mass 48 112.414
mass 49 114.818
mass 50 118.71
mass 51 121.76
mass 52 127.6
mass 53 126.90447
mass 54 131.293
mass 55 132.90545196
mass 56 137.327
mass 57 138.90547
mass 58 140.116
mass 59 140.90766
mass 60 144.242
mass 61 144.91276
mass 62 150.36
mass 63 151.964
mass 64 157.25
mass 65 158.92535
mass 66 162.5
mass 67 164.93033
mass 68 167.259
mass 69 168.93422
mass 70 173.054
mass 71 174.9668
mass 72 178.49
mass 73 180.94788
mass 74 183.84
mass 75 186.207
mass 76 190.23
mass 77 192.217
mass 78 195.084
mass 79 196.966569
mass 80 200.592
mass 81 204.38
mass 82 207.2
mass 83 208.9804
mass 84 208.98243
mass 85 222.01758
create_atoms 28 box
# the model will automatically run on the same device as the kokkos code
pair_style metatomic/kk /users/qxu/repo/Phosphate-acid-in-water/omat.pt extensions /users/qxu/repo/Phosphate-acid-in-water/extensions_daint device cuda
pair_coeff * * 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85
# simulation settings
timestep 0.001 # 1fs timestep
# neighbor 1.0 bin
# neigh_modify one 10000
velocity all create 243 42 loop all
fix 1 all npt temp 243 243 $(100 * dt) iso 0 0 $(1000 * dt) drag 1.0
# output setup
thermo 10
run_style verlet/kk
# run the simulation for 200 steps
run 200
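For reference, the box geometry implied by this input can be worked out by hand. The sketch below does that arithmetic in Python. The constants are copied from the script; the function name `box_summary` is made up for illustration, and the conversion of the tilt to Angstrom assumes that `units lattice` scales `xy` by the x lattice spacing.

```python
# Geometry implied by the input script (values copied from the script).
LATTICE_CONSTANT = 3.6   # Angstrom, from "lattice fcc 3.6" with "units metal"
CELLS_PER_SIDE = 4       # from "region box ... 0 4 0 4 0 4"
XY_TILT_LATTICE = 2.0    # xy from the commented-out "region box prism" line
ATOMS_PER_FCC_CELL = 4   # atoms in a conventional fcc unit cell


def box_summary():
    """Return initial box dimensions for the cubic and triclinic variants."""
    edge = CELLS_PER_SIDE * LATTICE_CONSTANT
    # Assumption: with "units lattice", xy is scaled by the x lattice spacing.
    xy = XY_TILT_LATTICE * LATTICE_CONSTANT
    return {
        "edge_angstrom": edge,
        # The tilt shears the box without changing its volume.
        "volume_angstrom3": edge ** 3,
        "n_atoms": ATOMS_PER_FCC_CELL * CELLS_PER_SIDE ** 3,
        "xy_angstrom": xy,
        "tilt_ratio": xy / edge,
    }


if __name__ == "__main__":
    for key, value in box_summary().items():
        print(f"{key}: {value}")
```

The atom count matches the 256 atoms reported in both logs, and under the stated assumption the xy tilt is half the box length in x.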
Profiling of the cubic box:
PairMetatomic::compute ...
creating System from LAMMPS-kokkos data ...
converting kokkos neighbors with ghosts remapping ...
identifying ghosts and real atoms ... took 0.085789ms (rank 0)
filtering LAMMPS neighbor list ... took 13.1581ms (rank 0)
creating samples Labels (51188 pairs) ... took 0.151196ms (rank 0)
creating neighbors TensorBlock ... took 0.112573ms (rank 0)
converting kokkos neighbors with ghosts remapping took 20.7771ms (rank 0)
creating System from LAMMPS-kokkos data took 21.1419ms (rank 0)
running Model::forward ... took 20.1846ms (rank 0)
running Model::backward ... took 9.10778ms (rank 0)
storing model output in LAMMPS data structures ... took 0.096669ms (rank 0)
PairMetatomic::compute took 50.8255ms (rank 0)
200 181.82899 -1400.7064 0 -1394.7131 24441.465 2739.3901
Loop time of 11.9644 on 1 procs for 200 steps with 256 atoms
Performance: 1.444 ns/day, 16.617 hours/ns, 16.716 timesteps/s, 4.279 katom-step/s
98.8% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 11.906 | 11.906 | 11.906 | 0.0 | 99.51
Neigh | 0 | 0 | 0 | 0.0 | 0.00
Comm | 0.013182 | 0.013182 | 0.013182 | 0.0 | 0.11
Output | 0.00075473 | 0.00075473 | 0.00075473 | 0.0 | 0.01
Modify | 0.03727 | 0.03727 | 0.03727 | 0.0 | 0.31
Other | | 0.007495 | | | 0.06
Nlocal: 256 ave 256 max 256 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 25071 ave 25071 max 25071 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 0 ave 0 max 0 min
Histogram: 1 0 0 0 0 0 0 0 0 0
FullNghs: 1.62867e+06 ave 1.62867e+06 max 1.62867e+06 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 1628672
Ave neighs/atom = 6362
Neighbor list builds = 0
Dangerous builds = 0
Total wall time: 0:00:14
Profiling of the triclinic box:
PairMetatomic::compute ...
creating System from LAMMPS-kokkos data ...
converting kokkos neighbors with ghosts remapping ...
identifying ghosts and real atoms ... took 0.106973ms (rank 0)
filtering LAMMPS neighbor list ... took 18.6482ms (rank 0)
creating samples Labels (55132 pairs) ... took 0.231641ms (rank 0)
creating neighbors TensorBlock ... took 0.12438ms (rank 0)
converting kokkos neighbors with ghosts remapping took 29.9312ms (rank 0)
creating System from LAMMPS-kokkos data took 30.2864ms (rank 0)
running Model::forward ... took 21.5174ms (rank 0)
running Model::backward ... took 9.66668ms (rank 0)
storing model output in LAMMPS data structures ... took 0.084605ms (rank 0)
PairMetatomic::compute took 61.881ms (rank 0)
200 3087.1674 -1232.8273 0 -1131.0701 559756.44 2508.6368
Loop time of 29.5822 on 1 procs for 200 steps with 256 atoms
Performance: 0.584 ns/day, 41.086 hours/ns, 6.761 timesteps/s, 1.731 katom-step/s
98.3% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 29.281 | 29.281 | 29.281 | 0.0 | 98.98
Neigh | 0.21461 | 0.21461 | 0.21461 | 0.0 | 0.73
Comm | 0.040001 | 0.040001 | 0.040001 | 0.0 | 0.14
Output | 0.00085428 | 0.00085428 | 0.00085428 | 0.0 | 0.00
Modify | 0.037675 | 0.037675 | 0.037675 | 0.0 | 0.13
Other | | 0.008544 | | | 0.03
Nlocal: 256 ave 256 max 256 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 31705 ave 31705 max 31705 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 0 ave 0 max 0 min
Histogram: 1 0 0 0 0 0 0 0 0 0
FullNghs: 1.91984e+06 ave 1.91984e+06 max 1.91984e+06 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 1919842
Ave neighs/atom = 7499.3828
Neighbor list builds = 8
Dangerous builds = 0
Total wall time: 0:00:31
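To put the two logs side by side, here is a small Python sketch that only redoes arithmetic on numbers copied from the output above. The dictionary layout and the helper names (`pair_ms_per_step`, `ratio`) are made up for illustration. It assumes each quoted profile block is a single PairMetatomic::compute call, which the logs do not state explicitly.

```python
N_STEPS = 200  # from "run 200"

# Numbers copied verbatim from the two logs above.
runs = {
    "cubic": {
        "loop_s": 11.9644,      # "Loop time of ..."
        "pair_s": 11.906,       # "Pair" row of the timing breakdown
        "compute_ms": 50.8255,  # profiled PairMetatomic::compute
        "filter_ms": 13.1581,   # "filtering LAMMPS neighbor list"
        "nghost": 25071,
        "full_neighs": 1628672,
        "neigh_builds": 0,
    },
    "triclinic": {
        "loop_s": 29.5822,
        "pair_s": 29.281,
        "compute_ms": 61.881,
        "filter_ms": 18.6482,
        "nghost": 31705,
        "full_neighs": 1919842,
        "neigh_builds": 8,
    },
}


def pair_ms_per_step(run):
    """Average Pair time per step in milliseconds."""
    return run["pair_s"] / N_STEPS * 1000.0


def ratio(key):
    """Triclinic value divided by cubic value for one logged quantity."""
    return runs["triclinic"][key] / runs["cubic"][key]


if __name__ == "__main__":
    for name, run in runs.items():
        print(f"{name}: {pair_ms_per_step(run):.1f} ms/step in Pair, "
              f"{run['compute_ms']:.1f} ms in the profiled call, "
              f"{run['neigh_builds']} neighbor list builds")
    for key in ("loop_s", "compute_ms", "filter_ms", "nghost", "full_neighs"):
        print(f"{key}: triclinic/cubic = {ratio(key):.2f}")
```

Under that assumption, the profiled call grows by only about 1.2x (50.8 ms to 61.9 ms), while the average Pair time per step grows by about 2.5x (59.5 ms to 146.4 ms). The profiled call therefore does not account for most of the slowdown on its own. One visible difference between the runs is that the triclinic run performs 8 neighbor list builds and the cubic run performs none.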