Update benchmark README with GPU performance results #211

@krystophny

Problem

The benchmark README (bench/spline_many/README.md) references historical single runs but lacks current, verified performance numbers.

Current GPU Performance (RTX 5060 Ti, nvfortran 25.11)

| Spline | CPU (pts/s) | GPU OpenACC (pts/s) | Speedup |
|--------|-------------|---------------------|---------|
| 1D     | 48.7M       | 807M                | ~16×    |
| 2D     | 8.4M        | 539M                | ~64×    |
| 3D     | 1.4M        | 102M                | ~73×    |
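
The speedup column can be cross-checked from the two throughput columns. The snippet below is a minimal sketch, not part of the benchmark suite: the dictionary layout and the `speedup` helper are assumptions made for illustration, and the inputs are the rounded figures from the table. With those rounded inputs the 1D ratio comes out closer to 16.6, so the quoted ~16× looks truncated rather than rounded.

```python
# Throughput figures copied from the table above, in points per second.
# The dictionary layout is an illustrative assumption, not a repository format.
RESULTS = {
    "1D": {"cpu": 48.7e6, "gpu": 807e6},
    "2D": {"cpu": 8.4e6, "gpu": 539e6},
    "3D": {"cpu": 1.4e6, "gpu": 102e6},
}


def speedup(cpu_pts_per_s, gpu_pts_per_s):
    """Return GPU throughput divided by CPU throughput."""
    return gpu_pts_per_s / cpu_pts_per_s


# Ratio per spline dimension, computed from the rounded table values.
SPEEDUPS = {dim: speedup(r["cpu"], r["gpu"]) for dim, r in RESULTS.items()}

for dim, ratio in SPEEDUPS.items():
    print(f"{dim}: {ratio:.1f}x")
```

Because the inputs are rounded to three significant figures at most, the ratios are only meaningful to about the same precision.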

Detailed Results

1D Benchmark (order=5, num_points=2048, nq=8, npts=2M, niter=20, periodic=T):

  • CPU: 48.7M pts/s
  • OpenACC GPU: 807M pts/s

2D Benchmark (order=[5,5], num_points=[256,256], nq=8, npts=500K, niter=10, periodic=[T,T]):

  • CPU: 8.4M pts/s
  • OpenACC GPU: 539M pts/s

3D Benchmark (order=[5,5,5], num_points=[48,32,32], nq=8, npts=200K, niter=6, periodic=[T,T,T]):

  • CPU: 1.4M pts/s
  • OpenACC GPU: 102M pts/s

Proposed Changes

  1. Add a "Performance Results" section with tabular data
  2. Document test environment (GPU model, driver version, compiler version)
  3. Note that nvfortran is the recommended compiler for GPU acceleration
  4. Document the GCC16 GPU status (OpenACC GPU offloading currently fails at runtime; see #209, "Fix GCC16 OpenACC GPU offloading runtime error")
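
One way to keep the README section reproducible is to generate it from the measured numbers instead of typing the table by hand. The following is a rough sketch under stated assumptions: `render_section`, the `## Performance Results` heading level, and the layout of the environment list are all hypothetical and not taken from the repository. The driver version is not stated in this issue, so it is left as a placeholder.

```python
# Hypothetical helper for drafting the README section; not part of the repository.
ENVIRONMENT = {
    "GPU": "RTX 5060 Ti",
    "Compiler": "nvfortran 25.11",
    "Driver": "(not stated in the issue; fill in)",
}

# (label, CPU throughput in M pts/s, GPU throughput in M pts/s), from the issue.
ROWS = [("1D", 48.7, 807.0), ("2D", 8.4, 539.0), ("3D", 1.4, 102.0)]


def render_section(env, rows):
    """Build a Markdown section with an environment list and a results table."""
    lines = ["## Performance Results", ""]
    lines += [f"- {key}: {value}" for key, value in env.items()]
    lines += [
        "",
        "| Spline | CPU (pts/s) | GPU OpenACC (pts/s) | Speedup |",
        "|---|---|---|---|",
    ]
    for label, cpu, gpu in rows:
        # Speedup is recomputed from the rounded throughputs, to one decimal.
        lines.append(f"| {label} | {cpu:g}M | {gpu:g}M | {gpu / cpu:.1f}x |")
    return "\n".join(lines)


SECTION = render_section(ENVIRONMENT, ROWS)
print(SECTION)
```

Recomputing the speedup inside the generator keeps the table internally consistent whenever the throughput numbers are refreshed.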
