odiakun/lab_7_classes

Lab 7 instructions - CUDA vs CUDA Graphs

1. Clone the repository

Clone this repository into a separate directory.

The run_all.sh script runs the LU and EP benchmarks in both their CUDA and CUDA Graphs versions and collects the results.

2. Run the benchmarks

chmod +x run_all.sh
./run_all.sh

Results, including timing and profiling data, are saved in the $(pwd)/profiling_outputs/ directory.
The profiling data itself is saved to the $(pwd)/nsys_reports/ directory.
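
Before moving on, it can help to confirm that the run actually produced output. The sketch below is not part of the repository: list_outputs is a hypothetical helper that counts the files in the two directories named above. It assumes you call it from the directory where you ran run_all.sh, and it assumes nothing about the file names inside.

```shell
# Hypothetical helper (not part of the repository): report how many files
# each output directory contains, or say that the directory is missing.
list_outputs() {
    for dir in profiling_outputs nsys_reports; do
        if [ -d "$dir" ]; then
            # tr strips the padding some wc implementations add
            count=$(find "$dir" -type f | wc -l | tr -d ' ')
            echo "$dir: $count file(s)"
        else
            echo "$dir: missing"
        fi
    done
}

list_outputs
```

A "missing" line most likely means the script did not finish, or that it was started from a different working directory.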

3. Investigate the results

Compare the total execution time and the number of kernel launches for each benchmark (LU, EP) between CUDA and CUDA Graphs. You can extract this information from the .txt output or from nsys stats.

Which implementation is faster?
Which one has fewer kernel launches?
Are the performance differences consistent across benchmarks?
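
To put a number on "faster", one option is to divide the CUDA execution time by the CUDA Graphs execution time for the same benchmark. The sketch below is only an illustration: speedup is a hypothetical helper, and the timings passed to it are placeholders, not measured results. Since the layout of the .txt output is not described here, the sketch assumes you read the two times from your own files by hand.

```shell
# Hypothetical helper: speedup CUDA_SECONDS GRAPHS_SECONDS
# Prints the ratio CUDA time / CUDA Graphs time with two decimals.
# A value above 1 means the CUDA Graphs version was faster.
speedup() {
    awk -v base="$1" -v graphs="$2" 'BEGIN { printf "%.2f\n", base / graphs }'
}

# Placeholder timings only; substitute the values from your own run.
speedup 12.5 10.0
```

Computing the ratio separately for LU and EP makes the last question easier to answer: if the two ratios differ noticeably, the benefit of CUDA Graphs depends on the benchmark.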

To satisfy your curiosity

For more details about the implementation, benchmark evolution, and task descriptions, visit the original NPB-GPU repository.
