feat: Implement CUDA acceleration for Ricci observable by JorenB · Pull Request #7 · JorenB/2d-cdt

JorenB · 2025-05-26T11:31:36Z

This commit introduces a CUDA-based implementation for the average sphere distance calculation within the Ricci observable. The goal is to leverage GPU parallelism to speed up this computationally intensive part of the simulation.

Key changes include:

Makefile Updates: The Makefile now compiles .cu files with NVCC, tracks dependencies for CUDA sources, and links against the CUDA libraries.
CUDA Kernels (observables/ricci_cuda_kernels.cu, .hpp):
- Introduced pairwise_bfs_kernel, a CUDA kernel that computes
  distances between all pairs of vertices from two spheres (s1, s2).
- Each pair's distance is found using a BFS, implemented in the
  __device__ function calculate_distance_bfs_device. This BFS
  is depth-limited to 3*epsilon and uses thread-local fixed-size
  arrays for its queue and visited set to manage resources.
- A C++ wrapper function,
  RicciCUDATask::calculate_sum_and_count_distances_cuda,
  manages GPU memory allocation, data conversion (adjacency list to
  CSR format), H2D/D2H transfers, kernel launch, and cleanup.
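
The adjacency-list-to-CSR conversion mentioned above can be sketched on the host as follows. This is a minimal illustration, not the code from the PR: the `CsrGraph` struct, the `to_csr` function, and the assumption that the adjacency list is a `std::vector<std::vector<int>>` indexed by vertex are all hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for a graph in CSR form; the names are illustrative
// and not taken from the PR.
struct CsrGraph {
    std::vector<int> row_offsets;  // size = number of vertices + 1
    std::vector<int> col_indices;  // size = total number of neighbor entries
};

// Flatten an adjacency list into CSR so the graph can be copied to the GPU
// as two contiguous arrays. The neighbors of vertex v end up in
// col_indices[row_offsets[v]] .. col_indices[row_offsets[v + 1] - 1].
CsrGraph to_csr(const std::vector<std::vector<int>>& adjacency) {
    CsrGraph g;
    g.row_offsets.reserve(adjacency.size() + 1);
    g.row_offsets.push_back(0);
    for (const auto& neighbors : adjacency) {
        g.col_indices.insert(g.col_indices.end(),
                             neighbors.begin(), neighbors.end());
        g.row_offsets.push_back(static_cast<int>(g.col_indices.size()));
    }
    return g;
}
```

Two flat arrays are convenient here because each can be moved to the device with a single H2D copy, whereas a nested vector would need one copy per vertex.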
Ricci Observable Modification (observables/ricci.cpp):
- The Ricci::averageSphereDistance method now calls the
  CUDA wrapper function instead of performing the BFS calculations
  on the CPU.
- The original logic for sphere generation and the specific averaging
  formula (sum_distances / (epsilon * count_distances)) are
  preserved.
- Added safety checks for empty data structures and zero epsilon.
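
As a rough sketch of how the averaging formula and the zero-epsilon check fit together, assuming the CUDA wrapper hands back an integer sum and count: the function name `normalized_average_distance` and the choice to return 0.0 for undefined inputs are assumptions, since the PR description does not say what the guarded cases return.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the formula quoted in the PR description:
//   sum_distances / (epsilon * count_distances)
// Returning 0.0 when the result is undefined is an assumption made for this
// sketch, not documented behavior of the PR.
double normalized_average_distance(std::uint64_t sum_distances,
                                   std::uint64_t count_distances,
                                   int epsilon) {
    if (epsilon <= 0 || count_distances == 0) {
        return 0.0;  // guard against division by zero
    }
    return static_cast<double>(sum_distances) /
           (static_cast<double>(epsilon) *
            static_cast<double>(count_distances));
}
```

For example, a distance sum of 30 over 10 pairs with epsilon 2 gives 30 / (2 * 10) = 1.5.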

This implementation aims to replace the previous CPU-bound calculation with a parallel GPU version. Further testing and validation in a compiled environment are needed to verify correctness, performance, and robustness, especially concerning the fixed-size limitations in the per-thread BFS.
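
To make the fixed-size concern concrete, here is a minimal sketch of a depth-limited BFS that uses only fixed-size local arrays and reports when they fill up. It is written as plain C++ so it can be run on the host; in a CUDA build the function would carry the `__device__` qualifier. The capacity `kMaxVisited`, the sentinel values, and the function name `bfs_distance_fixed` are assumptions for illustration and are not taken from `calculate_distance_bfs_device`.

```cpp
// Illustrative capacity for the per-call queue; the PR does not state the
// sizes it uses.
constexpr int kMaxVisited = 64;

constexpr int kNotFound = -1;  // target not reachable within max_depth
constexpr int kOverflow = -2;  // fixed-size storage exhausted

// Depth-limited BFS over a CSR graph using only fixed-size local arrays.
// The queue is never compacted, so it doubles as the visited set.
int bfs_distance_fixed(const int* row_offsets, const int* col_indices,
                       int source, int target, int max_depth) {
    if (source == target) return 0;

    int queue[kMaxVisited];  // vertices in the order they were discovered
    int depth[kMaxVisited];  // BFS depth of queue[i]
    int head = 0, tail = 0;
    queue[tail] = source;
    depth[tail] = 0;
    ++tail;

    while (head < tail) {
        const int v = queue[head];
        const int d = depth[head];
        ++head;
        if (d >= max_depth) continue;  // do not expand past the depth limit

        for (int e = row_offsets[v]; e < row_offsets[v + 1]; ++e) {
            const int w = col_indices[e];
            if (w == target) return d + 1;

            // Linear scan of everything discovered so far.
            bool seen = false;
            for (int i = 0; i < tail; ++i) {
                if (queue[i] == w) { seen = true; break; }
            }
            if (seen) continue;

            if (tail == kMaxVisited) return kOverflow;  // out of space
            queue[tail] = w;
            depth[tail] = d + 1;
            ++tail;
        }
    }
    return kNotFound;
}
```

With the depth limit described above, a caller would pass `max_depth = 3 * epsilon`. Returning a distinct overflow code lets the caller tell "no path within the limit" apart from "ran out of space", which is one way to make the fixed-size limitation observable during validation.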

JorenB self-assigned this May 26, 2025

JorenB wants to merge 1 commit into main from feat/ricci-cuda-observable
