Build failures on FIDESlib 2.0.0 — LimbPartitionMGPU.cu errors and MGPU coupling in single-GPU builds #19

@lmvdz

Summary

We are attempting to build FIDESlib 2.0.0 for a single-GPU target (RTX 4070 Ti, sm_89) and hit two blocking issues: (1) LimbPartitionMGPU.cu fails to compile on both CUDA 12.4 and CUDA 13, and (2) even when that file is excluded, the single-GPU RNSPoly.cpp path has unresolved references to MGPU symbols at link time. We wanted to flag this and ask about the supported single-GPU path before going further; the library looks great and we're keen to adopt it.

Environment

  • Host: WSL2 Ubuntu 24.04
  • GPU: RTX 4070 Ti (Ada Lovelace, sm_89), FIDESLIB_ARCH=89-real
  • CUDA: tried both 13.0.88 and 12.4.131 (NVIDIA apt repo)
  • Compiler: gcc/g++-13 with CUDA 12.4 (CUDA 12.4 rejects gcc-14); gcc/g++-14 with CUDA 13
  • CMake: 4.3.1
  • OpenFHE: v1.4.2 built from source

Reproduction

Standard CMake configure + build against OpenFHE 1.4.2, FIDESLIB_ARCH=89-real, single GPU.

Expected

Clean build of libfideslib.a and fideslib-bench on a single-GPU host.

Actual

Issue 1 — LimbPartitionMGPU.cu will not compile (both CUDA 12.4 and CUDA 13)

First error, then a cascade:

  • src/CKKS/LimbPartitionMGPU.cu:1748: qualified name is not allowed on void LimbPartition::modupMGPU(...)
  • NTT_ is not a template
  • ALGO_SHOUP undefined
  • blockDimFirst undefined
  • Two identically-named cached_graph structs at lines 581 and 1751 with different fields
  • printf has already been defined
  • Hits the 100-error limit

Issue 2 — MGPU is not separable at link time

Excluding LimbPartitionMGPU.cu via list(FILTER ... EXCLUDE REGEX ...) gets libfideslib.a to ~85%, but fideslib-bench link fails with 15+ undefined references from src/CKKS/RNSPoly.cpp (the single-GPU path):

  • rescaleMGPU, modupMGPU, moddownMGPU
  • broadcastLimb0_mgpu, dotKSKfusedMGPU, modup_ksk_moddown_mgpu
  • fusedHoistRotate (MGPU overload)
  • Free symbols MEMCPY_PEER, GRAPH_CAPTURE

This suggests the MGPU path is woven into RNSPoly unconditionally, with no #ifdef FIDESLIB_ENABLE_MULTI_GPU guard around the MGPU calls.
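To make the kind of guard we have in mind concrete, here is a minimal sketch. It is not FIDESlib code: Limbs, Path, rescale, rescale_single and rescale_mgpu are hypothetical stand-ins, and FIDESLIB_ENABLE_MULTI_GPU is a flag name we are assuming, not one we found in the build. The point is only that when the flag is undefined, the dispatcher contains no reference to any MGPU symbol, so the MGPU translation unit can be left out of the link.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a polynomial's limb storage (not a FIDESlib type).
struct Limbs {
    std::vector<unsigned long long> data;
};

// Which code path handled the call; used here only to make the sketch observable.
enum class Path { SingleGpu, MultiGpu };

// Single-GPU path: always compiled. Dropping the last limb stands in for a rescale.
inline Path rescale_single(Limbs& l) {
    if (!l.data.empty()) {
        l.data.pop_back();
    }
    return Path::SingleGpu;
}

#ifdef FIDESLIB_ENABLE_MULTI_GPU
// Assumed flag name. The MGPU entry point is only declared when the flag is set;
// its definition would live in a separate MGPU-only translation unit.
Path rescale_mgpu(Limbs& l, int num_devices);
#endif

// Dispatcher: without the flag, no MGPU symbol is referenced at all,
// so a single-GPU build has nothing unresolved at link time.
inline Path rescale(Limbs& l, int num_devices) {
#ifdef FIDESLIB_ENABLE_MULTI_GPU
    if (num_devices > 1) {
        return rescale_mgpu(l, num_devices);
    }
#else
    (void)num_devices;  // unused in single-GPU builds
#endif
    return rescale_single(l);
}
```

Built without the flag, rescale(l, 4) simply falls back to the single-GPU path. If RNSPoly.cpp dispatched this way, excluding LimbPartitionMGPU.cu from the source list should be enough for a single-GPU build, though we have not verified that against the real call sites.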

Minor patches we applied to get this far

  • CMakeLists.txt: -fopenmp=libomp (clang-only) → -fopenmp for gcc
  • src/CKKS/Context.cu:758: typo uint63_t → uint64_t
  • FIDESLIB_ARCH=89-real (default list included 100-real;120-real = Blackwell, rejected by CUDA 12.4)
  • Shimmed missing nvtx3/nvtx3.hpp in CUDA 12.4 install with the CUDA 13 header

Asks

  1. Is there a supported single-GPU build path in 2.0.0 we missed?
  2. Are the LimbPartitionMGPU.cu errors a known regression from the 2.0 MGPU refactor?
  3. Any ETA on a 2.1 that decouples MGPU, or a hotfix branch we can track?
  4. Recommended commit/tag for a clean single-GPU build on 2.0 in the meantime?

Happy to share full logs or test patches. Thanks for the work on this — we want to adopt and are blocked on the build, not on the library itself.
