Build failures on FIDESlib 2.0.0 — LimbPartitionMGPU.cu errors and MGPU coupling in single-GPU builds
Summary
We are trying to build FIDESlib 2.0.0 for a single-GPU target (RTX 4070 Ti, sm_89) and hit two blocking issues: (1) LimbPartitionMGPU.cu fails to compile on both CUDA 12.4 and CUDA 13, and (2) even with that file excluded, the single-GPU RNSPoly.cpp path has unresolved references to MGPU symbols at link time. We wanted to flag this and ask about the supported single-GPU path before going further. The library looks great and we're keen to adopt it.
Environment
- Host: WSL2 Ubuntu 24.04
- GPU: RTX 4070 Ti (Ada Lovelace, sm_89), FIDESLIB_ARCH=89-real
- CUDA: tried both 13.0.88 and 12.4.131 (NVIDIA apt repo)
- Compiler: gcc/g++-13 with CUDA 12.4 (CUDA 12.4 rejects gcc-14); gcc/g++-14 with CUDA 13
- CMake: 4.3.1
- OpenFHE: v1.4.2 built from source
Reproduction
Standard CMake configure + build against OpenFHE 1.4.2, FIDESLIB_ARCH=89-real, single GPU.
Expected
Clean build of libfideslib.a and fideslib-bench on a single-GPU host.
Actual
Issue 1 — LimbPartitionMGPU.cu will not compile (both CUDA 12.4 and CUDA 13)
First error, then a cascade:
- src/CKKS/LimbPartitionMGPU.cu:1748: qualified name is not allowed on void LimbPartition::modupMGPU(...)
- NTT_ is not a template
- ALGO_SHOUP undefined
- blockDimFirst undefined
- Two identically-named cached_graph structs at lines 581 and 1751 with different fields
- printf has already been defined
- Hits the 100-error limit
Issue 2 — MGPU is not separable at link time
Excluding LimbPartitionMGPU.cu via list(FILTER ... EXCLUDE REGEX ...) gets libfideslib.a to ~85%, but fideslib-bench link fails with 15+ undefined references from src/CKKS/RNSPoly.cpp (the single-GPU path):
- rescaleMGPU, modupMGPU, moddownMGPU
- broadcastLimb0_mgpu, dotKSKfusedMGPU, modup_ksk_moddown_mgpu
- fusedHoistRotate (MGPU overload)
- Free symbols MEMCPY_PEER, GRAPH_CAPTURE
This suggests the MGPU path is woven into RNSPoly unconditionally, with no #ifdef FIDESLIB_ENABLE_MULTI_GPU guard.
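To make the ask concrete, here is a minimal sketch of the kind of compile-time guard we had in mind. It is not FIDESlib code: LimbPartitionSketch, RNSPolySketch, the rescaleCalls counter and numDevices are hypothetical stand-ins, and FIDESLIB_ENABLE_MULTI_GPU is a macro name we made up, not one we found in the tree. The point is only that a single-GPU build never names an MGPU symbol, so the linker has nothing to resolve.

```cpp
// Hypothetical sketch only: these types and the macro name are NOT FIDESlib's API.
#include <stdexcept>

struct LimbPartitionSketch {
    int rescaleCalls = 0;           // counter used only to observe the call path
    void rescale() { ++rescaleCalls; }
#ifdef FIDESLIB_ENABLE_MULTI_GPU
    void rescaleMGPU();             // would be defined in the MGPU translation unit
#endif
};

struct RNSPolySketch {
    LimbPartitionSketch part;
    int numDevices = 1;             // assumed runtime device count

    void rescale() {
#ifdef FIDESLIB_ENABLE_MULTI_GPU
        // Multi-GPU build: dispatch to the MGPU implementation when needed.
        if (numDevices > 1) { part.rescaleMGPU(); return; }
#else
        // Single-GPU build: the MGPU symbol is never referenced, so excluding
        // the MGPU source file cannot cause an undefined reference.
        if (numDevices > 1)
            throw std::logic_error("built without multi-GPU support");
#endif
        part.rescale();
    }
};
```

With the macro undefined, RNSPolySketch::rescale() compiles and links without any MGPU translation unit. Asking for more than one device fails loudly at run time instead of at link time.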
Minor patches we applied to get this far
- CMakeLists.txt: -fopenmp=libomp (clang-only) → -fopenmp for gcc
- src/CKKS/Context.cu:758: typo uint63_t → uint64_t
- FIDESLIB_ARCH=89-real (the default list included 100-real;120-real, i.e. Blackwell, which CUDA 12.4 rejects)
- Shimmed missing nvtx3/nvtx3.hpp in CUDA 12.4 install with the CUDA 13 header
Asks
- Is there a supported single-GPU build path in 2.0.0 we missed?
- Are the LimbPartitionMGPU.cu errors a known regression from the 2.0 MGPU refactor?
- Any ETA on a 2.1 that decouples MGPU, or a hotfix branch we can track?
- Recommended commit/tag for a clean single-GPU build on 2.0 in the meantime?
Happy to share full logs or test patches. Thanks for the work on this — we want to adopt and are blocked on the build, not on the library itself.