Dependency audit: downstream map, duplication report, and split proposals #258

@krystophny

libneo Dependency Audit

Full audit of libneo's internal structure, downstream consumers, code duplication, and split/cleanup options.

Reading guide

  • The diagrams below are summaries for navigation.
  • The dependency matrices are the canonical detailed source.
  • The target inventory consolidates the old internal graph, external dependency table, and usage-based analysis into one place.

Architecture Overview

1. Internal target layers

flowchart LR
    classDef shared fill:#dbeafe,stroke:#2563eb,color:#111827
    classDef internal fill:#e5e7eb,stroke:#6b7280,color:#111827
    classDef single fill:#fed7aa,stroke:#c2410c,color:#111827

    subgraph T0["Tier 0: no libneo deps"]
        H5[hdf5_tools]
        INT[interpolate]
        ODE[odeint]
        POLY[neo_polylag]
    end

    subgraph T1["Tier 1"]
        NEO[neo core]
        FIELD[neo_field]
    end

    subgraph T2["Tier 2"]
        MAG[magfie]
        SPEC[species]
    end

    subgraph T3["Tier 3"]
        COLL[collision_freqs]
    end

    subgraph T4["Tier 4"]
        TRANS[transport]
    end

    subgraph T5["Tier 5"]
        MC[mc_efit]
        EFIT[efit_to_boozer]
    end

    INT --> NEO
    ODE --> NEO
    POLY --> NEO
    H5 --> FIELD
    INT --> FIELD
    NEO --> MAG
    H5 --> MAG
    NEO --> SPEC
    SPEC --> COLL
    COLL --> TRANS
    NEO --> MC
    MAG --> MC
    NEO --> EFIT
    MAG --> EFIT

    class NEO,MAG,H5 shared
    class INT,ODE,POLY,FIELD internal
    class SPEC,COLL,TRANS,MC,EFIT single

Legend:

  • Blue: shared targets with multiple downstream consumers
  • Gray: internal building blocks or glue
  • Orange: single-consumer or internal-only targets
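
The tier labels can be cross-checked against the arrows. The following is a minimal sketch, assuming that an arrow `A --> B` means "B builds on A" (which matches the "Tier 0: no libneo deps" label); the tiers and edges are transcribed by hand from the diagram, and the helper name `layering_violations` is illustrative only.

```python
# Tier assignments and edges transcribed from the diagram above.
TIER = {
    "hdf5_tools": 0, "interpolate": 0, "odeint": 0, "neo_polylag": 0,
    "neo": 1, "neo_field": 1,
    "magfie": 2, "species": 2,
    "collision_freqs": 3,
    "transport": 4,
    "mc_efit": 5, "efit_to_boozer": 5,
}

# (dependency, dependent): an arrow "A --> B" is read as "B builds on A".
EDGES = [
    ("interpolate", "neo"), ("odeint", "neo"), ("neo_polylag", "neo"),
    ("hdf5_tools", "neo_field"), ("interpolate", "neo_field"),
    ("neo", "magfie"), ("hdf5_tools", "magfie"),
    ("neo", "species"), ("species", "collision_freqs"),
    ("collision_freqs", "transport"),
    ("neo", "mc_efit"), ("magfie", "mc_efit"),
    ("neo", "efit_to_boozer"), ("magfie", "efit_to_boozer"),
]


def layering_violations(tier, edges):
    """Return edges whose dependency is not in a strictly lower tier."""
    return [(dep, user) for dep, user in edges if tier[dep] >= tier[user]]


violations = layering_violations(TIER, EDGES)
print(violations)
```

An empty list means every arrow points from a lower tier to a strictly higher one, so the layering drawn above is at least internally consistent.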

2. Fortran downstream surface

flowchart LR
    classDef api fill:#dbeafe,stroke:#2563eb,color:#111827
    classDef downstream fill:#f3f4f6,stroke:#6b7280,color:#111827

    subgraph API["Fortran-facing libneo surfaces"]
        NEO[neo core]
        MAG[magfie]
        H5[hdf5_tools]
        SPEC[species]
        COLL[collision_freqs]
        TRANS[transport]
    end

    subgraph DOWN["Fortran downstream"]
        SIMPLE[SIMPLE]
        NEO2[NEO-2]
        KAMEL[KAMEL]
        MEPHIT[MEPHIT]
        TIAGO[tiago]
        NEORT[NEO-RT via NEO-2]
        BENCH[benchmark_orbit]
        SPARSE[sparse_draft]
    end

    NEO --> SIMPLE
    NEO --> NEO2
    NEO --> MEPHIT
    NEO --> TIAGO
    NEO --> BENCH
    MAG --> SIMPLE
    MAG --> NEO2
    MAG --> KAMEL
    MAG --> MEPHIT
    MAG --> BENCH
    H5 --> NEO2
    H5 --> MEPHIT
    H5 --> BENCH
    H5 --> SPARSE
    SPEC --> KAMEL
    COLL --> KAMEL
    TRANS --> KAMEL
    NEO2 --> NEORT

    class NEO,MAG,H5,SPEC,COLL,TRANS api
    class SIMPLE,NEO2,KAMEL,MEPHIT,TIAGO,NEORT,BENCH,SPARSE downstream

3. Python downstream surface

flowchart LR
    classDef api fill:#dcfce7,stroke:#16a34a,color:#111827
    classDef downstream fill:#f3f4f6,stroke:#6b7280,color:#111827

    subgraph API["Python-facing libneo surfaces"]
        EQ[eqdsk / read_eqdsk]
        FLUX[FluxConverter]
        BOOZ[BoozerFile / boozer]
        COILS[coils / CoilsFile]
        CHART[chartmap]
        VMEC[vmec_utils / vmec]
        MGRID[mgrid]
    end

    subgraph DOWN["Python downstream"]
        SIMPLE[SIMPLE]
        NEO2[NEO-2]
        KAMEL[KAMEL]
        NEORT[NEO-RT]
        GECKO[GECKO]
        NEOART[neoart_benchmark]
        RABE[rabe]
        DATA["$DATA scripts"]
    end

    CHART --> SIMPLE
    EQ --> NEO2
    FLUX --> NEO2
    BOOZ --> NEO2
    BOOZ --> KAMEL
    BOOZ --> NEORT
    COILS --> GECKO
    EQ --> NEOART
    FLUX --> NEOART
    BOOZ --> RABE
    EQ --> DATA
    FLUX --> DATA
    BOOZ --> DATA
    COILS --> DATA
    VMEC --> DATA
    MGRID --> DATA

    class EQ,FLUX,BOOZ,COILS,CHART,VMEC,MGRID api
    class SIMPLE,NEO2,KAMEL,NEORT,GECKO,NEOART,RABE,DATA downstream

Downstream Dependency Matrices

Fortran dependencies

| Downstream | neo | magfie | hdf5_tools | Linkage |
| --- | --- | --- | --- | --- |
| SIMPLE | libneo_coordinates, libneo_kinds | magfie_sub, magfie_can_boozer_sub | -- | FetchContent |
| NEO-2 | LIBNEO::neo | LIBNEO::magfie | LIBNEO::hdf5_tools | FetchContent |
| KAMEL | KIM: --; QL-Balance: LIBNEO::neo | KIM: field_sub (via kamel_equil); QL-Balance: -- | -- | FetchContent |
| MEPHIT | LIBNEO::neo | LIBNEO::magfie | LIBNEO::hdf5_tools | FetchContent |
| tiago | LIBNEO::neo | -- | -- | FetchContent |
| NEO-RT | (transitive via NEO-2) | (transitive) | (transitive) | Indirect |
| benchmark_orbit | pre-built .a | pre-built .a | pre-built .a | Static archive |
| sparse_draft | -- | -- | raw file download (BROKEN) | curl |

Python dependencies

| Downstream | eqdsk | FluxConverter | BoozerFile | coils | chartmap | vmec_utils | Other |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SIMPLE | -- | -- | -- | -- | write_chartmap_from_vmec_boundary | -- | -- |
| NEO-2 | read_eqdsk | FluxConverter | BoozerFile (docs) | -- | -- | -- | -- |
| KAMEL | -- | -- | BoozerFile, efit_to_boozer, get_boozer_* | -- | -- | -- | -- |
| NEO-RT | -- | -- | BoozerFile | -- | -- | -- | -- |
| GECKO | -- | -- | -- | CoilsFile, Filament | -- | -- | -- |
| neoart_benchmark | read_eqdsk | FluxConverter | -- | -- | -- | -- | -- |
| rabe | -- | -- | BoozerFile, write_boozer_* | -- | -- | -- | -- |
| $DATA scripts (37 files) | read_eqdsk (21x) | FluxConverter (16x) | BoozerFile (15x) | CoilsFile (3x) | -- | eqdsk2vmec (2x) | mgrid (1x) |

Target Inventory And Disposition

Clarification from downstream re-checks:

  • Within the KAMEL repo, KIM does not use the species / collision_freqs / transport chain, but QL-Balance does.
  • Current GORILLA and GORILLA_APPLETS snapshots do not use this libneo collision chain.

This table replaces the old internal dependency graph, external dependency table, and usage-based summary.

| Target or group | Internal layer | External deps | Downstream consumers | Decision |
| --- | --- | --- | --- | --- |
| neo core (libneo_kinds, math_constants, coordinates, VMEC support) | Tier 1 | BLAS, LAPACK, NetCDF, HDF5 via hdf5_tools | SIMPLE, NEO-2, KAMEL, MEPHIT, tiago; benchmark_orbit via pre-built archive | Keep in libneo |
| magfie | Tier 2 | BLAS, LAPACK, NetCDF, FFTW, HDF5 via hdf5_tools | SIMPLE, NEO-2, KAMEL, MEPHIT; benchmark_orbit via pre-built archive | Keep in libneo |
| hdf5_tools | Tier 0 | HDF5 | NEO-2, MEPHIT, KAMEL (vendored), benchmark_orbit (pre-built archive), sparse_draft (broken raw download) | Keep in libneo (all codes need HDF5/NetCDF anyway) |
| interpolate | Tier 0 | none | internal dependency only | Keep in libneo |
| odeint | Tier 0 | OpenMP | internal dependency only | Keep in libneo |
| neo_polylag | Tier 0 | none | internal dependency only | Keep in libneo |
| neo_field | Tier 1 | via hdf5_tools, interpolate | no downstream consumers identified | Keep in libneo |
| efit_to_boozer | Tier 5 | via neo and magfie | NEO-2 (Python bindings), KAMEL (Python) | Keep in libneo |
| MyMPILib | standalone | MPI | NEO-2 only | Move to NEO-2 (#265, itpplasma/NEO-2#68) |
| species | Tier 2 | none | QL-Balance in the KAMEL repo; internal dependency of collision_freqs | Keep in libneo; group as physics sub-library (#266) |
| collision_freqs | Tier 3 | GSL | QL-Balance in the KAMEL repo; internal dependency of transport | Keep in libneo; group as physics sub-library (#266) |
| transport | Tier 4 | none | QL-Balance in the KAMEL repo | Keep in libneo; group as physics sub-library (#266) |
| mc_efit | Tier 5 | via neo and magfie | no external consumers; only used internally by efit_to_boozer executables | Keep in libneo |
| contrib/minpack | orphaned | none | none | Removed (#262) |

Python import frequency

| Symbol or module | Uses |
| --- | --- |
| read_eqdsk | 21 |
| FluxConverter | 16 |
| BoozerFile | 15 |
| CoilsFile | 3 |
| chartmap | 2 |
| vmec_utils / eqdsk2vmec | 2 |
| VMECGeometry | 1 |
| mgrid | 1 |
| stl_boundary | 1 |
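
A tally like the one above can be reproduced mechanically. The sketch below is one possible approach using Python's standard `ast` module; it is not the method used for this audit. The script names and import paths (`from libneo import ...`, `from libneo.eqdsk import ...`) are hypothetical placeholders, and the helper counts a symbol once per file, which is an assumption about how "uses" is defined here.

```python
import ast
from collections import Counter

# Hypothetical downstream scripts, inlined as strings so the sketch is self-contained.
SCRIPTS = {
    "plot_profiles.py": "from libneo import read_eqdsk, FluxConverter\n",
    "convert_boozer.py": "from libneo import BoozerFile\nimport numpy as np\n",
    "check_eqdsk.py": "from libneo.eqdsk import read_eqdsk\n",
}


def count_libneo_symbols(sources, package="libneo"):
    """Count how many files import each symbol from `package` or its submodules."""
    counts = Counter()
    for text in sources.values():
        seen = set()
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.ImportFrom) and node.module:
                if node.module == package or node.module.startswith(package + "."):
                    seen.update(alias.name for alias in node.names)
        counts.update(seen)  # one count per file, not per occurrence
    return counts


counts = count_libneo_symbols(SCRIPTS)
for name, uses in counts.most_common():
    print(name, uses)
```

Parsing with `ast` rather than grepping avoids counting names that only appear in comments or strings, at the cost of skipping files that do not parse.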

Code Duplication Report

Summary

| Project | Vendored files | Symlinks | Headline |
| --- | --- | --- | --- |
| KAMEL | 6 | 0 | All copies diverged from libneo |
| NEO-RT | 5 | 21 | Mostly symlinks; 5 real historical copies |
| MEPHIT | 3 | 0 | One trivial copy, one heavily diverged copy, one orphaned copy |
| sparse_draft | 1 | 0 | Broken raw download of hdf5_tools |

KAMEL details

| File in KAMEL | File in libneo | Status |
| --- | --- | --- |
| common/hdf5_tools/hdf5_tools.f90 | src/hdf5_tools/hdf5_tools.F90 | DIVERGED: module renamed to KAMEL_hdf5_tools, 275 lines shorter, missing newer features (h5_add_int_3, h5_get_bounds_3, h5_add_string_1, etc.), added own h5_add_float_1 and h5_append_double_1 |
| common/hdf5_tools/hdf5_tools_f2003.f90 | src/hdf5_tools/hdf5_tools_f2003.f90 | TRIVIAL DIFF: only use hdf5_tools -> use KAMEL_hdf5_tools |
| KiLCA/solver/VER_5_STABLE/plag_coeff.f90 | src/plag_coeff.f90 | PRE-MODERNIZATION: standalone subroutine, double precision instead of real(dp) |
| KIM/src/util/plag_coeff.f90 | src/plag_coeff.f90 | PRE-MODERNIZATION: same as above |
| QL-Balance/src/base/plag_coeff.f90 | src/plag_coeff.f90 | HEAVILY MODIFIED: wrapped in PolyLagrangeInterpolation module with KAMEL-specific variables |
| KiLCA/solver/VER_5_STABLE/binsrc.f90 | src/binsrc.f90 | PRE-MODERNIZATION: standalone subroutine, double precision |

KAMEL repo note: KIM itself does not import libneo_species, libneo_collisions, or libneo_transport. The downstream consumer there is QL-Balance, which links LIBNEO::neo, LIBNEO::species, LIBNEO::collision_freqs, and LIBNEO::transport.

NEO-RT details

Most POTATO/SRC/ files are symlinks to libneo (resolved via FetchContent). Only these are actual copies:

| File in NEO-RT | File in libneo | Status |
| --- | --- | --- |
| POTATO/SRC/binsrc.f90 | src/binsrc.f90 | PRE-MODERNIZATION |
| POTATO/SRC/plag_coeff.f90 | src/plag_coeff.f90 | PRE-MODERNIZATION |
| POTATO/BOOZER_TO_EFIT/SRC/binsrc.f90 | src/binsrc.f90 | PRE-MODERNIZATION |
| POTATO/BOOZER_TO_EFIT/SRC/plag_coeff.f90 | src/plag_coeff.f90 | PRE-MODERNIZATION |
| POTATO/BOOZER_TO_EFIT/SRC/spl_three_to_five_mod.f90 | src/spl_three_to_five.f90 | DIVERGED: module renamed, uses double precision, non-recursive, extra variables (710 vs 690 lines) |

MEPHIT details

| File in MEPHIT | File in libneo | Status |
| --- | --- | --- |
| src/fftw3.f90 | src/fftw3.F90 | TRIVIAL: include 'fftw3.f03' vs #include "fftw3.f03" |
| src/preload_for_SYNCH.f90 | src/efit_to_boozer/preload_for_SYNCH.f90 | HEAVILY DIVERGED: changed from program to subroutine, uses different modules, different API |
| src/field_line_integration_for_SYNCH.f90 | (removed from libneo) | ORPHANED: original was removed from libneo; this is a historical vendored copy |

sparse_draft details

| File | Status |
| --- | --- |
| cmake/FetchHDF5Tools.cmake | Downloads hdf5_tools.f90 from raw.githubusercontent.com/.../src/hdf5_tools/hdf5_tools.f90 and now returns 404 because the file was renamed to .F90. The pinned MD5 also matches a commit many versions behind. Build is broken. |

Common pattern: most vendored copies are pre-modernization versions using double precision instead of real(dp), standalone subroutines instead of modules, and missing intent declarations. Downstream projects never updated their copies.
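
Status labels like the ones in the tables above can be approximated with a line-level diff. The sketch below uses Python's standard `difflib` and is only an illustration. The threshold `trivial_limit`, the label set, and the toy file contents (including the module name `hdf5_tools_f2003`) are assumptions, not the procedure used to produce this report.

```python
from difflib import SequenceMatcher


def line_changes(vendored: str, upstream: str):
    """Return (lines only in vendored, lines only in upstream)."""
    a, b = vendored.splitlines(), upstream.splitlines()
    removed = added = 0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


def classify(vendored: str, upstream: str, trivial_limit: int = 2):
    """Coarse label; `trivial_limit` is an arbitrary illustrative threshold."""
    removed, added = line_changes(vendored, upstream)
    if removed == added == 0:
        return "IDENTICAL"
    if max(removed, added) <= trivial_limit:
        return "TRIVIAL DIFF"
    return "DIVERGED"


# Toy stand-ins for the hdf5_tools_f2003.f90 case: only the `use` line differs.
upstream = "module hdf5_tools_f2003\n  use hdf5_tools\nend module hdf5_tools_f2003\n"
vendored = upstream.replace("use hdf5_tools", "use KAMEL_hdf5_tools")

print(classify(vendored, upstream))
```

A raw line count cannot tell a module rename from a semantic change, so anything beyond the trivial bucket still needs manual review.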

Decisions (2025-03-30 meeting)

Overall approach: keep monorepo, clean up, modularize internally

The team decided against splitting libneo into separate repositories. Instead:

  1. Keep everything in one repo with clean internal modularity
  2. Make sub-libraries independently buildable as CMake targets (e.g., LIBNEO::kinetics for the physics chain)
  3. Downstream codes adopt shared implementations from libneo instead of maintaining their own copies
  4. Whether to split into separate repos later is deferred -- first get the internal structure clean

Build system: CMake only, remove fpm

CMake is the sole build system. Remove fpm.toml from the repo (#263). fpm lacks multi-target support, system dependency detection, Python integration, and conditional compilation. It can be re-added if fpm matures. The file remains in git history.

Python: optional, stays in same repo

  • Fortran must build without Python at configure time (#264, "Make Python optional in the build system")
  • Python stays in the same repo but is optional
  • pip install -e . links against the already-built Fortran shared library (no second build)
  • Editable install picks up rebuilt shared libraries automatically
  • If Fortran is not yet built, pip install builds it as part of the process
  • This becomes the model for all our packages

Dead code removal

Physics modules: keep in libneo as a sub-library

species, collision_freqs, transport stay in libneo (#266). They form a natural hierarchy and logical unit. Downstream codes should adopt these shared implementations:

hdf5_tools: keep in libneo, but HDF5 Fortran becomes optional

hdf5_tools stays in libneo. However, the HDF5 Fortran bindings are not available on every system: NetCDF-4 pulls in the HDF5 C library but not necessarily the HDF5 Fortran bindings. The build must not fail hard when HDF5 Fortran is missing (#268).

When HDF5 Fortran is not found, libneo builds a reduced version without hdf5_tools and the HDF5-dependent components. Sub-targets that do not need HDF5 (such as odeint, interpolate, and neo_field once #267 removes its hdf5_tools link) build normally.

Sub-packages with minimal dependencies

Each CMake sub-target only requires its own direct dependencies (#269). Downstream codes using FetchContent on individual sub-targets (e.g., LIBNEO::odeint) must not need to satisfy dependencies of unrelated parts (e.g., HDF5, FFTW, GSL). The aggregate LIBNEO::neo target continues to pull everything together when building the full library.
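
To make the "only its own direct dependencies" requirement concrete, the sketch below computes which external packages a target would need as the union over its transitive internal dependencies. It covers only a small subset of targets; the direct external packages come from the target inventory and the internal links from the internal target diagram. The function name `required_packages` and the post-#267 graph are illustrative assumptions.

```python
# Direct external packages per target, taken from the "External deps" column
# of the target inventory (indirect "via ..." entries are left to the graph).
DIRECT_EXTERNAL = {
    "hdf5_tools": {"HDF5"},
    "interpolate": set(),
    "odeint": {"OpenMP"},
    "neo_polylag": set(),
    "neo_field": set(),
}

# Internal links as drawn in the internal target diagram.
INTERNAL = {
    "hdf5_tools": [],
    "interpolate": [],
    "odeint": [],
    "neo_polylag": [],
    "neo_field": ["hdf5_tools", "interpolate"],
}


def required_packages(target, internal, direct):
    """Union of external packages over a target and its transitive internal deps."""
    needed, stack, seen = set(), [target], set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        needed |= direct[name]
        stack.extend(internal[name])
    return needed


print(required_packages("odeint", INTERNAL, DIRECT_EXTERNAL))
print(required_packages("neo_field", INTERNAL, DIRECT_EXTERNAL))

# Hypothetical graph after #267 removes the hdf5_tools link from neo_field.
after_267 = dict(INTERNAL, neo_field=["interpolate", "neo_polylag"])
print(required_packages("neo_field", after_267, DIRECT_EXTERNAL))
```

Under these assumptions, odeint needs only OpenMP, while neo_field inherits HDF5 through hdf5_tools until that link is removed. This closure is what a FetchContent consumer of a single sub-target should have to satisfy.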

Python packages: all stay in libneo

eqdsk, FluxConverter, BoozerFile, coils, chartmap, vmec_utils, mgrid all stay. They are widely used across downstream codes and $DATA scripts. chartmap in particular is a general real-space Cartesian coordinate mapping tool and will also serve as the new Boozer file format.

Code duplication in downstream repos

Downstream projects should replace vendored copies with libneo's shared implementations via FetchContent. Tracked per-repo:

Tracked Issues

libneo

Downstream

Detailed Execution Plan (2026-03-30)

This section turns the decisions above into an implementation order with dependencies, scope boundaries, and issue links.

Issue Index

Meta and build-system issues

Related historical / context issues

Downstream adoption / follow-up issues

Chosen Direction

We are following the monorepo cleanup path from #258: keep one repository, make the internal CMake target graph clean, make Python optional, and only reconsider repo splits later if the cleaned target boundaries justify it.

Concretely, this means:

  • do not split libneo into multiple repos now
  • do not keep two first-class build systems; CMake is the only canonical build system per #263
  • make each CMake target require only its own direct dependencies per #269
  • make Python an optional consumer of the Fortran build per #264

Execution Order

Phase 1: Remove the remaining dead / misplaced pieces

  1. Finish the MyMPILib removal wave documented in #265, #207, #259, and itpplasma/NEO-2#66.
    Expected result:

    • no extra/MyMPILib dependency remains in libneo
    • no libneo test or target still assumes that MyMPILib lives in this repo
  2. Remove src/field/jorek_field.f90 from libneo per #267.
    Expected result:

    • neo_field no longer links to hdf5_tools
    • issue #76 can be closed as no longer applicable in libneo
  3. Keep contrib/minpack removed per #262 and ensure no stale references remain in docs, CI, or local helper scripts.

Why first:

  • this shrinks the graph before we try to make dependencies optional
  • #267 is the cleanest enabler for #268 and #269

Phase 2: Fix the CMake target graph

This is the main structural work and should be treated as one coordinated build-system wave across #266, #268, and #269.

  1. Make each sub-target discover only its own external dependencies.
    Examples from #269:

    • odeint should only require OpenMP when that target is built
    • interpolate should require no unrelated packages
    • neo_polylag should require no unrelated packages
    • neo_field should require only interpolate and neo_polylag after #267
    • species, collision_freqs, and transport should expose their own direct dependency chain
    • neo and magfie remain aggregate targets with the heavier dependency surface
  2. Make HDF5 Fortran optional per #268.
    Expected result:

    • no configure-time hard failure if HDF5 Fortran is absent
    • hdf5_tools and HDF5-dependent features are skipped cleanly when unavailable
    • reduced builds still succeed for users that only need odeint, interpolate, neo_polylag, neo_field, or other non-HDF5 parts
  3. Introduce an explicit physics grouping target per #266.
    Expected result:

    • keep species, collision_freqs, and transport as individual targets
    • add a grouped target such as LIBNEO::kinetics or LIBNEO::physics
    • make that grouped target independently consumable by downstream users
  4. Preserve one full aggregate build.
    Even after #268 and #269, the top-level full build should still produce the familiar complete libneo when all dependencies are available.

Why before Python:

  • #264 becomes much simpler once the Fortran graph is no longer monolithic
  • FetchContent consumers benefit immediately, independent of Python packaging

Phase 3: Remove the competing build story

Handle #263 as soon as the target graph cleanup is underway or merged.

Scope:

  • remove fpm.toml
  • remove or rewrite docs that imply fpm is supported for normal development or CI
  • make it unambiguous that CMake is the only supported build system

Why here:

  • doing this before the graph cleanup is possible, but it is mostly messaging
  • the real technical blocker is still the CMake target structure

Phase 4: Decouple Python from the core Fortran build

Implement #264 after the Fortran graph is fixed.

Required behavior from #264:

  • Fortran-only configure/build must not require Python
  • Python bindings must not be built from the top-level Fortran configure by default
  • editable installs should link against an existing shared libneo build instead of forcing a second full CMake build
  • if the Fortran library is not already built, pip install may build it, but that is the fallback path rather than the default dev loop

Related context to preserve while doing this work:

  • #88: efit_to_boozer should remain localized and not force tight coupling back into the main library build
  • #100: eliminate the current ambiguity between make/CMake and pip install -e .
  • #127: package builds should no longer rely on stray local artifacts; if a Python-facing Fortran binding is part of the package, it must be built or intentionally omitted in a clean, explicit way

Concrete deliverables for #264:

  • remove unconditional find_package(Python ...) from top-level Fortran configure paths
  • move Python-only wrapper generation and Python-only tests behind an explicit Python build path
  • separate pure Fortran CTest from Python/integration tests so that Python is not a hidden requirement of the main library build
  • ensure the editable Python install sees rebuilt shared libraries immediately
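
The "reuse an existing build, otherwise fall back" behavior can be sketched as a small lookup step. This is a hypothetical illustration, not the packaging logic planned in #264: the library file names, the `build` directory layout, and the helpers `find_prebuilt_library` and `plan_install` are all assumptions.

```python
import tempfile
from pathlib import Path

# Hypothetical file names; the real library name and build layout may differ.
LIB_NAMES = ("libneo.so", "libneo.dylib")


def find_prebuilt_library(search_dirs, names=LIB_NAMES):
    """Return the first existing shared library in `search_dirs`, else None."""
    for directory in search_dirs:
        for name in names:
            candidate = Path(directory) / name
            if candidate.is_file():
                return candidate
    return None


def plan_install(search_dirs):
    """Decide between reusing an existing build and the fallback full build."""
    lib = find_prebuilt_library(search_dirs)
    return ("reuse", lib) if lib else ("build", None)


# Demonstration with a temporary directory standing in for a CMake build tree.
with tempfile.TemporaryDirectory() as tmp:
    build_dir = Path(tmp) / "build"
    build_dir.mkdir()
    before = plan_install([build_dir])
    (build_dir / "libneo.so").touch()
    after = plan_install([build_dir])

print(before[0], after[0])
```

Resolving the library path at install or import time, rather than copying the binary into the package, is one way an editable install could see a rebuilt library without reinstalling.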

Phase 5: Downstream adoption and vendored-copy cleanup

Once the target graph and Python story are fixed, use the new structure to clean up downstream consumers.

Tracked downstream follow-up issues:

Goals:

  • replace vendored copies with shared libneo targets where practical
  • update FetchContent use so downstream projects only pull the targets they actually need
  • reduce duplicated binsrc, plag_coeff, hdf5_tools, and related historical forks over time

Scope Boundaries

In scope for the current cleanup wave

Explicitly not in scope for this wave

  • splitting libneo into multiple repositories now; that remains deferred by #258
  • a broad rewrite of downstream repos before libneo's own targets are cleaned up
  • reintroducing fpm as a supported parallel build system after #263

Suggested PR Breakdown

  1. PR A: #267 and close #76
  2. PR B: #268 plus the dependency-guard work from #269
  3. PR C: #266 physics grouping target
  4. PR D: #263 remove fpm.toml and clean docs
  5. PR E: #264 Python decoupling, including test separation and editable-install behavior
  6. PR F: downstream follow-up against itpplasma/KAMEL#125, itpplasma/NEO-2#68, itpplasma/SIMPLE#335, and itpplasma/GORILLA#31

Open Item Without Dedicated libneo Issue Yet

The sparse_draft raw-download breakage for hdf5_tools.F90 is tracked in this meta issue body but does not currently have its own dedicated libneo issue link. If that work is expected to be handled inside libneo rather than downstream, create a dedicated issue and link it here.
