feat(sbom): symlink-aware SBOM filesystem graph (fs_tree) #459
willis89pr wants to merge 229 commits into main
- Add `fs_tree: nx.DiGraph` to `SBOM`, excluded from JSON serialization
- Populate `fs_tree` in SBOM constructor via `_add_software_to_fs_tree`, splitting each `installPath` into parent–child edges and tagging leaf nodes with `software_uuid`
- Introduce `SBOM._record_symlink(link, target, subtype)` to record symlink edges in both:
  - the main relationship graph (`MultiDiGraph`) with `type="symlink"`
  - the filesystem graph (`fs_tree`) with `type="symlink"` and optional `subtype` ("file" or "directory")
- Enhance `add_software_entries()` to scan each `installPath` and its immediate children for symlinks, invoking `_record_symlink` for both file- and directory-level symlinks
- Update `generate.py` to inject filename- and install-path symlinks into each `Software` entry before adding to SBOM, so they're captured by `add_software_entries()`
- Refactor `elf_relationship` plugin to:
  - Prefer `fs_tree`-based `get_software_by_path()` lookups for ELF dependencies
  - Fall back to legacy `installPath` matching, then a directory-based symlink heuristic
  - Emit detailed `logger.debug()` statements (via Loguru) indicating which resolution path was used
  - Improve docstrings around RPATH/RUNPATH, DST substitution, and relationship phases
  - Expand DST-handling helpers (`generate_search_paths`, `generate_runpaths`, `substitute_all_dst`) with clearer comments, normalization, and debug traces
- Update `.NET` relationship plugin to use `get_software_by_path` for absolute imports and cleaned-up probing logic
- Add comprehensive unit tests:
  - `tests/sbomtypes/test_fs_tree.py` to verify `fs_tree` population and `get_software_by_path`
  - `tests/relationships/test_elf_relationship.py` covering absolute, relative, system, origin, RPATH, and symlink heuristics
- Minor cleanup: prevent `fs_tree` from being serialized and remove unused whitespace
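The `fs_tree` population described in the list above can be sketched as follows. This is a minimal illustration only: it assumes a plain-dict stand-in for `nx.DiGraph` so it runs with the standard library, and `FsTreeSketch` and its attributes are hypothetical names rather than the PR's actual code.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Optional, Set


class FsTreeSketch:
    """Hypothetical dict-based stand-in for the nx.DiGraph-backed fs_tree."""

    def __init__(self) -> None:
        self.children: Dict[str, Set[str]] = {}  # parent path -> child paths
        self.software_uuid: Dict[str, str] = {}  # leaf path -> software UUID

    def add_software(self, uuid: str, install_paths: List[str]) -> None:
        for raw in install_paths:
            path = PurePosixPath(raw.replace("\\", "/"))
            # Ancestors ordered from the root down, followed by the leaf itself.
            chain = [p.as_posix() for p in list(path.parents)[::-1]]
            chain.append(path.as_posix())
            # One parent -> child edge per consecutive pair in the chain.
            for parent, child in zip(chain, chain[1:]):
                self.children.setdefault(parent, set()).add(child)
            # Tag only the leaf node with the owning software UUID.
            self.software_uuid[path.as_posix()] = uuid

    def get_software_by_path(self, raw: str) -> Optional[str]:
        key = PurePosixPath(raw.replace("\\", "/")).as_posix()
        return self.software_uuid.get(key)


tree = FsTreeSketch()
tree.add_software("uuid-1", ["/usr/lib/libfoo.so"])
print(tree.get_software_by_path("/usr/lib/libfoo.so"))
```

Only leaf nodes carry a UUID in this sketch, so a lookup on an intermediate directory such as `/usr/lib` returns `None`.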
for more information, see https://pre-commit.ci
Add `# pylint: disable=redefined-outer-name` to the top of:
- tests/relationships/test_elf_relationship.py
- tests/sbomtypes/test_fs_tree.py
This silences warnings about pytest fixtures shadowing outer-scope names.
- Documented `_add_software_to_fs_tree` method with explanation of behavior, arguments, and side effects
- Enhanced safety: ensure final install path node exists before tagging
- Normalized install paths to POSIX format for consistency
- Added type hints for clarity
- No logic changes to other methods; only added minor inline comments and spacing
…ationship
- Introduced `normalize_path` utility in `surfactant.utils.paths` to standardize path handling across components.
- Replaced all raw `PurePosixPath` and `PureWindowsPath` calls with `normalize_path` in:
  - `SBOM` class (`_sbom.py`): install path processing, software lookup, and symlink handling.
  - `dotnet_relationship.py`: resolving absolute paths for dependency resolution.
- Added new utility module `utils.paths` and test suite `test_paths.py` to verify path normalization behavior across various cases.
- Removed redundant single-argument shortcut that bypassed normalization.
- Updated `normalize_path()` to explicitly replace backslashes in all path parts.
- Ensures consistent POSIX-style output for inputs like "C:\\Program Files\\App".
- Fixes test failures caused by improper handling of Windows-style paths.
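The normalization behavior described in this commit can be sketched as below. This is an assumption-laden illustration, not the code in `surfactant/utils/paths.py`: `normalize_path_sketch` is a hypothetical name, and the real helper may treat edge cases differently.

```python
from pathlib import PurePosixPath


def normalize_path_sketch(*parts: str) -> str:
    """Join path parts and return a POSIX-style string (hypothetical helper)."""
    # Replace backslashes in every part before joining, so Windows-style
    # inputs never bypass normalization.
    cleaned = [str(part).replace("\\", "/") for part in parts]
    # PurePosixPath collapses redundant separators and drops trailing slashes.
    return PurePosixPath(*cleaned).as_posix()


print(normalize_path_sketch("C:\\Program Files\\App"))
print(normalize_path_sketch("/usr", "lib", "libc.so.6"))
```

Routing the single-argument case through the same code path is what keeps lookup keys consistent between `installPath` values and candidate paths.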
…tion
- Replaced all manual `.as_posix()` conversions with `normalize_path(...)` to ensure consistent POSIX-style lookup keys.
- Normalized candidate paths used in `sbom.get_software_by_path()` during .NET relationship resolution.
- Updated codeBase path resolution to use structured path objects instead of prematurely stringifying.
- Refactored `get_dotnet_probedirs()` to normalize all output paths and avoid path handling inconsistencies.
- Added docstring to `get_dotnet_probedirs()` for clarity.
Fixes failing .NET relationship tests caused by inconsistent path formats in `installPath` vs lookup paths.
🧪 SBOM Results (16/16)
… heuristic test
- In `example_sbom` fixture, record a symlink from `/opt/alt/lib/libalias.so` to `/opt/alt/lib/libreal.so` for `sw8` to exercise the symlink handling logic
- Add a new parametrized test `test_symlink_heuristic_match_edge` that clears existing fs_tree entries and verifies that the heuristic correctly matches symlinked dependencies when no direct matches exist
…d suppress pylint protected-access warning
…ionally removing fs_tree edge and node
Updated `test_symlink_heuristic_match_edge` to defensively check for the existence of the symlink edge and node in `fs_tree` before attempting to remove them. This avoids `KeyError` raised by NetworkX when the edge does not exist, ensuring the test remains stable even if the graph structure changes upstream.
Improves test resilience and correctness by explicitly targeting the intended symlink edge (`/opt/alt/lib/libalias.so` → `/opt/alt/lib/libreal.so`).
…k and logging
- Updated `get_windows_pe_dependencies()` to use a modern three-phase resolution strategy:
  1. Primary: Exact path match using `sbom.get_software_by_path()` (fs_tree)
  2. Secondary: Legacy string-based matching on `installPath` and `fileName`
  3. Tertiary: Heuristic fallback using shared directories and `fileName` match
- Replaced `find_installed_software()` usage with normalized path lookups.
- Introduced detailed `loguru.debug()` logging to trace each match attempt and outcome.
- Enhanced `establish_relationships()` with structured import phase handling and debug output.
- Improved `has_required_fields()` using a cleaner `any(...)` check with docstring and type hint.
- Added full docstrings to clarify purpose and logic for maintainability.
These changes bring PE relationship handling in line with ELF and .NET plugins, ensuring consistency, improved symlink resolution, and better match accuracy across Windows-style paths.
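The three-phase strategy listed in this commit might look roughly like the sketch below. Everything here is illustrative: `Entry`, `resolve_import`, and the `path_index` dict are hypothetical stand-ins (the dict plays the role of `sbom.get_software_by_path()`), and the sketch ignores the case-insensitive matching that PE paths need.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    """Hypothetical stand-in for a Software entry."""
    uuid: str
    install_paths: List[str] = field(default_factory=list)
    file_names: List[str] = field(default_factory=list)


def resolve_import(
    name: str,
    probe_dirs: List[str],
    path_index: Dict[str, str],
    entries: List[Entry],
) -> Tuple[Optional[str], str]:
    """Return (uuid, phase); (None, "no match") when nothing resolves."""
    candidates = [(PurePosixPath(d) / name).as_posix() for d in probe_dirs]

    # Phase 1: exact path lookup against the filesystem index.
    for cand in candidates:
        if cand in path_index:
            return path_index[cand], "fs_tree"

    # Phase 2: legacy match on fileName plus a full installPath.
    for e in entries:
        if name in e.file_names and any(p in candidates for p in e.install_paths):
            return e.uuid, "legacy"

    # Phase 3: heuristic, same fileName installed in one of the probed directories.
    for e in entries:
        dirs = {PurePosixPath(p).parent.as_posix() for p in e.install_paths}
        if name in e.file_names and dirs & set(probe_dirs):
            return e.uuid, "heuristic"

    return None, "no match"


index = {"C:/app/a.dll": "u1"}
entries = [Entry("u2", ["C:/app/b.dll"], ["b.dll"])]
print(resolve_import("b.dll", ["C:/app"], index, entries))
```

Returning the phase name alongside the UUID makes it straightforward to log which path produced a match, as the debug logging in this commit does.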
- Introduced test suite for `pe_relationship.py` covering:
  - Primary resolution via `fs_tree` using `get_software_by_path()`
  - Legacy fallback using `installPath` + `fileName` matching
  - Heuristic fallback using same-directory + filename pattern
  - Negative test case for unmatched DLLs
  - Unit test for `has_required_fields()` utility function
- Includes thorough docstrings and inline comments for clarity and maintainability.
- Ensures consistent behavior with ELF/.NET plugin resolution logic.
File added: tests/relationships/test_pe_relationship.py
…d tests
- Replaced legacy class-based resolution with dynamic 3-phase import matching:
  1. Exact path resolution via sbom.get_software_by_path() (fs_tree)
  2. Legacy fallback via installPath + fileName match
  3. Heuristic fallback via shared directory and filename
- Removed static _ExportDict and global class-to-UUID mapping
- Added detailed logging and comments for maintainability
- Introduced helper `class_to_path()` for FQCN to class file path
test:
- Added pytest suite covering all resolution phases:
  - fs_tree match
  - legacy installPath fallback
  - heuristic directory-based fallback
  - negative case with no match
New file: tests/relationships/test_java_relationship.py
@nightlark Relationships:
…tware; update tests
- Make PE Phase 2 a true legacy fallback by delegating to windows_utils.find_installed_software()
- Remove self-edge suppression to match legacy relationship emission behavior
- Clarify PE resolver docstrings around fs_tree symlink traversal and legacy probing
- Update tests to force Phase 2 execution and add directory-case mismatch regression
- Remove dotnet_relationship_legacy module
…M-scoped
- Tighten has_required_fields() to only accept dict metadata containing "javaClasses" to avoid type errors when non-dict metadata objects are present.
- Add SBOM-scoped caching for the export→supplier lookup table using a weakref to the SBOM instance, preventing cross-run/test state leakage while avoiding repeated rebuilds.
- Refactor Phase 2 fallback to mirror legacy export-dict behavior more closely by iterating javaClasses → javaImports directly (instead of building a set of imports).
- Leave Phase 1 (fs_tree path resolution) as an explicit TODO placeholder and document the intended resolution order (fs_tree first, legacy fallback second).
- Improve debug logging around legacy fallback resolution and final relationship emission.
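The SBOM-scoped caching idea from this commit can be illustrated with a short sketch. It assumes a class-level cache keyed on a weak reference to the SBOM instance; `SbomStub`, `ExportCacheSketch`, and the `exports` attribute are hypothetical names, and the real `_ExportDict` may be organized differently.

```python
import weakref
from typing import Dict, List, Optional


class SbomStub:
    """Hypothetical stand-in for the SBOM class; it only needs to be weak-referenceable."""

    def __init__(self, exports: Dict[str, List[str]]) -> None:
        self.exports = exports


class ExportCacheSketch:
    _sbom_ref: Optional["weakref.ref[SbomStub]"] = None
    _table: Dict[str, List[str]] = {}
    rebuilds = 0  # counts how often the table was rebuilt

    @classmethod
    def table_for(cls, sbom: SbomStub) -> Dict[str, List[str]]:
        # Dereference the weakref; a dead or missing reference yields None.
        cached = cls._sbom_ref() if cls._sbom_ref is not None else None
        if cached is not sbom:
            # Different (or garbage-collected) SBOM: rebuild and re-scope the cache.
            cls._table = {name: list(uuids) for name, uuids in sbom.exports.items()}
            cls._sbom_ref = weakref.ref(sbom)
            cls.rebuilds += 1
        return cls._table


first = SbomStub({"com.example.Foo": ["u1"]})
ExportCacheSketch.table_for(first)
ExportCacheSketch.table_for(first)  # served from cache, no rebuild
print(ExportCacheSketch.rebuilds)
```

Comparing identities through the weak reference, rather than storing the SBOM itself, avoids keeping an old SBOM alive and prevents state from one run or test leaking into the next.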
Annotate _ExportDict._sbom_ref as Optional[weakref.ref[SBOM]] so pylint recognizes the weakref as callable when checking the cached SBOM instance.
- Suppress pylint false-positive on weakref call in SBOM-scoped export cache
- Fix legacy debug tag to use [Java] instead of [PE]
- Remove unused class_to_path helper (Phase 1 placeholder remains)
…se 2 test
Remove unused java_class_path/test_sbom fixtures and the Phase 1 fs_tree test (Phase 1 intentionally not implemented). Rename the Phase 2 test to reflect the legacy export-dict fallback behavior.
    3. **Gathered Filename Aliases:** additional names from `sw.fileName`
       that were injected during the gather phase but are not canonical
       basenames of the install paths (e.g., bash-completion stubs like
       "runuser" for "su").
Not sure when this last class would ever occur. If "runuser" is a small script just calling "su", then the hash won't match "su" and it will get its own "runuser" software entry (ideally at some point recognizing that it executes "su" so a "Runs" relationship can be created).
    primary_basenames = {PurePosixPath(p).name for p in (sw.installPath or [])}
    file_name_extras = set(sw.fileName or []) - primary_basenames
    if file_name_extras:
        file_symlinks |= file_name_extras
        logger.debug(
            f"[fs_tree] Added gathered filename aliases for {sw.UUID}: {sorted(file_name_extras)}"
        )
iirc the way we handle things with adding install paths a file name entry for the basename will always be added as well, so I think this case would ideally never be reached.
    # Skip path/symlink edges during merge as well
    if str(rel_type).lower() == "symlink":
        continue
    if sbom_m.graph.nodes.get(src, {}).get("type") == "Path":
Should we make all of the node types lower case for consistency ("path" and "hash" instead of "Path" and "Hash")?
Pull request overview
This PR introduces a symlink-aware filesystem graph (fs_tree) inside the SBOM model and updates multiple relationship plugins to resolve dependencies via path-based lookups (with symlink traversal) before falling back to legacy matching. It also adds path normalization utilities and a broad set of new/updated tests around filesystem graph behavior and relationship resolution.
Changes:
- Add SBOM `fs_tree` (directory hierarchy + symlink/hash edges) plus lookup/recording helpers (`get_software_by_path`, `record_symlink`, pending symlink expansion, legacy symlink metadata injection).
- Update relationship plugins (.NET/ELF/PE/Java) to prefer `fs_tree` lookups (with logging and fallbacks), and update merge/generate flows for path/symlink handling.
- Add new path utilities (`normalize_path`, `basename_posix`) and new tests validating path normalization, `fs_tree` population/lookup, and relationship resolution.
Reviewed changes
Copilot reviewed 18 out of 19 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| surfactant/sbomtypes/_sbom.py | Adds fs_tree, symlink/hash recording + traversal helpers, pending symlink expansion, and filters filesystem edges from serialized relationships. |
| surfactant/cmd/generate.py | Records symlinks/hashes during crawl and injects legacy symlink metadata derived from fs_tree. |
| surfactant/cmd/merge.py | Filters out Path nodes from root computation/system relationship attachment. |
| surfactant/utils/paths.py | Adds path normalization/basename helpers used across plugins and SBOM graph code. |
| surfactant/relationships/dotnet_relationship.py | Moves to fs_tree-first probing with legacy fallbacks; adds structured debug logging. |
| surfactant/relationships/elf_relationship.py | Adds fs_tree-first matching and clearer runpath/default-path logic with debug logging. |
| surfactant/relationships/pe_relationship.py | Adds fs_tree-first resolution (case-insensitive) with legacy fallback and debug logging. |
| surfactant/relationships/java_relationship.py | Makes export-dict caching SBOM-aware (weakref) and adds structured logging (fs_tree phase still TODO). |
| surfactant/relationships/_internal/windows_utils.py | Adds shared .NET probe-dir construction helper (get_dotnet_probedirs). |
| surfactant/output/cytrics_writer.py | Adds debug log when writing SBOM output. |
| tests/sbomtypes/test_fs_tree.py | New tests validating fs_tree construction, lookup, symlink traversal, and serialization filtering. |
| tests/utils/test_paths.py | New tests for normalize_path behavior and edge cases. |
| tests/relationships/test_dotnet_relationship.py | New .NET relationship tests covering multiple resolution paths. |
| tests/relationships/test_elf_relationship.py | New ELF relationship tests covering multiple scenarios (currently includes debug prints). |
| tests/relationships/test_pe_relationship.py | New PE relationship tests for fs_tree + fallbacks. |
| tests/relationships/test_java_relationship.py | New Java relationship tests for legacy export matching. |
| tests/symlink/test_resolve_links.py | Removes old symlink-resolution test (superseded by fs_tree behavior/tests). |
| tests/relationships/test_java.py | Removes old Java relationship test (replaced by new java_relationship tests). |
| .gitignore | Minor formatting change. |
    # Debug prints are helpful during bring-up, but can be noisy in CI.
    # Keep them for now; if logs are cluttered, consider replacing with logger.debug or removing.
    print(f"==== RUNNING: {label} ====")
    sbom, case_map = example_sbom

    # Retrieve the consumer under test and the expected supplier UUID
    sw, expected_uuid = case_map[label]

    # Pull the ELF metadata for this software (may include elfDependencies, elfRunpath/Rpath, etc.)
    metadata = sw.metadata[0] if sw.metadata else {}
    print("Dependency paths:", metadata.get("elfDependencies", []))
    print("fs_tree nodes:", list(sbom.fs_tree.nodes))

    # Optional trace: show how raw dependency strings normalize to POSIX and what fs_tree returns
    for dep in metadata.get("elfDependencies", []):
        norm = pathlib.PurePosixPath(dep).as_posix()
        print(f"Trying lookup: {norm} ->", sbom.get_software_by_path(norm))
This test includes multiple print(...) statements (and comments suggesting to keep them) which will pollute CI output and make failures harder to read. Please remove these prints or convert them to logger.debug (or pytest's caplog) so test output stays clean by default.
    def test_trailing_slash_is_preserved():
        """Strip trailing slashes from non-root POSIX paths."""
        assert normalize_path("C:/App/") == "C:/App"  # PosixPath strips trailing slashes
The test name test_trailing_slash_is_preserved contradicts the assertion (normalize_path("C:/App/") == "C:/App"), i.e., the trailing slash is not preserved. Rename the test (and/or adjust the docstring) so the name reflects the intended behavior being asserted.
    # Symlink capture under each installPath ---
    for raw in sw.installPath or []:
        p = pathlib.Path(raw)

        # If the installPath itself is a symlink (file or dir)
        if p.is_symlink():
            real = p.resolve()
            subtype = "file" if not p.is_dir() else "directory"
            logger.debug(f"Found installPath symlink: {p} → {real} (subtype={subtype})")
            # Call the helper to record this symlink in fs_tree
            self._record_symlink(str(p), str(real), subtype=subtype)

        # If it's a directory, scan immediate children for symlinks
        if p.is_dir():
            for child in p.iterdir():
                if child.is_symlink():
                    real = child.resolve()
                    subtype = "file" if not child.is_dir() else "directory"
                    logger.debug(
                        f"Found child symlink: {child} → {real} (subtype={subtype})"
                    )
                    self._record_symlink(str(child), str(real), subtype=subtype)
SBOM.add_software_entries() is calling pathlib.Path(raw).is_symlink()/is_dir()/iterdir() on values from Software.installPath. installPath represents the intended install location, not necessarily a real path on the machine running Surfactant, so this will (a) produce nondeterministic results depending on host filesystem contents, (b) potentially scan outside the extraction root, and (c) slow down SBOM construction/merge. Symlink discovery should be driven by the crawl/extractor inputs (e.g., the symlink maps built during generate and recorded via record_symlink) or by paths relative to the extraction directory, not by probing host paths in SBOM model code.
Suggested change:
    - # Symlink capture under each installPath ---
    - for raw in sw.installPath or []:
    -     p = pathlib.Path(raw)
    -     # If the installPath itself is a symlink (file or dir)
    -     if p.is_symlink():
    -         real = p.resolve()
    -         subtype = "file" if not p.is_dir() else "directory"
    -         logger.debug(f"Found installPath symlink: {p} → {real} (subtype={subtype})")
    -         # Call the helper to record this symlink in fs_tree
    -         self._record_symlink(str(p), str(real), subtype=subtype)
    -     # If it's a directory, scan immediate children for symlinks
    -     if p.is_dir():
    -         for child in p.iterdir():
    -             if child.is_symlink():
    -                 real = child.resolve()
    -                 subtype = "file" if not child.is_dir() else "directory"
    -                 logger.debug(
    -                     f"Found child symlink: {child} → {real} (subtype={subtype})"
    -                 )
    -                 self._record_symlink(str(child), str(real), subtype=subtype)
    + # Symlink capture based on installPath is intentionally *not* performed here.
    + # `installPath` represents the intended installation location and may not
    + # correspond to real paths on the host running Surfactant. Probing the host
    + # filesystem here would lead to nondeterministic behavior and potential
    + # traversal outside the extraction root.
    + #
    + # Symlink information should instead be recorded during crawl/extraction
    + # (e.g., via `record_symlink`) and expanded using the dedicated methods
    + # such as `expand_pending_dir_symlinks` and `expand_pending_file_symlinks`.
    # Initialize fs_tree
    self.fs_tree = nx.DiGraph()

    # Populate from installPaths (if present)
    for sw in self.software:
        self._add_software_to_fs_tree(sw)
__post_init__() currently only populates fs_tree from Software.installPath nodes. Because fs_tree (and symlink edges) are excluded from serialization, SBOMs loaded from JSON will not have any of the symlink edges/synthetic alias paths needed for get_software_by_path() symlink traversal (especially for directory-symlink-expanded child paths). If the intent is for fs_tree lookups to work post-deserialization/merge, __post_init__ needs a rebuild step that reconstructs symlink/hash equivalence from persisted data (e.g., installPathSymlinks metadata and/or existing installPath aliases) without probing the host filesystem.
Suggested change:
    -     # Initialize fs_tree
    -     self.fs_tree = nx.DiGraph()
    -     # Populate from installPaths (if present)
    -     for sw in self.software:
    -         self._add_software_to_fs_tree(sw)
    +     # Initialize and rebuild the filesystem tree from any persisted metadata
    +     self._rebuild_fs_tree_from_metadata()
    +
    + def _rebuild_fs_tree_from_metadata(self) -> None:
    +     """
    +     Rebuild the in-memory filesystem tree (fs_tree) from persisted SBOM metadata.
    +     This method:
    +     - Initializes a fresh NetworkX DiGraph for fs_tree.
    +     - Re-adds all software install paths via _add_software_to_fs_tree.
    +     - Reconstructs symlink and hash-equivalence relationships from any
    +       available metadata on Software instances (e.g., installPathSymlinks),
    +       without probing the host filesystem.
    +     """
    +     # Always start from a clean graph to avoid mixing prior state with
    +     # deserialized data.
    +     self.fs_tree = nx.DiGraph()
    +     for sw in self.software:
    +         # Primary install paths
    +         self._add_software_to_fs_tree(sw)
    +         # Optional symlink metadata: expected to be a mapping of
    +         # {symlink_path: target_path}. We use getattr to avoid assuming that
    +         # all Software instances provide this attribute.
    +         symlinks = getattr(sw, "installPathSymlinks", None)
    +         if not symlinks or not isinstance(symlinks, dict):
    +             continue
    +         for link_path, target_path in symlinks.items():
    +             if not link_path or not target_path:
    +                 continue
    +             # Normalize to a consistent POSIX-like form
    +             norm_link = normalize_path(link_path)
    +             norm_target = normalize_path(target_path)
    +             # Ensure both the symlink path and its target path have their
    +             # directory structure reflected in fs_tree.
    +             for norm in (norm_link, norm_target):
    +                 parts = pathlib.PurePosixPath(norm).parts
    +                 for i in range(1, len(parts)):
    +                     parent = normalize_path(*parts[:i])
    +                     child = normalize_path(*parts[: i + 1])
    +                     self.fs_tree.add_edge(parent, child)
    +                 if not self.fs_tree.has_node(norm):
    +                     self.fs_tree.add_node(norm)
    +             # Mark the symlink relationship explicitly. Direction is from
    +             # the symlink path to the real target path.
    +             self.fs_tree.add_edge(norm_link, norm_target, is_symlink=True)
    +             # Associate the symlink node with the software UUID (for parity
    +             # with primary install paths).
    +             self.fs_tree.nodes[norm_link]["software_uuid"] = sw.UUID
    +             # Ensure hash-equivalence works for the symlink path as well.
    +             if getattr(sw, "sha256", None):
    +                 try:
    +                     self.record_hash_node(norm_link, sw.sha256)
    +                 except Exception as e:  # pylint: disable=broad-exception-caught
    +                     logger.warning(
    +                         f"[fs_tree] Failed to attach hash edge for symlink {norm_link}: {e}"
    +                     )
    # dotnetAssemblyRef must present, otherwise we have no info on .NET imports
    """
    Check whether the metadata includes .NET assembly references.
    """
has_required_fields() does "dotnetAssemblyRef" in metadata without checking that metadata is a dict; if a plugin emits non-dict metadata entries (or None), this will raise and prevent relationship processing. Align this with other relationship plugins by guarding with isinstance(metadata, dict) before key checks.
Suggested change:
    - """
    + """
    + if not isinstance(metadata, dict):
    +     return False
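Put together, the guarded check suggested in this review comment could read as in the sketch below. The function name `has_required_fields_sketch` is hypothetical, and the plugin's real signature and docstring may differ.

```python
from typing import Any


def has_required_fields_sketch(metadata: Any) -> bool:
    """Check whether the metadata includes .NET assembly references."""
    # Non-dict metadata entries (including None) cannot carry the key,
    # so reject them before the membership test.
    if not isinstance(metadata, dict):
        return False
    return "dotnetAssemblyRef" in metadata


print(has_required_fields_sketch({"dotnetAssemblyRef": []}))
print(has_required_fields_sketch(None))
```

The `isinstance` guard matters for inputs such as a list, where `in` would succeed for the wrong reason, and for `None`, where it would raise `TypeError`.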
Replace Unicode arrows (→, ↔), bullet points (•), smart quotes (' " "),
ellipsis (…), dashes (—, ‐), and other non-ASCII characters with their
ASCII equivalents to ensure compatibility with automated documentation
generation tools.
Addresses feedback from @nightlark in PR #459.
Co-Authored-By: Claude (claude-sonnet-4.5) <noreply@anthropic.com>
Remove "SurfActant plugin:" prefix from establish_relationships() docstrings in dotnet_relationship and pe_relationship modules, as the plugin nature is already clear from context. Addresses feedback from @nightlark in PR #459 (r2784541293). Co-Authored-By: Claude (claude-sonnet-4.5) <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Symlink-aware SBOM filesystem graph; fs_tree lookup across relationship plugins; path utils & tests
Summary
This PR makes relationship resolution symlink-aware and more accurate by introducing a first-class filesystem graph (`fs_tree`) inside the SBOM model and teaching the .NET/ELF/PE/Java plugins to resolve dependencies via exact path lookups before falling back to legacy heuristics. It also adds small ergonomics (path utils), targeted logging, safer error handling, and a comprehensive test suite.
Motivation
Encoding the install tree and symlink edges in a graph lets us: (1) resolve by canonical path, (2) follow links deterministically, and (3) avoid spurious edges.
What changed
1) SBOM model: new `fs_tree` and helper APIs
- Add `fs_tree: nx.DiGraph` tracking directory hierarchy and symlink edges (`type="symlink"`, optional `subtype="file|directory"`).
- New path and lookup helpers:
  - `_add_software_to_fs_tree()` builds path hierarchy and tags nodes with `software_uuid`.
  - `get_software_by_path()` normalizes paths and resolves entries via `fs_tree` with symlink traversal.
  - `get_symlink_sources_for_path()` performs reverse traversal to find all symlinks pointing to a given target.
  - `record_symlink()`, `_add_symlink_edge()`, and `expand_pending_dir_symlinks()`/`expand_pending_file_symlinks()` handle immediate and deferred symlink creation.
  - `record_hash_node()` and `get_hash_equivalents()` track content-equivalent files via SHA-256 nodes.
  - `inject_symlink_metadata()` regenerates legacy-style `fileNameSymlinks` and `installPathSymlinks` fields from the graph.
- Extend `add_software_entries` to merge duplicates, discover symlinks, attach `Contains` edges, and link identical hashes.
- Split graph builders: `build_rel_graph()` for logical relationships; `fs_tree` is kept separate; filter out `Path`/`symlink` edges from `to_dict_override()`.
- Added docstrings and safety checks across all new helpers.
2) `generate.py`: symlink capture during crawl
- … `Software` entries before adding them.
- … `inject_symlink_metadata()`.
3) Relationship plugins
.NET (`dotnet_relationship.py`):
- … `normalize_path`.
- … `sbom.get_software_by_path`, handle `app.config` (`probing.privatePath`, `<codeBase href=...>`), unmanaged imports via `dotnetImplMap`.
ELF (`elf_relationship.py`):
- … `DF_1_NODEFLIB`.
- … `$ORIGIN` and `$LIB`.
- … `"[ELF][final] {dependent} Uses {name} → UUID={match} [phase]"` or `"... → no match"`.
- … `DF_1_NODEFLIB` check and explicitly logs default search paths when the flag is not set.
- … `print` with `logger.debug` for expanded runpaths.
PE (`pe_relationship.py`):
Java (`java_relationship.py`):
4) Merge and graph hygiene
- … `Path` nodes in root computation, merges, and relationship output.
5) New path utilities
- `surfactant/utils/paths.py`
  - `normalize_path(*parts) → str` ensures consistent POSIX normalization across Windows/Unix.
  - `basename_posix(path) → str`.
6) Tests
.NET:
- … `fs_tree.add_node()`.
- … `test_dotnet_culture_subdir` docstring (filtering only).
- … `test_dotnet_heuristic_match` for Phase 3.
ELF:
- … `_record_symlink` to public API; expanded docstrings for clarity.
- … `uuid-4-consumer`) to exercise system-path fallback; fixture also demonstrates alias mapping for `libalias.so`.
- … `test_symlink_heuristic_match_edge` to force heuristic after clearing direct symlink edge.
PE:
Java:
- Added `tests/utils/test_paths.py` for `normalize_path`.
- Added `tests/sbomtypes/test_fs_tree.py` to validate `fs_tree` population and lookup.
Risk & compatibility
- `fs_tree` is internal (non-serialized).
Performance considerations
Logging & DX
- `logger.debug` traces for resolution phases.
SBOM model note (post-deserialization)
- … `__post_init__` so `fs_tree` lookups work consistently across load/merge workflows. This scans each `installPath` (and directory children) to re-register symlinks and hashes into both graphs.
Test plan
    pytest -q tests/relationships/test_dotnet_relationship.py
    pytest -q tests/relationships/test_elf_relationship.py
    pytest -q tests/relationships/test_java_relationship.py
    pytest -q tests/relationships/test_pe_relationship.py
    pytest -q tests/utils/test_paths.py
    pytest -q tests/sbomtypes/test_fs_tree.py
    pytest -q  # full suite