enhancement(license-detection): cache PositionSet on LicenseMatch to avoid redundant BitSet construction in filter passes #660

@abraemer

Summary

LicenseMatch stores a PositionSpan (a lightweight range/discrete enum), but the corresponding PositionSet (a BitSet with O(1) membership tests) is constructed on demand via qspan_set() / ispan_set() every time containment or overlap is checked in a filter pass. When the same match is compared against many others during merge/overlap filtering, its BitSet is rebuilt repeatedly.

Proposal

Either cache the PositionSet on LicenseMatch after first construction, or construct all position sets once at the start of a filter pass and pass them through.
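A minimal sketch of the first option (cache after first construction), assuming a single-threaded filter pass. The types below are simplified, hypothetical stand-ins for the real PositionSpan, PositionSet and LicenseMatch, and the `builds` counter exists only to make the caching observable:

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical stand-in for PositionSpan: a half-open range or discrete positions.
#[derive(Debug, Clone)]
pub enum PositionSpan {
    Range { start: usize, end: usize },
    Discrete(Vec<usize>),
}

/// Hypothetical stand-in for PositionSet: a growable bitset, one bit per position.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PositionSet {
    words: Vec<u64>,
}

impl PositionSet {
    /// O(n) construction; this is the cost the cache avoids repeating.
    pub fn from_span(span: &PositionSpan) -> Self {
        let mut set = PositionSet { words: Vec::new() };
        match span {
            PositionSpan::Range { start, end } => (*start..*end).for_each(|p| set.insert(p)),
            PositionSpan::Discrete(positions) => positions.iter().for_each(|&p| set.insert(p)),
        }
        set
    }

    fn insert(&mut self, pos: usize) {
        let word = pos / 64;
        if word >= self.words.len() {
            self.words.resize(word + 1, 0);
        }
        self.words[word] |= 1u64 << (pos % 64);
    }

    /// O(1) membership test.
    pub fn contains(&self, pos: usize) -> bool {
        self.words
            .get(pos / 64)
            .map_or(false, |w| (w & (1u64 << (pos % 64))) != 0)
    }

    /// True if every position in `other` is also in `self`.
    pub fn is_superset(&self, other: &PositionSet) -> bool {
        other.words.iter().enumerate().all(|(i, &w)| {
            let mine = self.words.get(i).copied().unwrap_or(0);
            (w & !mine) == 0
        })
    }
}

/// Hypothetical, reduced LicenseMatch carrying only the query-side span.
pub struct LicenseMatch {
    qspan: PositionSpan,
    qspan_cache: OnceCell<PositionSet>, // filled on first qspan_set() call
    builds: Cell<usize>,                // demo-only instrumentation
}

impl LicenseMatch {
    pub fn new(qspan: PositionSpan) -> Self {
        LicenseMatch { qspan, qspan_cache: OnceCell::new(), builds: Cell::new(0) }
    }

    /// Builds the bitset on the first call and returns the cached one afterwards.
    pub fn qspan_set(&self) -> &PositionSet {
        self.qspan_cache.get_or_init(|| {
            self.builds.set(self.builds.get() + 1);
            PositionSet::from_span(&self.qspan)
        })
    }

    /// Number of times the bitset was actually constructed.
    pub fn builds(&self) -> usize {
        self.builds.get()
    }
}
```

`OnceCell` is not `Sync`; if matches are shared across threads, `std::sync::OnceLock` would be the analogous choice. Any code path that mutates the span after the cache is filled would also need to reset the cache, which this sketch does not handle.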

Benefits

  • Eliminates redundant BitSet construction in hot filter loops
  • BitSet's O(1) membership tests only pay off when the construction cost is amortized over many queries
  • Particularly impactful for the merge and overlap filter stages which do many pairwise comparisons
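The second option from the proposal (build every position set once at the start of a pass) could look roughly like the sketch below. It is illustrative only: `LicenseMatch` is reduced to a hypothetical struct holding query positions, the bitset is a plain `Vec<u64>`, and `filter_contained` is an assumed name for a pass that drops matches strictly contained in another match.

```rust
/// Hypothetical, reduced match: only the query-side positions matter here.
pub struct LicenseMatch {
    pub qspan: Vec<usize>,
}

/// Build a bitset (one bit per position) from a list of positions.
fn build_bitset(positions: &[usize]) -> Vec<u64> {
    let mut words: Vec<u64> = Vec::new();
    for &pos in positions {
        let idx = pos / 64;
        if idx >= words.len() {
            words.resize(idx + 1, 0);
        }
        words[idx] |= 1u64 << (pos % 64);
    }
    words
}

/// True if every bit set in `inner` is also set in `outer`.
fn is_subset(inner: &[u64], outer: &[u64]) -> bool {
    inner
        .iter()
        .enumerate()
        .all(|(i, &w)| (w & !outer.get(i).copied().unwrap_or(0)) == 0)
}

/// Number of positions in the set.
fn count_bits(words: &[u64]) -> u32 {
    words.iter().map(|w| w.count_ones()).sum()
}

/// Returns the indices of matches to keep, dropping any match whose
/// positions are strictly contained in another match.
pub fn filter_contained(matches: &[LicenseMatch]) -> Vec<usize> {
    // One bitset and one size per match, built once for the whole pass.
    let sets: Vec<Vec<u64>> = matches.iter().map(|m| build_bitset(&m.qspan)).collect();
    let sizes: Vec<u32> = sets.iter().map(|s| count_bits(s)).collect();

    (0..matches.len())
        .filter(|&i| {
            // The pairwise loop only reads the precomputed sets.
            !(0..matches.len())
                .any(|j| j != i && sizes[j] > sizes[i] && is_subset(&sets[i], &sets[j]))
        })
        .collect()
}
```

With n matches this does n bitset constructions per pass instead of up to one per pairwise comparison, and it keeps `LicenseMatch` itself free of cache state, at the cost of threading the precomputed sets through the pass.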

When

When profiling shows filter-pass time is significant, or when the filter pipeline is next restructured.

Labels: enhancement (New feature or request)
