Releases: OpsiClear/gsmod

v0.1.7

10 Dec 16:44

Changed

  • Style Consistency Improvements
    • Unified inplace defaults: All methods now default to inplace=True for consistency
      • GSTensorPro methods (35 methods)
      • Filter.call(), FilterGPU.call(), FilterProcessor.apply() protocol
      • Exception: compose_with_transforms() remains inplace=False (composition utility)
    • Parameter naming unification: factor -> scale in scale methods, value -> gamma in adjust_gamma
    • Dict/list comprehensions used for cleaner code in GSTensorPro (apply_color_preset, compute_histogram)
  • TransformValues Non-Uniform Scale Support
    • TransformValues.scale now stores 3-tuple (sx, sy, sz) for per-axis scaling
    • from_scale() accepts both uniform (float) and per-axis (tuple) scales
    • from_matrix() extracts per-axis scales from column norms
    • is_neutral() properly checks 3-tuple scale against [1.0, 1.0, 1.0]
  • Rotation/Scale Center Point Support
    • New center parameter on TransformValues for rotation/scale center
    • Factory methods from_rotation_euler(), from_rotation_axis_angle(), from_scale() accept center
    • to_matrix() correctly applies T(center) @ SR @ T(-center) transformation
    • LearnableTransform correctly handles center in forward pass
  • Transform Pipeline Enhancements
    • New Transform.from_srt() factory for standard Scale-Rotate-Translate order
    • New rotate_euler_deg() method for degree-based rotation
    • Improved docstrings with center point usage examples
  • Docstring Format Standardized
    • All docstrings now use Sphinx reST format (:param:, :returns:) consistently
    • Removed Google-style Args/Returns in favor of Sphinx style
  • TransformConfig/LearnableTransform Updates
    • TransformConfig.scale now supports float or 3-tuple
    • LearnableTransform handles center point for rotation/scale around arbitrary points
  • Numba Kernel Optimizations
    • Filter kernels refactored with dict/list comprehensions
    • Color kernels use cleaner iteration patterns
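
The column-norm extraction noted for from_matrix() can be illustrated with a small stand-alone sketch. It is not gsmod's implementation: the helper name scales_from_matrix is hypothetical, and it assumes the 3x3 linear block has the form R @ diag(sx, sy, sz) with no shear, so that each column norm equals one per-axis scale.

```python
import math

def scales_from_matrix(m):
    """Return (sx, sy, sz) as the Euclidean norms of the columns of a 3x3 matrix.

    Hypothetical helper; assumes m = R @ diag(sx, sy, sz) with R a pure rotation.
    """
    return tuple(
        math.sqrt(sum(m[row][col] ** 2 for row in range(3)))
        for col in range(3)
    )

# 90-degree rotation about Z, combined with per-axis scales (2, 3, 4).
rot_z = [[0.0, -1.0, 0.0],
         [1.0,  0.0, 0.0],
         [0.0,  0.0, 1.0]]
scale = (2.0, 3.0, 4.0)

# R @ diag(s) multiplies column c of R by s[c].
m = [[rot_z[r][c] * scale[c] for c in range(3)] for r in range(3)]

sx, sy, sz = scales_from_matrix(m)
```

Because a rotation preserves column lengths, the norms recover the scales regardless of the rotation applied.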

Fixed

  • LearnableTransform now correctly applies SRT order with center support
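
As a rough illustration of what T(center) @ SR @ T(-center) means for a single point: shift the point so the center sits at the origin, apply the scale/rotation block, then shift back. The function below is a hypothetical sketch, not the library's code, and it uses a uniform scale so the result does not depend on how scale and rotation are ordered inside the block.

```python
import math

def transform_about_center(point, linear, center):
    """Apply p' = center + linear @ (point - center) for a 3x3 `linear` block."""
    d = [point[i] - center[i] for i in range(3)]
    return tuple(
        center[r] + sum(linear[r][c] * d[c] for c in range(3))
        for r in range(3)
    )

# Uniform scale of 2 combined with a 90-degree rotation about Z.
theta = math.pi / 2
s = 2.0
linear = [[s * math.cos(theta), -s * math.sin(theta), 0.0],
          [s * math.sin(theta),  s * math.cos(theta), 0.0],
          [0.0,                  0.0,                 s]]

center = (1.0, 1.0, 0.0)
fixed = transform_about_center(center, linear, center)           # the center itself does not move
moved = transform_about_center((2.0, 1.0, 0.0), linear, center)  # one unit along +X from the center
```

The center is a fixed point of the transform, and the neighbouring point ends up two units along +Y from it.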

v0.1.6

06 Dec 18:50

Changed

  • Enhanced SH-Aware Color Processing (CPU and GPU)
    • Brightness, Contrast, Saturation, Temperature, Gamma, Hue Shift now apply to BOTH sh0 (DC) and shN (higher-order SH coefficients)
    • Improved SH mode handling: brightness/contrast use corrected formulas for rendered RGB behavior
    • Vibrance now correctly converts SH->RGB->SH (required for adaptive saturation)
    • Split toning (shadow/highlight tints) remains DC-only (additive operations)
    • Conditional clamping: only applied when is_sh0_rgb=True (SH coefficients can be negative or >1)
    • Automatic SH->RGB->SH conversion for operations requiring RGB format
  • Dependencies Updated
    • gsply requirement updated from 0.2.11 to >=0.2.13
    • torch moved to optional dependency (install separately: pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu128)
  • Code Consolidation
    • _axis_angle_to_rotation_matrix_numpy() moved to shared/rotation.py (single canonical implementation)
    • filter/api.py and filter/apply.py now import from shared module
  • Documentation Updated
    • CLAUDE.md: Updated module references (color/ module structure, distributed */kernels.py files)
    • WORKFLOWS.md: Updated API references to current exports (GSDataPro, ColorValues, Filter, Pipeline)
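
The SH->RGB->SH round trip and the conditional clamping can be sketched as follows. This assumes the degree-0 convention common in Gaussian splatting code (rgb = 0.5 + C0 * sh0); whether gsmod uses exactly this convention is an assumption here, and the helper names are hypothetical.

```python
SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi)), the degree-0 SH basis constant

def sh0_to_rgb(sh0):
    """Convert a DC SH coefficient to an RGB value (assumed convention)."""
    return 0.5 + SH_C0 * sh0

def rgb_to_sh0(rgb):
    """Inverse of sh0_to_rgb."""
    return (rgb - 0.5) / SH_C0

def apply_rgb_op(value, op, is_sh0_rgb):
    """Run an RGB-space operation on data stored either as RGB or as SH DC.

    RGB data is clamped to [0, 1]; SH data is converted, processed, converted
    back, and left unclamped because SH coefficients may be negative or > 1.
    """
    if is_sh0_rgb:
        return min(1.0, max(0.0, op(value)))
    return rgb_to_sh0(op(sh0_to_rgb(value)))

def boost(rgb):
    return rgb * 1.5

clamped = apply_rgb_op(0.9, boost, is_sh0_rgb=True)     # RGB input: result is clamped
unclamped = apply_rgb_op(2.0, boost, is_sh0_rgb=False)  # SH input: result is not clamped
```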

Fixed

  • CRITICAL: Format Tracking in GSTensorPro
    • Fixed format auto-detection incorrectly changing format during initialization
    • _format now properly passed in all factory methods (from_gsdata, from_gstensor, clone, to)
    • Prevents LINEAR scales from being incorrectly detected as PLY log-scales
    • Affects: All GSTensorPro creation and conversion operations
  • Numerical Stability Improvements
    • LUT kernels: clamp to [0,1] before conversion to int() to avoid NaN/Inf issues
    • Ellipsoid filter: guard against division by zero with minimum radius (1e-7)
    • Logit conversion: use log subtraction form for better numerical stability
    • Consistent epsilon values: standardized to 1e-7 across codebase
  • FilterValues.is_neutral() Bug Fix
    • Now correctly checks invert parameter (invert=True is NOT neutral)
  • GPU Opacity/Scale Filtering
    • Improved threshold conversion with proper clamping to avoid boundary issues
    • Optimized: compute max_scales once when needed for both min/max filters
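
For reference, the "log subtraction form" of the logit is log(p) - log(1 - p) rather than log(p / (1 - p)). A minimal sketch, assuming the input is clamped with the 1e-7 epsilon mentioned above (the function names are illustrative, not gsmod's):

```python
import math

EPS = 1e-7  # epsilon value taken from the release notes

def stable_logit(p, eps=EPS):
    """Logit computed as log(p) - log1p(-p), with p clamped to [eps, 1 - eps]."""
    p = min(1.0 - eps, max(eps, p))
    return math.log(p) - math.log1p(-p)

def sigmoid(x):
    """Inverse of the logit."""
    return 1.0 / (1.0 + math.exp(-x))

at_zero = stable_logit(0.0)   # finite thanks to the clamp
at_one = stable_logit(1.0)    # finite thanks to the clamp
round_trip = sigmoid(stable_logit(0.25))
```

Clamping keeps the boundary values 0 and 1 finite, and log1p(-p) avoids forming 1 - p explicitly when p is very small.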

Performance

  • GPU scale filtering: eliminated redundant max_scales computation (2x evaluation -> 1x)

v0.1.5

03 Dec 18:40

Added

  • Spherical Harmonics (SH) Color Support (gsmod.color.sh_kernels, gsmod.color.sh_utils)
    • Full SH-aware color operations matching GPU ground truth behavior
    • Brightness and Saturation: Apply to BOTH sh0 (DC) and shN (higher-order SH bands)
    • All other operations: Apply to sh0 ONLY (contrast, gamma, temperature, tint, hue, vibrance, fade)
    • Numba-optimized kernels: apply_scale_to_sh_numba(), apply_matrix_to_sh_numba(), apply_contrast_to_sh_numba(), etc.
    • SH utilities: sh_to_rgb(), compute_luminance(), matrix builders for saturation/temperature/hue
    • CPU/GPU consistency: CPU now matches GPU SH handling exactly
  • Triton GPU Kernels (gsmod.torch.triton_kernels)
    • Fused GPU kernels for maximum performance on NVIDIA GPUs
    • Kernels: triton_adjust_brightness(), triton_adjust_contrast(), triton_adjust_gamma(), triton_adjust_saturation(), triton_adjust_temperature()
    • Graceful fallback to PyTorch operations if Triton unavailable
    • Block-based parallelization optimized for GPU architecture
  • GSTensorPro.from_gstensor() Factory Method
    • Convert gsply GSTensor to GSTensorPro while preserving format state
    • Recommended workflow: load with gsply, convert formats, then wrap with GSTensorPro
    • Preserves is_sh0_rgb, is_scales_ply, is_opacities_ply format tracking
    • Optional device/dtype conversion during wrapping
  • Format-Aware CPU Filtering (gsmod.filter.apply)
    • CPU filters now handle PLY format (logit opacities, log scales) correctly
    • Automatic threshold conversion: linear -> logit/log when filtering PLY format data
    • Matches GPU FilterGPU behavior for CPU/GPU consistency
    • Uses is_opacities_ply and is_scales_ply format properties from gsply 0.2.11
  • Transform Log-Space Kernel (gsmod.transform.kernels)
    • Added elementwise_add_scalar_numba() for log-space scale transforms
    • Enables efficient scale operations in PLY format (log space) where multiplication becomes addition
  • New Benchmarks
    • benchmarks/benchmark_sh_color.py: NumPy vs Numba performance for SH operations
    • benchmarks/benchmark_triton.py: PyTorch vs Triton kernel performance comparison
    • benchmarks/benchmark_gpu_vs_cpu.py: Comprehensive GPU vs CPU performance analysis
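
The log-space scale transform relies on the identity log(k * s) = log(k) + log(s), so a multiplicative scale factor becomes a single scalar addition on PLY-format (log) scales. A plain-Python sketch of the idea follows; the function name is hypothetical and this is not the Numba kernel itself.

```python
import math

def scale_log_scales(log_scales, factor):
    """Scale Gaussians stored as log-scales by adding log(factor) to each value."""
    if factor <= 0.0:
        raise ValueError("scale factor must be positive")
    offset = math.log(factor)
    return [value + offset for value in log_scales]

log_scales = [math.log(0.01), math.log(0.5)]
doubled = scale_log_scales(log_scales, 2.0)
```

Exponentiating the result gives the original linear scales multiplied by the factor.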

Changed

  • Color Application Refactored (gsmod.color.apply)
    • apply_color_values() now accepts shN parameter and returns tuple (sh0, shN)
    • Matches GPU ground truth: brightness/saturation on both sh0+shN, all else on sh0 only
    • Removed LUT-based implementation in favor of direct operations for better SH support
    • Breaking change: Return type changed from np.ndarray to tuple[np.ndarray, np.ndarray | None]
  • GPU Color Methods Enhanced (gsmod.torch.gstensor_pro)
    • All color adjustment methods now use Triton kernels when available
    • Shadows/highlights now use smoothstep curves matching supersplat shader
    • Better visual quality with smooth shadow/highlight transitions
    • Temperature and tint operations preserve format awareness
  • Dependencies Updated
    • gsply requirement updated from 0.2.10 to 0.2.11
    • Supports latest format query properties and performance improvements
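
For readers unfamiliar with smoothstep, it is the cubic Hermite curve t*t*(3 - 2t) evaluated on a clamped, normalized input. The sketch below shows the standard function plus a purely illustrative shadow weight; the 0.0 and 0.5 edges are assumptions for the example, not values taken from gsmod or the supersplat shader.

```python
def smoothstep(edge0, edge1, x):
    """Standard smoothstep: 0 below edge0, 1 above edge1, smooth cubic in between."""
    t = (x - edge0) / (edge1 - edge0)
    t = min(1.0, max(0.0, t))
    return t * t * (3.0 - 2.0 * t)

def shadow_weight(luminance):
    """Illustrative weight: full effect on dark values, fading out by mid-gray."""
    return 1.0 - smoothstep(0.0, 0.5, luminance)

weights = [shadow_weight(v) for v in (0.0, 0.25, 0.5, 1.0)]
```

Compared with a hard threshold, the weight falls off without a visible seam between affected and unaffected tones.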

Performance

  • SH color operations: 10-30x faster with Numba kernels vs pure NumPy
  • GPU color operations: Additional 10-20% speedup with Triton kernels (when available)
  • CPU-GPU consistency: No performance penalty for matching behavior

Fixed

  • CPU color operations now correctly handle SH formats matching GPU behavior
  • Shadow/highlight curves now use proper smoothstep interpolation (matches supersplat)
  • Format-aware filtering eliminates opacity/scale threshold bugs in PLY format
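
The threshold bug class fixed here comes from comparing a linear threshold against values stored in logit or log space. One way to picture the fix: convert the threshold into the storage space once, then compare. The helper names below are hypothetical and the sketch only mirrors the behavior described in these notes.

```python
import math

EPS = 1e-7

def to_storage_opacity_threshold(threshold, is_opacities_ply):
    """Convert a linear opacity threshold to logit space when data is in PLY format."""
    if not is_opacities_ply:
        return threshold
    t = min(1.0 - EPS, max(EPS, threshold))
    return math.log(t) - math.log1p(-t)

def to_storage_scale_threshold(threshold, is_scales_ply):
    """Convert a linear scale threshold to log space when data is in PLY format."""
    if not is_scales_ply:
        return threshold
    return math.log(max(threshold, EPS))

linear_opacities = [0.05, 0.5, 0.95]
ply_opacities = [math.log(o) - math.log1p(-o) for o in linear_opacities]

# The same min-opacity filter (0.1) evaluated in both storage formats.
linear_mask = [o >= to_storage_opacity_threshold(0.1, False) for o in linear_opacities]
ply_mask = [o >= to_storage_opacity_threshold(0.1, True) for o in ply_opacities]
```

Because logit and log are monotonic, the converted comparison selects exactly the same Gaussians as the linear one.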

Documentation

  • Updated CLAUDE.md with SH color operation semantics
  • Documented brightness/saturation special case (both sh0 and shN)
  • Added GPU ground truth reference in color module docstrings

v0.1.4

27 Nov 05:20

[0.1.4] - 2025-11-26

Added

  • Auto-Correction Module (gsmod.color.auto)
    • Industry-standard automatic color correction algorithms (Photoshop/Lightroom/iOS Photos style)
    • auto_enhance(): Combined enhancement (exposure + contrast + white balance), like iOS Photos Auto
    • auto_contrast(): Percentile-based histogram stretching (0.1% clipping), like Photoshop Auto Contrast
    • auto_exposure(): 18% gray midtone targeting (0.45 in gamma space)
    • auto_white_balance(): Gray World and White Patch methods
    • compute_optimal_parameters(): Minimal adjustments to reach target statistics
    • AutoCorrectionResult: Dataclass with computed adjustments, converts to ColorValues via .to_color_values()
    • Self-referential analysis (no target histogram required)
  • Perceptual Loss Functions (gsmod.histogram.loss)
    • PerceptualColorLoss: Comprehensive loss addressing flat histogram problem
      • Contrast preservation (penalizes reduction below threshold)
      • Dynamic range matching (5th/95th percentiles)
      • Parameter regularization (keeps values near neutral)
    • ContrastPreservationLoss: Standalone contrast preservation loss
    • ParameterBoundsLoss: Soft penalty for extreme parameter values
    • create_balanced_loss(): Factory function for balanced defaults
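
Percentile-based histogram stretching, as described for auto_contrast(), can be sketched in a few lines. This is a generic illustration under the stated 0.1% clipping, not the gsmod.color.auto code; the function names and the single-channel list input are assumptions made for the example.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a pre-sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def stretch_contrast(values, clip_percent=0.1):
    """Map the [clip, 100 - clip] percentile range onto [0, 1], clamping outliers."""
    ordered = sorted(values)
    low = percentile(ordered, clip_percent)
    high = percentile(ordered, 100.0 - clip_percent)
    span = max(high - low, 1e-7)  # guard against a flat histogram
    return [min(1.0, max(0.0, (v - low) / span)) for v in values]

# A low-contrast ramp covering only [0.2, 0.8].
values = [0.2 + 0.6 * i / 1000 for i in range(1001)]
stretched = stretch_contrast(values)
```

Clipping a small fraction at each end keeps a few extreme values from dictating the stretch.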

Changed

  • Filter Atomic Class Architecture (gsmod.filter.atomic.Filter)
    • Rewritten to use FilterValues internally for fused kernel path
    • AND operations (Filter & Filter) now merge FilterValues for single kernel execution
    • OR/NOT operations correctly fall back to mask combination approach
    • Added internal helpers: _from_values() and _from_mask_fn()
    • Factory methods simplified to construct FilterValues directly
  • Pipeline Operation Merging (gsmod.pipeline.Pipeline)
    • Added _merge_operations() method to merge consecutive same-type operations
    • Color, transform, and filter operations merged using their + operator
    • Reduces number of kernel calls for better performance
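
The merging step can be pictured as grouping consecutive operations of the same type and folding each group with +. The classes below are stand-ins invented for this sketch (gsmod merges its own value objects), and the way each one defines + is an assumption chosen to keep the example small.

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass(frozen=True)
class Translate:
    offset: tuple

    def __add__(self, other):
        # Two translations combine by adding their offsets.
        return Translate(tuple(a + b for a, b in zip(self.offset, other.offset)))

@dataclass(frozen=True)
class Brightness:
    factor: float

    def __add__(self, other):
        # Two multiplicative brightness steps combine by multiplying factors.
        return Brightness(self.factor * other.factor)

def merge_operations(ops):
    """Merge runs of consecutive same-type operations using their + operator."""
    merged = []
    for _, group in groupby(ops, key=type):
        group = list(group)
        combined = group[0]
        for op in group[1:]:
            combined = combined + op
        merged.append(combined)
    return merged

ops = [Translate((1, 0, 0)), Translate((0, 2, 0)), Brightness(1.5), Translate((0, 0, 3))]
merged = merge_operations(ops)
```

Only adjacent operations are merged, so the final Translate stays separate and the overall execution order is preserved.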

Performance

  • Filter AND operations: 2.8x faster via merged FilterValues (single fused kernel)
  • Pipeline transform merge: 3.5x faster when consecutive transforms combined
  • Pipeline full chain: 1.6x faster overall with operation merging
  • Filter.get_mask(): 2.5x faster after removing logger.debug overhead

Documentation

  • Documented expected ~2% color merge difference due to LUT quantization
    • This is mathematically correct behavior (quantization artifacts)
    • Performance benefit outweighs minor precision difference

Fixed

  • Filter.to_values() now correctly supports AND combinations (returns merged FilterValues)
  • OR and NOT combinations correctly raise ValueError (cannot be represented as single FilterValues)

v0.1.3 feature update

26 Nov 21:16

[0.1.3] - 2025-11-26

Added

  • Unified Pipeline Class (gsmod.pipeline.Pipeline)
    • CPU pipeline class matching GPU PipelineGPU interface
    • Fluent API for chaining color, transform, and filter operations
    • Method chaining: .brightness(), .saturation(), .translate(), .scale(), .min_opacity(), etc.
    • Operations are accumulated and executed in order when the pipeline is called
    • Single unified interface for all processing operations
  • Filter Atomic Class (gsmod.filter.atomic.Filter)
    • Immutable filter class with boolean operators (&, |, ~)
    • Factory methods: min_opacity(), max_opacity(), min_scale(), max_scale(), sphere(), box(), ellipsoid(), frustum()
    • Combine filters with logical operators for complex patterns
    • Direct mask computation via .get_mask() method
    • Apply filters via callable interface: filter(data, inplace=False)
  • Extended GSDataPro Filter Methods
    • Individual filter methods: filter_min_opacity(), filter_max_opacity(), filter_min_scale(), filter_max_scale()
    • Geometry filters: filter_within_sphere(), filter_outside_sphere(), filter_within_box(), filter_outside_box()
    • Advanced geometry: filter_within_ellipsoid(), filter_outside_ellipsoid(), filter_within_frustum(), filter_outside_frustum()
    • Transform methods: translate(), scale_uniform(), scale_nonuniform(), rotate_quaternion(), rotate_euler(), rotate_axis_angle(), transform_matrix()
    • Color adjustment: adjust_brightness() and other individual color methods
  • GPU Filter Enhancements (gsmod.torch.filter.FilterGPU)
    • Rotated box filtering: within_rotated_box(), outside_rotated_box()
    • Ellipsoid filtering: within_ellipsoid(), outside_ellipsoid()
    • Frustum filtering: within_frustum(), outside_frustum()
    • Optimized kernels: _filter_rotated_box(), _filter_ellipsoid(), _filter_frustum()
    • Axis-angle to rotation matrix conversion: _axis_angle_to_rotation_matrix()
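
To show how boolean operators on an immutable filter object can compose masks, here is a toy version built on plain Python predicates. The MaskFilter class and its factories are hypothetical and operate on a list of opacities only; they do not reproduce the gsmod.filter.atomic.Filter signatures.

```python
class MaskFilter:
    """Toy immutable filter: wraps a predicate and supports &, | and ~."""

    def __init__(self, predicate):
        self._predicate = predicate

    def get_mask(self, opacities):
        return [self._predicate(o) for o in opacities]

    def __and__(self, other):
        return MaskFilter(lambda o: self._predicate(o) and other._predicate(o))

    def __or__(self, other):
        return MaskFilter(lambda o: self._predicate(o) or other._predicate(o))

    def __invert__(self):
        return MaskFilter(lambda o: not self._predicate(o))

def min_opacity(threshold):
    return MaskFilter(lambda o: o >= threshold)

def max_opacity(threshold):
    return MaskFilter(lambda o: o <= threshold)

opacities = [0.1, 0.5, 0.9]
band = min_opacity(0.2) & max_opacity(0.8)
band_mask = band.get_mask(opacities)
outside_mask = (~band).get_mask(opacities)
```

Each operator returns a new object, so existing filters can be reused in several combinations without side effects.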

Changed

  • Filter Architecture Simplified
    • Atomic Filter class now uses single fused kernel path
    • Removed redundant kernel implementations
    • Improved performance through kernel consolidation
  • Full CPU-GPU Parity
    • All filter operations now available on both CPU and GPU
    • Consistent API between GSDataPro and GSTensorPro
    • Unified behavior across backends

Fixed

  • Test Suite Stability
    • All 498 tests passing (55 GPU tests are skipped when CUDA is unavailable)
    • Improved test coverage to 61% overall
    • Enhanced equivalence tests between CPU and GPU implementations

Performance

  • Filter atomic operations use optimized Numba kernels for 40-100x speedup
  • Unified Pipeline reduces method call overhead
  • GPU filters maintain 100-180x speedup over CPU for large datasets

v0.1.2 Clean up

25 Nov 17:36

Changed

  • Dependencies
    • Updated gsply requirement from >=0.2.8 to ==0.2.10 (exact version pin)
  • Style Harmonization with gsply
    • All docstrings now use :returns: instead of :return: (consistent with gsply)
    • Enhanced module-level docstrings with performance notes and examples
    • color/__init__.py: Added performance metrics (1,091M colors/sec)
    • transform/__init__.py: Added performance metrics (698M Gaussians/sec)
    • filter/__init__.py: Added performance metrics (46M Gaussians/sec)
    • torch/__init__.py: Added GPU benchmark details (183x speedup, 1.09B Gaussians/sec)
  • Configuration Class Naming
    • Renamed GsproConfig to GsmodConfig for consistency with project name
    • Updated all references in codebase (config module, AGENTS.md, benchmarks, examples)
    • Export name updated: import GsmodConfig from gsmod.config
  • Configuration Class Documentation
    • Fixed misleading "will be deprecated" comment on ColorGradingConfig
    • Clarified that config classes (ColorGradingConfig, TransformConfig, etc.) are canonical and actively used
    • Reorganized torch module imports for better clarity

Removed

  • Deprecated Backward Compatibility Aliases (Breaking Change)
    • Removed LearnableColorGrading (use LearnableColor instead)
    • Removed SoftFilter (use LearnableFilter instead)
    • Removed GSTensorProLearn (use LearnableGSTensor instead)
    • Removed SoftFilterConfig (use LearnableFilterConfig instead)
    • Impact: Zero usage found in codebase, tests, benchmarks, or documentation
    • Migration: Update imports to use new standardized names with "Learnable" prefix
  • Property Aliases from ColorValues (Breaking Change)
    • Removed black_level, white_level (use brightness, contrast)
    • Removed lift, gain (use shadows, highlights)
    • Removed exposure (use brightness)
    • Removed midtones (use gamma)
    • Removed vibrancy (use vibrance)
    • Removed blacks, whites (use shadows, highlights)
    • Impact: Zero usage found in codebase
    • Rationale: Simplifies API, aligns with gsply's design philosophy of no property aliases

Fixed

  • Removed confusing "legacy" comment from Color pipeline import
  • Improved clarity of module organization and export structure

v0.1.1 improved method coverage

25 Nov 14:21

Added

  • Opacity adjustment module with format-aware opacity scaling
  • OpacityValues config dataclass with fade() and boost() factory methods
  • GaussianProcessor unified interface for auto-dispatching CPU/GPU operations
  • Shared rotation utilities module for code reuse between backends
  • Enhanced protocol definitions with ColorProcessor, TransformProcessor, FilterProcessor
  • Opacity support in GSDataPro and GSTensorPro classes
  • GSDataPro: Primary API with direct .color(), .filter(), .transform() methods
  • GSTensorPro: GPU tensor wrapper with same API as GSDataPro
  • Config values: ColorValues, FilterValues, TransformValues dataclasses
  • Presets: Built-in color, filter, and transform presets
  • Histogram computation with HistogramResult
  • Scene composition utilities: concatenate, compose_with_transforms, merge_scenes
  • Format verification with FormatVerifier
  • GPU acceleration with up to 183x speedup

Changed

  • Unified parameter semantics between CPU and GPU pipelines
  • Filter radius now uses absolute world units (not relative to scene bounds)
  • Improved Numba JIT compilation with automatic warmup
  • Rotation utilities moved to shared module for better code organization
  • Format property access now uses public API (is_opacities_ply, is_scales_ply)
  • Improved format tracking in GSDataPro and GSTensorPro
  • Fixed TransformValues.is_neutral() with robust float comparison
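
A "robust float comparison" for a neutrality check usually means comparing within a tolerance rather than with ==. The sketch below uses math.isclose with an absolute tolerance; the function name and the 1e-7 tolerance are assumptions for illustration, not the actual TransformValues code.

```python
import math

def is_neutral_scale(scale, tol=1e-7):
    """Return True when every per-axis scale is within `tol` of 1.0."""
    return all(math.isclose(s, 1.0, rel_tol=0.0, abs_tol=tol) for s in scale)

# Floating-point arithmetic rarely lands exactly on 1.0.
computed = (0.1 + 0.2) / 0.3
exact_equal = computed == 1.0
tolerant_equal = is_neutral_scale((computed, 1.0, 1.0))
```

An exact comparison would report such a computed scale as non-neutral and trigger unnecessary work, while the tolerant check treats it as the identity.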

Performance

  • Color: 1,091M Gaussians/sec
  • Transform: 698M Gaussians/sec
  • GPU: 1.09B Gaussians/sec on RTX 3090 Ti

Initial release

24 Nov 07:06

See the README.